function_lijus_2024-10-18

developer_uid: chai_backend_admin

submission_id: function_lijus_2024-10-18

model_name: reward_blend_default_full_bon

model_group:

status: torndown

timestamp: 2024-10-18T20:13:51+00:00

num_battles: 11482

num_wins: 5872

celo_rating: 1262.97

family_friendly_score: 0.5794573643410852

family_friendly_standard_error: 0.0048296777836941895

submission_type: function

display_name: reward_blend_default_full_bon

is_internal_developer: True

ranking_group: single

us_pacific_date: 2024-10-18

win_ratio: 0.5114091621668699

generation_params: {'temperature': 0.9, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 50, 'presence_penalty': 0.5, 'frequency_penalty': 0.5, 'stopping_words': ['\n', '</s>', '<|user|>', '###'], 'max_input_tokens': 512, 'best_of': 4, 'max_output_tokens': 64}

formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}


Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.062871217727661s
Received healthy response to inference request in 4.680122375488281s
Received healthy response to inference request in 9.763618469238281s
Received healthy response to inference request in 2.3323042392730713s
Received healthy response to inference request in 2.2210657596588135s
5 requests
0 failed requests
5th percentile: 2.0945101261138914
10th percentile: 2.126149034500122
20th percentile: 2.189426851272583
30th percentile: 2.243313455581665
40th percentile: 2.2878088474273683
50th percentile: 2.3323042392730713
60th percentile: 3.271431493759155
70th percentile: 4.210558748245239
80th percentile: 5.696821594238282
90th percentile: 7.730220031738282
95th percentile: 8.746919250488281
99th percentile: 9.560278625488282
mean time: 4.211996412277221
%s, retrying in %s seconds...
Received healthy response to inference request in 9.08602237701416s
Received healthy response to inference request in 1.5679919719696045s
Received healthy response to inference request in 4.69090461730957s
Received healthy response to inference request in 2.8330724239349365s
Received healthy response to inference request in 1.9871418476104736s
5 requests
0 failed requests
5th percentile: 1.6518219470977784
10th percentile: 1.735651922225952
20th percentile: 1.9033118724822997
30th percentile: 2.156327962875366
40th percentile: 2.4947001934051514
50th percentile: 2.8330724239349365
60th percentile: 3.57620530128479
70th percentile: 4.319338178634643
80th percentile: 5.569928169250489
90th percentile: 7.327975273132324
95th percentile: 8.206998825073242
99th percentile: 8.910217666625977
mean time: 4.033026647567749
%s, retrying in %s seconds...
Received healthy response to inference request in 2.35589861869812s
Received healthy response to inference request in 19.916496753692627s
Received healthy response to inference request in 2.439626932144165s
Received healthy response to inference request in 2.6133298873901367s
Received healthy response to inference request in 2.9623265266418457s
5 requests
0 failed requests
5th percentile: 2.372644281387329
10th percentile: 2.3893899440765383
20th percentile: 2.422881269454956
30th percentile: 2.4743675231933593
40th percentile: 2.5438487052917482
50th percentile: 2.6133298873901367
60th percentile: 2.75292854309082
70th percentile: 2.892527198791504
80th percentile: 6.353160572052005
90th percentile: 13.134828662872316
95th percentile: 16.525662708282468
99th percentile: 19.238329944610594
mean time: 6.0575357437133786
Pipeline stage StressChecker completed in 75.68s
Shutdown handler de-registered
function_lijus_2024-10-18 status is now deployed due to DeploymentManager action
function_lijus_2024-10-18 status is now inactive due to auto deactivation removed underperforming models
function_lijus_2024-10-18 status is now torndown due to DeploymentManager action
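
The percentile and mean figures in the StressChecker output above appear consistent with linear interpolation between sorted samples (the default behaviour of common numerical libraries). The sketch below reproduces the first run's figures from its five logged latencies. It is an illustration only: the helper name `percentile` and the interpolation rule are assumptions, not the StressChecker's actual implementation, which the log does not show.

```python
import math
from statistics import fmean

# Latencies (seconds) copied from the first StressChecker run in the log.
latencies = [
    2.062871217727661,
    4.680122375488281,
    9.763618469238281,
    2.3323042392730713,
    2.2210657596588135,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Hypothetical helper: assumes the rank is (n - 1) * q / 100 and that
    values between two neighbouring samples are interpolated linearly.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


# Print the same percentile levels that the log reports.
for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {fmean(latencies)}")
```

With only five samples per run, the upper percentiles are dominated by the single slowest request, which is why the third run's 99th percentile sits close to its 19.9 s outlier.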