function_tasus_2025-12-20

developer_uid: chai_backend_admin

submission_id: function_tasus_2025-12-20

model_name: function_tasus_2025-12-20

model_group:

status: torndown

timestamp: 2025-12-23T15:51:21+00:00

num_battles: 7115

num_wins: 3609

celo_rating: 1298.08

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: function_tasus_2025-12-20

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-23

win_ratio: 0.5072382290934645

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
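
The formatter templates above can be read as a recipe for building the model input. The following is a minimal sketch of how such templates might be combined. The function name `build_prompt`, the `(speaker, message)` turn representation, and the assembly order (memory, prompt, chat turns, response header) are assumptions for illustration and are not confirmed by this record. Truncation to `max_input_tokens` is not modelled.

```python
# Template strings copied from the formatter field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Join the templates into one prompt string.

    Hypothetical helper: the ordering used here is an assumption.
    `turns` is a list of (speaker, message) pairs, speaker being "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    memory="M",
    prompt="P",
    turns=[("user", "hi"), ("bot", "hello")],
    bot_name="Bot",
    user_name="User",
)
```

Because `stopping_words` is `['\n']`, a prompt ending in `Bot:` would plausibly yield a single-line reply, since generation stops at the first newline.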

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 9.635560035705566s
Received healthy response to inference request in 18.651384592056274s
Received healthy response to inference request in 14.625645637512207s
Received healthy response to inference request in 5.6514081954956055s
Received healthy response to inference request in 10.103769302368164s
Received healthy response to inference request in 10.745441913604736s
Received healthy response to inference request in 4.828473329544067s
Received healthy response to inference request in 2.7325973510742188s
Received healthy response to inference request in 7.844499588012695s
Received healthy response to inference request in 6.433520078659058s
10 requests
0 failed requests
5th percentile: 3.6757415413856505
10th percentile: 4.618885731697082
20th percentile: 5.486821222305298
30th percentile: 6.198886513710022
40th percentile: 7.28010778427124
50th percentile: 8.74002981185913
60th percentile: 9.822843742370605
70th percentile: 10.296271085739136
80th percentile: 11.521482658386232
90th percentile: 15.028219532966613
95th percentile: 16.83980206251144
99th percentile: 18.28906808614731
mean time: 9.12523000240326
%s, retrying in %s seconds...
Received healthy response to inference request in 11.6411714553833s
Received healthy response to inference request in 3.8629870414733887s
Received healthy response to inference request in 18.749499082565308s
Received healthy response to inference request in 16.046889543533325s
Received healthy response to inference request in 6.07898211479187s
Received healthy response to inference request in 2.2710299491882324s
Received healthy response to inference request in 18.089046478271484s
Received healthy response to inference request in 10.297120809555054s
Received healthy response to inference request in 8.049479246139526s
Received healthy response to inference request in 3.010477304458618s
10 requests
0 failed requests
5th percentile: 2.603781259059906
10th percentile: 2.9365325689315798
20th percentile: 3.6924850940704346
30th percentile: 5.414183592796325
40th percentile: 7.261280393600464
50th percentile: 9.17330002784729
60th percentile: 10.834741067886352
70th percentile: 12.962886881828307
80th percentile: 16.45532093048096
90th percentile: 18.155091738700868
95th percentile: 18.452295410633088
99th percentile: 18.690058348178862
mean time: 9.80966830253601
%s, retrying in %s seconds...
Received healthy response to inference request in 7.739737272262573s
Received healthy response to inference request in 7.992274284362793s
Received healthy response to inference request in 4.160869359970093s
Received healthy response to inference request in 2.9721996784210205s
Received healthy response to inference request in 2.0741138458251953s
Received healthy response to inference request in 7.773394823074341s
Received healthy response to inference request in 7.066720485687256s
Received healthy response to inference request in 5.666540622711182s
Received healthy response to inference request in 5.031728982925415s
Received healthy response to inference request in 7.783646821975708s
10 requests
0 failed requests
5th percentile: 2.4782524704933167
10th percentile: 2.882391095161438
20th percentile: 3.9231354236602782
30th percentile: 4.770471096038818
40th percentile: 5.412615966796875
50th percentile: 6.366630554199219
60th percentile: 7.335927200317382
70th percentile: 7.7498345375061035
80th percentile: 7.775445222854614
90th percentile: 7.804509568214416
95th percentile: 7.898391926288604
99th percentile: 7.973497812747955
mean time: 5.826122617721557
Pipeline stage StressChecker completed in 252.36s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.84s
Shutdown handler de-registered
function_tasus_2025-12-20 status is now deployed due to DeploymentManager action
function_tasus_2025-12-20 status is now inactive due to auto deactivation removed underperforming models
function_tasus_2025-12-20 status is now torndown due to DeploymentManager action
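
The percentile and mean lines in the StressChecker output are consistent with linear interpolation between sorted latencies. The sketch below recomputes the first batch of ten requests from the logged response times. The helper name `percentile` is hypothetical, and the log does not say how the pipeline actually computes these statistics, so treat this as a consistency check rather than the pipeline's implementation.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between the two nearest ranks of the sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Response times in seconds, copied from the first batch in the log above.
latencies = [
    9.635560035705566,
    18.651384592056274,
    14.625645637512207,
    5.6514081954956055,
    10.103769302368164,
    10.745441913604736,
    4.828473329544067,
    2.7325973510742188,
    7.844499588012695,
    6.433520078659058,
]

p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
mean_time = sum(latencies) / len(latencies)

print(f"5th percentile: {p5}")
print(f"50th percentile: {p50}")
print(f"99th percentile: {p99}")
print(f"mean time: {mean_time}")
```

Under this interpolation rule, the median of ten samples is the average of the fifth and sixth sorted values, which is why the logged 50th percentile falls between two observed latencies.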