developer_uid: chai_evaluation_service
submission_id: function_tasus_2025-12-16
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-19T11:21:17+00:00
num_battles: 8825
num_wins: 4451
celo_rating: 1296.14
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-19
win_ratio: 0.5043626062322946
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
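The formatter templates above can be read as pieces of a single prompt string. The following is a minimal sketch of one plausible assembly order (memory, then prompt, then the conversation turns, then the response header). The function name `build_prompt`, the `turns` structure, and the ordering are assumptions for illustration and are not confirmed by this record. Because `stopping_words` contains a newline and the response template ends with `{bot_name}:`, a prompt built this way would yield a single line of bot output per request.

```python
# Hypothetical sketch: the template strings are copied from the formatter entry,
# but build_prompt and the assembly order are assumptions, not the service's code.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, memory, prompt, turns):
    """Join the templates into one prompt.

    turns is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user" (an illustrative convention).
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header is left open so the model completes the bot's next line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt(FORMATTER, "richard", "user", "A short persona.",
                       "A short scenario.", [("user", "hello")]))
```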
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.7085936069488525s
Received healthy response to inference request in 4.5618507862091064s
Received healthy response to inference request in 4.669980525970459s
Received healthy response to inference request in 4.125970363616943s
Received healthy response to inference request in 4.15761399269104s
Received healthy response to inference request in 7.870586156845093s
Received healthy response to inference request in 6.912040948867798s
Received healthy response to inference request in 3.5432581901550293s
Received healthy response to inference request in 3.8327271938323975s
10 requests
1 failed requests
5th percentile: 3.084192669391632
10th percentile: 3.459791731834412
20th percentile: 3.774833393096924
30th percentile: 4.037997412681579
40th percentile: 4.1449565410614015
50th percentile: 4.359732389450073
60th percentile: 4.605102682113648
70th percentile: 5.34259865283966
80th percentile: 7.103749990463257
90th percentile: 9.099428081512446
95th percentile: 14.62921674251555
99th percentile: 19.053047671318055
mean time: 6.254162716865539
%s, retrying in %s seconds...
Received healthy response to inference request in 6.806924104690552s
Received healthy response to inference request in 10.795421838760376s
Received healthy response to inference request in 6.709969758987427s
Received healthy response to inference request in 9.9669029712677s
Received healthy response to inference request in 2.4803223609924316s
Received healthy response to inference request in 4.9174981117248535s
Received healthy response to inference request in 10.318480968475342s
Received healthy response to inference request in 2.731297492980957s
Received healthy response to inference request in 5.209204912185669s
Received healthy response to inference request in 2.7614870071411133s
10 requests
0 failed requests
5th percentile: 2.593261170387268
10th percentile: 2.7061999797821046
20th percentile: 2.755449104309082
30th percentile: 4.27069478034973
40th percentile: 5.092522192001343
50th percentile: 5.959587335586548
60th percentile: 6.748751497268676
70th percentile: 7.754917764663696
80th percentile: 10.037218570709229
90th percentile: 10.366175055503845
95th percentile: 10.58079844713211
99th percentile: 10.752497160434723
mean time: 6.269750952720642
Pipeline stage StressChecker completed in 128.10s
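The percentile and mean figures of the second stress run can be reproduced from its ten logged latencies. The sketch below assumes linear interpolation between sorted samples (the same scheme NumPy's `percentile` uses by default). That the StressChecker computes its statistics this way is an assumption, although the logged values agree with it. The helper name `percentile` is illustrative.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between the two nearest ranks of the sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Latencies in seconds, copied from the second run of ten healthy responses.
latencies = [
    6.806924104690552, 10.795421838760376, 6.709969758987427,
    9.9669029712677, 2.4803223609924316, 4.9174981117248535,
    10.318480968475342, 2.731297492980957, 5.209204912185669,
    2.7614870071411133,
]

mean_time = sum(latencies) / len(latencies)

if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean_time}")
```

In the first run, the same arithmetic includes the failed request: the logged mean of about 6.25 s over 10 requests implies the timed-out request was counted at roughly 20 s, which matches the 20 s read timeout.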
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.57s
Shutdown handler de-registered
function_tasus_2025-12-16 status is now deployed due to DeploymentManager action
function_tasus_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_tasus_2025-12-16 status is now torndown due to DeploymentManager action