developer_uid: chai_evaluation_service
submission_id: function_tahot_2025-12-15
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-18T23:51:23+00:00
num_battles: 7528
num_wins: 3771
celo_rating: 1293.73
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-18
win_ratio: 0.5009298618490967
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
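The formatter templates above can be read as pieces of a single prompt string. The following is a minimal sketch of how they might be concatenated. The assembly order (memory, prompt, conversation turns, response header), the function name `build_prompt`, and the sample names and messages are all assumptions made for illustration; the log does not show the service's actual assembly code.

```python
# Templates copied from the formatter entry in the metadata above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name):
    """Hypothetical helper: join the templates in an assumed order."""
    pieces = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    # Each turn is (role, speaker_name, message); role is "bot" or "user".
    for role, name, message in turns:
        if role == "bot":
            pieces.append(
                FORMATTER["bot_template"].format(bot_name=name, message=message)
            )
        else:
            pieces.append(
                FORMATTER["user_template"].format(user_name=name, message=message)
            )
    # The response header ends with the bot name so the model continues the line.
    pieces.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(pieces)


# Made-up sample conversation, for illustration only.
example = build_prompt(
    memory="Be brief.",
    prompt="A chat.",
    turns=[
        ("user", "Sam", "Hi"),
        ("bot", "Richard", "Hello"),
        ("user", "Sam", "How are you?"),
    ],
    bot_name="Richard",
)
print(example)
```

Because every message template ends in a newline and the response header ends in `{bot_name}:`, the stopping word `'\n'` in generation_params would plausibly cut generation at the end of a single reply line.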
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.6971209049224854s
Received healthy response to inference request in 3.0942885875701904s
Received healthy response to inference request in 3.7159905433654785s
Received healthy response to inference request in 2.0310540199279785s
Received healthy response to inference request in 3.1127469539642334s
Received healthy response to inference request in 3.5907647609710693s
Received healthy response to inference request in 3.3563733100891113s
Received healthy response to inference request in 4.635013103485107s
Received healthy response to inference request in 3.6284875869750977s
10 requests
1 failed requests
5th percentile: 1.8473908066749574
10th percentile: 1.9976607084274292
20th percentile: 2.881641674041748
30th percentile: 3.1072094440460205
40th percentile: 3.2589227676391603
50th percentile: 3.4735690355300903
60th percentile: 3.6058538913726808
70th percentile: 3.6547384738922117
80th percentile: 3.8997950553894043
90th percentile: 6.181939649581904
95th percentile: 13.143109107017501
99th percentile: 18.712044672966005
mean time: 4.896611833572388
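The summary above reports 10 requests, but only 9 healthy latencies are logged. As a worked check, the sketch below backs out the duration that the failed request would need for the reported mean to hold, then tests whether that value also reproduces the reported 90th percentile. It assumes the failed request is included in the statistics and that percentiles use linear interpolation between sorted samples; neither is stated in the log, and `percentile` is a helper written for this example.

```python
def percentile(values, q):
    """Linearly interpolated percentile for q in [0, 100] (assumed method)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# The nine healthy latencies logged in the first StressChecker run (seconds).
healthy = [
    1.6971209049224854, 3.0942885875701904, 3.7159905433654785,
    2.0310540199279785, 3.1127469539642334, 3.5907647609710693,
    3.3563733100891113, 4.635013103485107, 3.6284875869750977,
]

reported_mean = 4.896611833572388
reported_p90 = 6.181939649581904

# If all 10 requests are averaged, the missing sample is 10 * mean - sum(healthy).
inferred_failed = 10 * reported_mean - sum(healthy)
print(f"inferred duration of failed request: {inferred_failed}")

# Cross-check: does the inferred sample reproduce the reported 90th percentile?
p90 = percentile(healthy + [inferred_failed], 90)
print(f"recomputed 90th percentile: {p90} (reported: {reported_p90})")
```

Under these assumptions the inferred value falls just above the 20-second read timeout shown in the HTTPConnectionPool message, which would explain why the upper percentiles and the mean of this run are far higher than those of the run with no failures.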
%s, retrying in %s seconds...
Received healthy response to inference request in 1.8835742473602295s
Received healthy response to inference request in 2.754171371459961s
Received healthy response to inference request in 3.8089599609375s
Received healthy response to inference request in 3.7980904579162598s
Received healthy response to inference request in 2.1898751258850098s
Received healthy response to inference request in 4.230880975723267s
Received healthy response to inference request in 3.2811717987060547s
Received healthy response to inference request in 2.092758893966675s
Received healthy response to inference request in 2.7736027240753174s
Received healthy response to inference request in 3.0852532386779785s
10 requests
0 failed requests
5th percentile: 1.9777073383331298
10th percentile: 2.07184042930603
20th percentile: 2.1704518795013428
30th percentile: 2.5848824977874756
40th percentile: 2.7658301830291747
50th percentile: 2.929427981376648
60th percentile: 3.1636206626892087
70th percentile: 3.436247396469116
80th percentile: 3.800264358520508
90th percentile: 3.8511520624160767
95th percentile: 4.041016519069672
99th percentile: 4.192908084392547
mean time: 2.989833879470825
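The percentile and mean figures for this run can be recomputed from the ten logged latencies. The sketch below assumes linear interpolation between sorted samples (the default behaviour of common numerical libraries); the log does not name the method used, and `percentile` is a helper written for this example rather than the StressChecker's own code.

```python
def percentile(values, q):
    """Linearly interpolated percentile for q in [0, 100] (assumed method)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# Latencies logged in the second StressChecker run (seconds).
latencies = [
    1.8835742473602295, 2.754171371459961, 3.8089599609375,
    3.7980904579162598, 2.1898751258850098, 4.230880975723267,
    3.2811717987060547, 2.092758893966675, 2.7736027240753174,
    3.0852532386779785,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only ten samples, the 95th and 99th percentiles are interpolated almost entirely from the two slowest requests, so they are best read as a rough upper bound rather than a stable tail estimate.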
Pipeline stage StressChecker completed in 81.53s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.57s
Shutdown handler de-registered
function_tahot_2025-12-15 status is now deployed due to DeploymentManager action
function_tahot_2025-12-15 status is now inactive due to auto deactivation removed underperforming models
function_tahot_2025-12-15 status is now torndown due to DeploymentManager action