developer_uid: chai_evaluation_service
submission_id: function_rigom_2025-12-16
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-19T07:21:14+00:00
num_battles: 6445
num_wins: 3196
celo_rating: 1256.68
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-15
win_ratio: 0.4958882854926299
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
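The `formatter` templates above use Python `str.format` placeholders, so one plausible way they combine into a single model input can be sketched as follows. The assembly order (memory, then prompt, then chat turns, then the response header), the `build_prompt` helper, and the sample conversation are assumptions made for illustration; the record does not show how the service actually assembles or truncates the input.

```python
# Templates copied from the `formatter` field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up conversation, used only to show the resulting layout.
example = build_prompt(
    memory="A friendly guide.",
    prompt="A chat in a library.",
    turns=[("user", "Hi"), ("bot", "Hello!"), ("user", "Where are the maps?")],
    bot_name="Guide",
    user_name="User",
)
print(example)
```

Because `stopping_words` is `['\n']` in `generation_params`, generation would presumably stop at the end of the bot's first line after the response header.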
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 7.441110610961914s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.124005079269409s
Received healthy response to inference request in 5.0967888832092285s
Received healthy response to inference request in 2.951565980911255s
Received healthy response to inference request in 2.473052978515625s
Received healthy response to inference request in 2.6577115058898926s
Received healthy response to inference request in 2.828312635421753s
Received healthy response to inference request in 2.462876558303833s
Received healthy response to inference request in 3.4152486324310303s
Received healthy response to inference request in 1.883392572402954s
Received healthy response to inference request in 2.9084367752075195s
Received healthy response to inference request in 3.462048053741455s
Received healthy response to inference request in 1.6826095581054688s
Received healthy response to inference request in 2.263787031173706s
Received healthy response to inference request in 2.575468063354492s
Received healthy response to inference request in 2.3694775104522705s
Received healthy response to inference request in 1.9759232997894287s
10 requests
1 failed requests
5th percentile: 1.9250313997268678
10th percentile: 1.9666702270507812
20th percentile: 2.094388723373413
Received healthy response to inference request in 2.7312850952148438s
30th percentile: 2.221852445602417
40th percentile: 2.327201318740845
50th percentile: 2.4161770343780518
60th percentile: 2.5408105373382566
70th percentile: 2.7329290866851808
80th percentile: 2.9170626163482667
90th percentile: 4.674643039703363
95th percentile: 12.428489804267866
99th percentile: 18.631567215919496
mean time: 4.177951288223267
%s, retrying in %s seconds...
Received healthy response to inference request in 2.941526412963867s
Received healthy response to inference request in 4.226437330245972s
10 requests
0 failed requests
5th percentile: 2.038309097290039
10th percentile: 2.3940086364746094
20th percentile: 2.5549850463867188
30th percentile: 2.6845399856567385
40th percentile: 2.7895016193389894
50th percentile: 3.1217806339263916
60th percentile: 3.4339684009552003
70th percentile: 3.69136483669281
80th percentile: 4.400507640838623
90th percentile: 5.331221055984496
95th percentile: 6.386165833473203
99th percentile: 7.230121655464172
mean time: 3.5932361841201783
Pipeline stage StressChecker completed in 38.60s
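The percentile and mean lines in the second StressChecker summary (the one reporting 0 failed requests) appear consistent with linear interpolation between sorted latencies. The sketch below applies that method to ten latencies copied from the "Received healthy response" lines above. The choice of these ten values and the `percentile` helper are assumptions inferred from the reported numbers, not something the log states.

```python
import math

# Ten latencies (seconds) copied from the log lines above. Which requests
# belong to which summary is an assumption; this set reproduces the
# second summary's figures.
latencies = [
    7.441110610961914,
    5.0967888832092285,
    2.473052978515625,
    2.828312635421753,
    3.4152486324310303,
    3.462048053741455,
    1.6826095581054688,
    2.575468063354492,
    2.7312850952148438,
    4.226437330245972,
]


def percentile(values, p):
    """Hypothetical helper: percentile with linear interpolation.

    The fractional rank is p/100 * (n - 1) over the sorted values.
    """
    ordered = sorted(values)
    rank = p / 100 * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{p}th percentile: {percentile(latencies, p)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

The first summary can be checked the same way, but the duration of its one failed request is not logged, so that value would have to be assumed.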
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Received healthy response to inference request in 2.5816879272460938s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.80s
Shutdown handler de-registered
function_rigom_2025-12-16 status is now deployed due to DeploymentManager action
function_rigom_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_rigom_2025-12-16 status is now torndown due to DeploymentManager action