developer_uid: chai_evaluation_service
submission_id: function_ladok_2025-12-16
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-19T22:21:27+00:00
num_battles: 8979
num_wins: 4516
celo_rating: 1295.19
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-19
win_ratio: 0.5029513308831719
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
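The `formatter` field above only lists the templates; the record does not say how they are combined. A minimal sketch of one plausible assembly order (memory, then prompt, then chat turns, then the response cue) is given below. The function `build_prompt`, its arguments, and the sample conversation are hypothetical and are not taken from the evaluation service.

```python
# Templates copied from the `formatter` field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs where speaker is
    either "bot" or "user". The ordering is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response cue ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Illustrative inputs only; none of these strings come from the record.
example = build_prompt(
    memory="Richard is a friendly guide.",
    prompt="A short chat.",
    turns=[("user", "Hi"), ("bot", "Hello!")],
    bot_name="Richard",
    user_name="Sam",
)
print(example)
```

Because `stopping_words` is `['\n']` in `generation_params`, generation would end at the first newline, which fits a template whose final cue leaves the bot's line open.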
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.909587860107422s
Received healthy response to inference request in 3.5155632495880127s
Received healthy response to inference request in 5.246141672134399s
Received healthy response to inference request in 4.500831842422485s
Received healthy response to inference request in 3.8830466270446777s
Received healthy response to inference request in 3.0810770988464355s
Received healthy response to inference request in 2.8004021644592285s
Received healthy response to inference request in 3.671842575073242s
Received healthy response to inference request in 6.515254259109497s
10 requests
1 failed requests
5th percentile: 2.9267058849334715
10th percentile: 3.053009605407715
20th percentile: 3.428666019439697
30th percentile: 3.6249587774276733
40th percentile: 3.7985650062561036
50th percentile: 3.89631724357605
60th percentile: 4.1460854530334466
70th percentile: 4.72442479133606
80th percentile: 5.499964189529419
90th percentile: 7.912489724159236
95th percentile: 14.200049316883073
99th percentile: 19.230096991062165
mean time: 5.761135625839233
%s, retrying in %s seconds...
Received healthy response to inference request in 5.281750440597534s
Received healthy response to inference request in 5.607672214508057s
Received healthy response to inference request in 1.8635401725769043s
Received healthy response to inference request in 3.1846513748168945s
Received healthy response to inference request in 2.7048861980438232s
Received healthy response to inference request in 7.302424907684326s
Received healthy response to inference request in 3.476710557937622s
Received healthy response to inference request in 4.739753246307373s
Received healthy response to inference request in 6.786931753158569s
Received healthy response to inference request in 3.1636877059936523s
10 requests
0 failed requests
5th percentile: 2.2421458840370176
10th percentile: 2.6207515954971314
20th percentile: 3.0719274044036866
30th percentile: 3.178362274169922
40th percentile: 3.359886884689331
50th percentile: 4.108231902122498
60th percentile: 4.956552124023437
70th percentile: 5.379526972770691
80th percentile: 5.8435241222381595
90th percentile: 6.838481068611145
95th percentile: 7.070452988147735
99th percentile: 7.256030523777008
mean time: 4.411200857162475
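The percentile and mean figures for this second batch appear consistent with linear interpolation between order statistics over the ten logged latencies. The sketch below recomputes them from the values in the log; the helper `percentile` is hypothetical, and the interpolation method is an assumption inferred from the numbers, not something the log states.

```python
import math

# The ten latencies (seconds) logged for the second StressChecker batch.
latencies = [
    5.281750440597534,
    5.607672214508057,
    1.8635401725769043,
    3.1846513748168945,
    2.7048861980438232,
    7.302424907684326,
    3.476710557937622,
    4.739753246307373,
    6.786931753158569,
    3.1636877059936523,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile with linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the q-th percentile within the sorted list.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

Applying the same interpolation to the first batch requires a latency for the failed request, which the log does not print directly.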
Pipeline stage StressChecker completed in 105.04s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.56s
Shutdown handler de-registered
function_ladok_2025-12-16 status is now deployed due to DeploymentManager action
function_ladok_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_ladok_2025-12-16 status is now torndown due to DeploymentManager action