developer_uid: chai_evaluation_service
submission_id: function_goger_2025-12-18
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-21T22:31:16+00:00
num_battles: 6621
num_wins: 3307
celo_rating: 1293.02
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.4994713789457786
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
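The formatter entry above lists five templates but not how they are combined. The following is a minimal sketch of one plausible assembly order (memory, then prompt, then chat turns, then the response header). The function name `build_prompt`, its arguments, and the ordering are assumptions made for illustration, not the service's confirmed behaviour; truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry in the metadata above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name, fmt=FORMATTER):
    """Assemble one model input string.

    `turns` is a list of (speaker, message) pairs where speaker is
    "bot" or "user". The ordering used here is an assumption.
    """
    parts = [
        fmt["memory_template"].format(memory=memory),
        fmt["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(fmt["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(fmt["user_template"].format(user_name=user_name, message=message))
    # The response header ends with "{bot_name}:" so the model continues as the bot.
    parts.append(fmt["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    example = build_prompt(
        memory="A friendly assistant persona.",   # hypothetical values
        prompt="Casual evening chat.",
        turns=[("user", "hi"), ("bot", "hello!"), ("user", "how are you?")],
        bot_name="richard",
        user_name="User",
    )
    print(example)
```

Since `stopping_words` is `['\n']` and each chat template ends in a newline, a generation under this layout would stop at the end of a single bot line.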
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.6042063236236572s
Received healthy response to inference request in 2.0149343013763428s
Received healthy response to inference request in 2.05389142036438s
Received healthy response to inference request in 3.1095831394195557s
Received healthy response to inference request in 3.228506565093994s
Received healthy response to inference request in 1.6467480659484863s
Received healthy response to inference request in 2.8899450302124023s
Received healthy response to inference request in 2.2316083908081055s
Received healthy response to inference request in 2.7357962131500244s
10 requests
1 failed requests
5th percentile: 1.8124318718910217
10th percentile: 1.9781156778335571
20th percentile: 2.0460999965667725
30th percentile: 2.178293299674988
40th percentile: 2.4551671504974366
50th percentile: 2.670001268386841
60th percentile: 2.7974557399749753
70th percentile: 2.955836462974548
80th percentile: 3.133367824554443
90th percentile: 4.915089321136469
95th percentile: 12.504711723327619
99th percentile: 18.57640964508057
mean time: 4.260955357551575
%s, retrying in %s seconds...
Received healthy response to inference request in 3.0956547260284424s
Received healthy response to inference request in 2.451716661453247s
Received healthy response to inference request in 3.496365785598755s
Received healthy response to inference request in 3.281876564025879s
Received healthy response to inference request in 4.252498149871826s
Received healthy response to inference request in 1.7907929420471191s
Received healthy response to inference request in 2.0823726654052734s
Received healthy response to inference request in 2.967196226119995s
Received healthy response to inference request in 3.3006060123443604s
Received healthy response to inference request in 3.003547430038452s
10 requests
0 failed requests
5th percentile: 1.9220038175582885
10th percentile: 2.053214693069458
20th percentile: 2.3778478622436525
30th percentile: 2.8125523567199706
40th percentile: 2.9890069484710695
50th percentile: 3.0496010780334473
60th percentile: 3.170143461227417
70th percentile: 3.2874953985214233
80th percentile: 3.339757966995239
90th percentile: 3.5719790220260617
95th percentile: 3.9122385859489435
99th percentile: 4.18444623708725
mean time: 2.972262716293335
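The percentile and mean figures for this second batch can be reproduced from the ten latencies logged above. The sketch below uses linear interpolation between closest ranks, which is the default method of `numpy.percentile`; that the StressChecker uses NumPy or this exact method is an assumption, and the helper name `percentile` is hypothetical.

```python
import math

# Latencies (seconds) of the ten healthy responses in the second batch.
latencies = [
    3.0956547260284424, 2.451716661453247, 3.496365785598755,
    3.281876564025879, 4.252498149871826, 1.7907929420471191,
    2.0823726654052734, 2.967196226119995, 3.3006060123443604,
    3.003547430038452,
]


def percentile(values, q):
    """q-th percentile (0..100) using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lo = math.floor(rank)
    hi = math.ceil(rank)
    frac = rank - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {sum(latencies) / len(latencies)}")
```

Under the same method, the first batch's figures are consistent with the one failed request being counted at roughly its 20-second read timeout, which explains the much higher 90th to 99th percentiles and mean there.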
Pipeline stage StressChecker completed in 75.11s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.58s
Shutdown handler de-registered
function_goger_2025-12-18 status is now deployed due to DeploymentManager action
function_goger_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_goger_2025-12-18 status is now torndown due to DeploymentManager action