developer_uid: chai_evaluation_service
submission_id: function_bugub_2025-12-16
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-19T11:51:21+00:00
num_battles: 9000
num_wins: 4450
celo_rating: 1289.59
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-19
win_ratio: 0.49444444444444446
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.746958494186401s
Received healthy response to inference request in 4.2111005783081055s
Received healthy response to inference request in 3.215625047683716s
Received healthy response to inference request in 2.5612266063690186s
Received healthy response to inference request in 5.656946659088135s
Received healthy response to inference request in 3.8591225147247314s
Received healthy response to inference request in 2.1116819381713867s
Received healthy response to inference request in 3.2990663051605225s
Received healthy response to inference request in 3.659945249557495s
10 requests
1 failed requests
5th percentile: 2.313977038860321
10th percentile: 2.5162721395492555
20th percentile: 3.0847453594207765
30th percentile: 3.2740339279174804
40th percentile: 3.515593671798706
50th percentile: 3.7595338821411133
60th percentile: 3.999913740158081
70th percentile: 4.371857953071594
80th percentile: 4.928956127166749
90th percentile: 7.101908111572261
95th percentile: 13.60423464775084
99th percentile: 18.80609587669373
mean time: 5.342823457717896
%s, retrying in %s seconds...
Received healthy response to inference request in 3.018734931945801s
Received healthy response to inference request in 5.963938236236572s
Received healthy response to inference request in 2.431204080581665s
Received healthy response to inference request in 3.563541889190674s
Received healthy response to inference request in 4.0643651485443115s
Received healthy response to inference request in 2.9011104106903076s
Received healthy response to inference request in 4.561880588531494s
Received healthy response to inference request in 6.712808132171631s
Received healthy response to inference request in 10.049372673034668s
Failed to get response for submission chaiml-ssnew-v5-dpo-lr5_24415_v4: ('http://guanaco-model-mesh-load-balancer.model-mesh.k2.chaiverse.com/models/chaiml-ssnew-v5-dpo-lr5_24415_v4/predict', '{"detail":"1 validation error for RuntimeResponse\\npredictions\\n Field required [type=missing, input_value={\'detail\': \\"503, message=...-lr5_24415_v4/predict\'\\"}, input_type=dict]\\n For further information visit https://errors.pydantic.dev/2.11/v/missing"}')
Received healthy response to inference request in 4.044967174530029s
10 requests
0 failed requests
5th percentile: 2.642661929130554
10th percentile: 2.8541197776794434
20th percentile: 2.9952100276947022
30th percentile: 3.4000998020172117
40th percentile: 3.852397060394287
50th percentile: 4.05466616153717
60th percentile: 4.263371324539184
70th percentile: 4.982497882843017
80th percentile: 6.113712215423584
90th percentile: 7.046464586257933
95th percentile: 8.547918629646297
99th percentile: 9.749081864356995
mean time: 4.731192326545715
Pipeline stage StressChecker completed in 105.76s
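
Note: the percentile and mean lines of the second StressChecker run can be re-derived from the ten healthy latencies logged just before them. The sketch below is not the StressChecker's own code, which this log does not show. It assumes linear interpolation between order statistics, and the `percentile` helper and `latencies` list are hypothetical names introduced here for illustration.

```python
import math

# Latencies in seconds, copied from the ten healthy responses
# of the second StressChecker run in this log.
latencies = [
    3.018734931945801,
    5.963938236236572,
    2.431204080581665,
    3.563541889190674,
    4.0643651485443115,
    2.9011104106903076,
    4.561880588531494,
    6.712808132171631,
    10.049372673034668,
    4.044967174530029,
]


def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation.

    Assumption: the pipeline interpolates linearly between the two
    nearest order statistics; the log itself does not state the method.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def mean(values):
    """Arithmetic mean of the latencies."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean(latencies)}")
```

Under that assumption the results agree with the logged values to within floating-point rounding, for example roughly 4.0547 for the 50th percentile and 4.7312 for the mean. The first run works the same way, except that its failed request (the 20 s read timeout) appears to be counted in the statistics, and its exact duration is not printed in the log.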
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.16s
Shutdown handler de-registered
function_bugub_2025-12-16 status is now deployed due to DeploymentManager action
function_bugub_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_bugub_2025-12-16 status is now torndown due to DeploymentManager action