developer_uid: chai_evaluation_service
submission_id: function_lesom_2025-12-18
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-21T09:51:17+00:00
num_battles: 7616
num_wins: 3836
celo_rating: 1295.9
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.5036764705882353
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
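The win_ratio field above appears to be num_wins divided by num_battles. A minimal check, assuming that definition (the variable names below are illustrative and not taken from the evaluation service):

```python
# Values copied from the metadata record above.
num_battles = 7616
num_wins = 3836
reported_win_ratio = 0.5036764705882353

# Assumption: win_ratio is a plain ratio of wins to battles.
win_ratio = num_wins / num_battles

print(f"computed win_ratio: {win_ratio}")
print(f"matches reported value: {abs(win_ratio - reported_win_ratio) < 1e-12}")
```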
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Failed to get response for submission function_lekim_2025-12-18: HTTPConnectionPool(host='chaiml-llama31-mer-v2-t-44570-v4-predictor.tenant-chaiml-guanaco.k2.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.209522247314453s
Received healthy response to inference request in 2.737698554992676s
Received healthy response to inference request in 2.23136830329895s
Received healthy response to inference request in 2.177123785018921s
Received healthy response to inference request in 3.035479784011841s
Received healthy response to inference request in 2.959958076477051s
Received healthy response to inference request in 1.8736615180969238s
Received healthy response to inference request in 2.9078738689422607s
Received healthy response to inference request in 2.8777058124542236s
10 requests
1 failed requests
5th percentile: 2.0102195382118224
10th percentile: 2.146777558326721
20th percentile: 2.2205193996429444
30th percentile: 2.585799479484558
40th percentile: 2.8217029094696047
50th percentile: 2.892789840698242
60th percentile: 2.928707551956177
70th percentile: 2.982614588737488
80th percentile: 3.0702882766723634
90th percentile: 4.899934053421014
95th percentile: 12.506787180900556
99th percentile: 18.59226968288422
mean time: 4.412403225898743
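Only nine healthy latencies are logged for this batch, yet the summary covers 10 requests. The reported mean and upper percentiles are consistent with the failed request being counted at roughly 20.1 s, close to the 20 s read timeout logged earlier. The sketch below infers that value from the reported mean. It assumes the percentiles use linear interpolation between sorted samples; the `percentile` helper is hypothetical and not part of the StressChecker code.

```python
import math

def percentile(values, p):
    """Percentile with linear interpolation between closest ranks (assumed method)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])

# The nine healthy latencies logged for the first batch, in seconds.
healthy = [
    3.209522247314453, 2.737698554992676, 2.23136830329895,
    2.177123785018921, 3.035479784011841, 2.959958076477051,
    1.8736615180969238, 2.9078738689422607, 2.8777058124542236,
]
reported_mean = 4.412403225898743
total_requests = 10

# Assumption: the failed request contributes one latency sample to the mean.
inferred_failed = reported_mean * total_requests - sum(healthy)
all_latencies = healthy + [inferred_failed]

print(f"inferred latency of failed request: {inferred_failed:.3f}s")
print(f"90th percentile: {percentile(all_latencies, 90)}")
print(f"99th percentile: {percentile(all_latencies, 99)}")
```

Under these assumptions the recomputed 90th and 99th percentiles agree with the logged values, which suggests the single failure accounts for the long tail in this batch.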
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9819002151489258s
Received healthy response to inference request in 2.1659951210021973s
Received healthy response to inference request in 1.9554550647735596s
Received healthy response to inference request in 3.0708487033843994s
Received healthy response to inference request in 2.622790813446045s
Received healthy response to inference request in 3.3658201694488525s
Received healthy response to inference request in 1.9187798500061035s
Received healthy response to inference request in 2.9603614807128906s
Received healthy response to inference request in 3.0000193119049072s
Received healthy response to inference request in 2.6013424396514893s
10 requests
0 failed requests
5th percentile: 1.9352836966514588
10th percentile: 1.951787543296814
20th percentile: 1.9766111850738526
30th percentile: 2.110766649246216
40th percentile: 2.4272035121917725
50th percentile: 2.612066626548767
60th percentile: 2.757819080352783
70th percentile: 2.9722588300704955
80th percentile: 3.0141851902008057
90th percentile: 3.1003458499908447
95th percentile: 3.233083009719848
99th percentile: 3.3392727375030518
mean time: 2.564331316947937
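The second batch's summary can be reproduced from the ten logged latencies. This is a minimal sketch, assuming linear interpolation between sorted samples; the `percentile` helper is hypothetical and not taken from the StressChecker code.

```python
import math

def percentile(values, p):
    """Percentile with linear interpolation between closest ranks (assumed method)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (rank - lower) * (ordered[upper] - ordered[lower])

# The ten healthy latencies logged for the second batch, in seconds.
latencies = [
    1.9819002151489258, 2.1659951210021973, 1.9554550647735596,
    3.0708487033843994, 2.622790813446045, 3.3658201694488525,
    1.9187798500061035, 2.9603614807128906, 3.0000193119049072,
    2.6013424396514893,
]

mean_time = sum(latencies) / len(latencies)

for p in (5, 50, 95, 99):
    print(f"{p}th percentile: {percentile(latencies, p)}")
print(f"mean time: {mean_time}")
```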
Pipeline stage StressChecker completed in 72.90s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.58s
Shutdown handler de-registered
function_lesom_2025-12-18 status is now deployed due to DeploymentManager action
function_lesom_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_lesom_2025-12-18 status is now torndown due to DeploymentManager action