developer_uid: chai_backend_admin
submission_id: function_sehur_2025-12-21
model_name: function_sehur_2025-12-21
model_group:
status: torndown
timestamp: 2025-12-24T16:21:52+00:00
num_battles: 10707
num_wins: 5464
celo_rating: 1300.5
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: function_sehur_2025-12-21
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-24
win_ratio: 0.5103203511721304
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
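The formatter entry above is a set of string templates. As a rough illustration of how such templates could be combined into a single model input, here is a minimal sketch. The assembly order (memory, then prompt, then chat turns, then the response header), the `build_prompt` helper, and the sample names and messages are all assumptions made for illustration; the record does not show how the platform actually applies the templates, and truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the `formatter` field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(bot_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (role, speaker_name, message) tuples, where
    role is either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for role, speaker, message in turns:
        if role == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=speaker, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=speaker, message=message)
            )
    # The response header ends with "<bot_name>:" so the model continues
    # the line as the bot.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up sample conversation, for illustration only.
example = build_prompt(
    bot_name="Ava",
    memory="Ava is a friendly guide.",
    prompt="A short chat.",
    turns=[("user", "Sam", "Hi there"), ("bot", "Ava", "Hello, Sam!")],
)
print(example)
```

Because the response header leaves the bot's line open, a `stopping_words` value of `['\n']` as in generation_params would end generation after a single line of reply.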
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 9.936731576919556s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 6.202186107635498s
Received healthy response to inference request in 7.731057167053223s
Received healthy response to inference request in 8.270299196243286s
Received healthy response to inference request in 4.265926361083984s
Received healthy response to inference request in 9.658490657806396s
Received healthy response to inference request in 8.55517315864563s
Received healthy response to inference request in 4.924682378768921s
10 requests
2 failed requests
5th percentile: 4.562366569042206
10th percentile: 4.858806777000427
20th percentile: 5.946685361862182
30th percentile: 7.272395849227905
40th percentile: 8.05460238456726
50th percentile: 8.412736177444458
60th percentile: 8.996500158309935
70th percentile: 9.741962933540345
80th percentile: 11.970255231857301
90th percentile: 20.105250310897826
95th percentile: 20.109302377700807
99th percentile: 20.112544031143187
mean time: 9.976225090026855
%s, retrying in %s seconds...
Received healthy response to inference request in 7.219029903411865s
Received healthy response to inference request in 7.900971412658691s
Received healthy response to inference request in 7.851586103439331s
Received healthy response to inference request in 8.838312864303589s
Received healthy response to inference request in 8.445942401885986s
Received healthy response to inference request in 4.031769037246704s
Received healthy response to inference request in 7.764102935791016s
Received healthy response to inference request in 9.319502115249634s
Received healthy response to inference request in 7.270394563674927s
Received healthy response to inference request in 4.341033697128296s
10 requests
0 failed requests
5th percentile: 4.17093813419342
10th percentile: 4.310107231140137
20th percentile: 6.6434306621551515
30th percentile: 7.2549851655960085
40th percentile: 7.56661958694458
50th percentile: 7.807844519615173
60th percentile: 7.871340227127075
70th percentile: 8.06446270942688
80th percentile: 8.524416494369508
90th percentile: 8.886431789398193
95th percentile: 9.102966952323913
99th percentile: 9.27619508266449
mean time: 7.298264503479004
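The summary for this second batch of 10 requests can be checked against the individual response times logged above it. The sketch below assumes the percentiles use linear interpolation between the two nearest sorted samples; that rule is inferred from the numbers, not confirmed by the log, and the `percentile` helper is a hypothetical stand-in for whatever the StressChecker stage actually calls.

```python
# Response times (seconds) from the ten healthy requests in the second batch.
times = [
    7.219029903411865,
    7.900971412658691,
    7.851586103439331,
    8.838312864303589,
    8.445942401885986,
    4.031769037246704,
    7.764102935791016,
    9.319502115249634,
    7.270394563674927,
    4.341033697128296,
]


def percentile(values, q):
    """Return the q-th percentile (0 to 100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(times) / len(times)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(times, q)}")
print(f"mean time: {mean_time}")
```

Under that assumption the computed 5th, 50th, and 90th percentiles and the mean agree with the logged values up to floating-point rounding. The first batch cannot be reproduced the same way, because the durations of its two failed requests are not logged individually.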
Pipeline stage StressChecker completed in 175.99s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.57s
Shutdown handler de-registered
function_sehur_2025-12-21 status is now deployed due to DeploymentManager action
function_sehur_2025-12-21 status is now inactive due to auto deactivation removed underperforming models
function_sehur_2025-12-21 status is now torndown due to DeploymentManager action