developer_uid: chai_evaluation_service
submission_id: function_busen_2025-12-14
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-17T14:01:12+00:00
num_battles: 8827
num_wins: 4409
celo_rating: 1269.6
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-14
win_ratio: 0.49949020052112836
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
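The formatter entry above lists one template per prompt section. As a rough illustration of how such templates might be combined into a single prompt, here is a minimal sketch. The function name `build_prompt`, the assembly order (memory, prompt, chat turns, response header), and the sample persona and messages are assumptions for illustration only and are not taken from the service. Truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample inputs are invented for the example.
example = build_prompt(
    memory="Richard is a helpful guide.",
    prompt="A chat.",
    turns=[("user", "Hi"), ("bot", "Hello")],
    bot_name="Richard",
    user_name="Ana",
)
print(example)
```

Because `stopping_words` is `['\n']` in the generation parameters, a completion generated from such a prompt would stop at the end of the bot's first line.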
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 9.784047365188599s
Received healthy response to inference request in 3.4470248222351074s
Received healthy response to inference request in 3.787484884262085s
Received healthy response to inference request in 3.176166534423828s
Received healthy response to inference request in 3.234611749649048s
Received healthy response to inference request in 7.891406059265137s
Received healthy response to inference request in 3.3410356044769287s
Received healthy response to inference request in 3.429658889770508s
Received healthy response to inference request in 4.566054821014404s
10 requests
1 failed requests
5th percentile: 3.202466881275177
10th percentile: 3.228767228126526
20th percentile: 3.3197508335113524
30th percentile: 3.403071904182434
40th percentile: 3.4400784492492678
50th percentile: 3.617254853248596
60th percentile: 4.098912858963012
70th percentile: 5.563660192489623
80th percentile: 8.26993432044983
90th percentile: 10.81635134220123
95th percentile: 15.461719238758075
99th percentile: 19.17801355600357
mean time: 6.276457786560059
%s, retrying in %s seconds...
Received healthy response to inference request in 2.2199747562408447s
Received healthy response to inference request in 2.1885156631469727s
Received healthy response to inference request in 2.7409918308258057s
Received healthy response to inference request in 3.583308219909668s
Received healthy response to inference request in 3.006796360015869s
Received healthy response to inference request in 4.221913814544678s
Received healthy response to inference request in 3.8549888134002686s
Received healthy response to inference request in 2.1180765628814697s
Received healthy response to inference request in 2.393664836883545s
Received healthy response to inference request in 4.213747024536133s
10 requests
0 failed requests
5th percentile: 2.149774158000946
10th percentile: 2.181471753120422
20th percentile: 2.21368293762207
30th percentile: 2.341557812690735
40th percentile: 2.6020610332489014
50th percentile: 2.8738940954208374
60th percentile: 3.2374011039733883
70th percentile: 3.6648123979568483
80th percentile: 3.9267404556274417
90th percentile: 4.214563703536987
95th percentile: 4.218238759040832
99th percentile: 4.221178803443909
mean time: 3.0541977882385254
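The percentile and mean figures of the second StressChecker run can be reproduced from the ten logged latencies with linear interpolation between order statistics. The sketch below is one way to do that check. It assumes the service uses this interpolation rule (the same rule as NumPy's default `percentile` method), which the log itself does not state. The helper name `percentile` is chosen for the example.

```python
def percentile(values, q):
    """Return the q-th percentile (0..100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    # Fractional rank into the sorted list.
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Latencies (seconds) of the ten healthy responses in the second run.
latencies = [
    2.2199747562408447,
    2.1885156631469727,
    2.7409918308258057,
    3.583308219909668,
    3.006796360015869,
    4.221913814544678,
    3.8549888134002686,
    2.1180765628814697,
    2.393664836883545,
    4.213747024536133,
]

mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 95):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

The first run's figures follow the same rule only if the timed-out request is included with a latency of about 20.1 seconds, a value inferred from the logged mean rather than logged directly.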
Pipeline stage StressChecker completed in 95.91s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.58s
Shutdown handler de-registered
function_busen_2025-12-14 status is now deployed due to DeploymentManager action
function_busen_2025-12-14 status is now inactive due to auto deactivation removed underperforming models
function_busen_2025-12-14 status is now torndown due to DeploymentManager action