developer_uid: chai_evaluation_service
submission_id: function_fahur_2025-12-15
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-18T15:51:18+00:00
num_battles: 7395
num_wins: 3646
celo_rating: 1288.35
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-18
win_ratio: 0.49303583502366466
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
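The formatter templates above can be read as a recipe for assembling the text sent to the model. The sketch below is a minimal illustration, not the service's actual code: the function name `build_prompt`, the `(speaker, message)` turn representation, and the order in which the pieces are concatenated (memory, prompt, chat turns, response header) are all assumptions. It also ignores `max_input_tokens` truncation.

```python
# Templates copied from the `formatter` field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". This representation is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues
    # the bot's next line; the '\n' stopping word then ends generation
    # after a single line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("M", "P", [("user", "hi"), ("bot", "hello")], "Bot", "Ann"))
```

Under these assumptions, the generated completion is a single bot message, since `stopping_words` contains only the newline character.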
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.3751275539398193s
Received healthy response to inference request in 2.4661948680877686s
Received healthy response to inference request in 2.1496222019195557s
Received healthy response to inference request in 2.03031063079834s
Received healthy response to inference request in 3.8972883224487305s
Received healthy response to inference request in 2.531686782836914s
Received healthy response to inference request in 3.0931637287139893s
Received healthy response to inference request in 1.9099068641662598s
Received healthy response to inference request in 2.131481170654297s
10 requests
1 failed requests
5th percentile: 1.9640885591506958
10th percentile: 2.018270254135132
20th percentile: 2.1112470626831055
30th percentile: 2.144179892539978
40th percentile: 2.284925413131714
50th percentile: 2.420661211013794
60th percentile: 2.4923916339874266
70th percentile: 2.7001298666000366
80th percentile: 3.253988647460938
90th percentile: 5.520779967308039
95th percentile: 12.82649236917494
99th percentile: 18.67106229066849
mean time: 4.271698689460754
%s, retrying in %s seconds...
Received healthy response to inference request in 3.06899356842041s
Received healthy response to inference request in 2.55885910987854s
Received healthy response to inference request in 1.8300559520721436s
Received healthy response to inference request in 3.384009599685669s
Received healthy response to inference request in 1.959514856338501s
Received healthy response to inference request in 2.4333746433258057s
Received healthy response to inference request in 1.754408597946167s
Received healthy response to inference request in 2.8229105472564697s
Received healthy response to inference request in 1.799452304840088s
Received healthy response to inference request in 2.173330068588257s
10 requests
0 failed requests
5th percentile: 1.7746782660484315
10th percentile: 1.7949479341506958
20th percentile: 1.8239352226257324
30th percentile: 1.9206771850585938
40th percentile: 2.0878039836883544
50th percentile: 2.3033523559570312
60th percentile: 2.4835684299468994
70th percentile: 2.638074541091919
80th percentile: 2.872127151489258
90th percentile: 3.1004951715469358
95th percentile: 3.242252385616302
99th percentile: 3.3556581568717956
mean time: 2.378490924835205
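The percentile and mean figures in the second StressChecker run are consistent with linear interpolation between sorted samples. The sketch below reproduces them from the ten latencies logged above; the helper name `percentile` is hypothetical, and the claim that the pipeline uses this exact method is an assumption inferred from the numbers, not something the log states.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between the two nearest ranks of the sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Latencies (seconds) of the ten healthy responses in the second run.
latencies = [
    3.06899356842041,
    2.55885910987854,
    1.8300559520721436,
    3.384009599685669,
    1.959514856338501,
    2.4333746433258057,
    1.754408597946167,
    2.8229105472564697,
    1.799452304840088,
    2.173330068588257,
]

if __name__ == "__main__":
    for q in (5, 50, 90, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {sum(latencies) / len(latencies)}")
```

In the first run, the mean of about 4.27 s against a median of about 2.42 s suggests the timed-out request was counted with a latency of roughly 20 s, which would also explain the inflated 90th to 99th percentiles there.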
Pipeline stage StressChecker completed in 70.03s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.75s
Shutdown handler de-registered
function_fahur_2025-12-15 status is now deployed due to DeploymentManager action
function_fahur_2025-12-15 status is now inactive due to auto deactivation removed underperforming models
function_fahur_2025-12-15 status is now torndown due to DeploymentManager action