developer_uid: chai_evaluation_service
submission_id: function_peges_2025-12-16
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-19T10:21:25+00:00
num_battles: 7899
num_wins: 4078
celo_rating: 1304.39
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-19
win_ratio: 0.5162678820103811
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
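The formatter templates above can be read as pieces of a single prompt string. The following is a minimal sketch of how they might be combined, not the service's actual implementation: the function `build_prompt`, its arguments, and the ordering (memory, then prompt, then conversation turns, then the response template) are assumptions made for illustration, and truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry above (truncate_by_message omitted).
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates into one prompt string.

    `turns` is a list of (speaker, message) pairs where speaker is
    either "bot" or "user". The assembly order is an assumption.
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues
    # the bot's next line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    FORMATTER,
    bot_name="richard",
    user_name="User",
    memory="mem",
    prompt="p",
    turns=[("user", "hi"), ("bot", "hello")],
)
print(example)
```

Because `stopping_words` is `['\n']`, generation would stop at the first newline, so each reply continues the final `richard:` line and stays on a single line.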
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.0261924266815186s
Received healthy response to inference request in 3.661266803741455s
Received healthy response to inference request in 3.3918702602386475s
Received healthy response to inference request in 2.744708299636841s
Received healthy response to inference request in 3.658121109008789s
Received healthy response to inference request in 3.0456714630126953s
Received healthy response to inference request in 4.141419410705566s
Received healthy response to inference request in 3.164707660675049s
Received healthy response to inference request in 2.577941656112671s
10 requests
1 failed requests
5th percentile: 2.2744795799255373
10th percentile: 2.5227667331695556
20th percentile: 2.711354970932007
30th percentile: 2.955382513999939
40th percentile: 3.1170931816101075
50th percentile: 3.278288960456848
60th percentile: 3.498370599746704
70th percentile: 3.6590648174285887
80th percentile: 3.7572973251342776
90th percentile: 5.740055918693537
95th percentile: 12.933920204639417
99th percentile: 18.689011633396152
mean time: 4.853968358039856
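The first run logs only nine healthy timings, yet the summary covers 10 requests. The reported mean suggests that the failed request's elapsed time was counted as well. The sketch below is simple arithmetic on the logged values; that the stress checker includes failed requests in its statistics is an inference from these numbers, not something the log states.

```python
# Healthy response times logged for the first run, in seconds.
healthy = [
    2.0261924266815186,
    3.661266803741455,
    3.3918702602386475,
    2.744708299636841,
    3.658121109008789,
    3.0456714630126953,
    4.141419410705566,
    3.164707660675049,
    2.577941656112671,
]
reported_mean = 4.853968358039856  # "mean time" from the summary
total_requests = 10                # "10 requests" from the summary

# If the mean covers all 10 requests, the remaining time belongs to the
# one failed request.
implied_failed_time = reported_mean * total_requests - sum(healthy)
print(f"implied duration of the failed request: {implied_failed_time:.4f}s")
```

The implied value is a little over 20 seconds, which lines up with the `read timeout=20` in the connection error and would explain why the 95th and 99th percentiles sit far above every healthy timing.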
%s, retrying in %s seconds...
Received healthy response to inference request in 8.18202805519104s
Received healthy response to inference request in 3.277771234512329s
Received healthy response to inference request in 2.187404155731201s
Received healthy response to inference request in 3.201869487762451s
Received healthy response to inference request in 1.6884479522705078s
Received healthy response to inference request in 4.772605895996094s
Received healthy response to inference request in 3.6025431156158447s
Received healthy response to inference request in 2.1439270973205566s
Received healthy response to inference request in 4.508187770843506s
Received healthy response to inference request in 2.576941728591919s
10 requests
0 failed requests
5th percentile: 1.8934135675430297
10th percentile: 2.0983791828155516
20th percentile: 2.1787087440490724
30th percentile: 2.4600804567337033
40th percentile: 2.951898384094238
50th percentile: 3.23982036113739
60th percentile: 3.407679986953735
70th percentile: 3.874236512184143
80th percentile: 4.561071395874023
90th percentile: 5.113548111915587
95th percentile: 6.64778808355331
99th percentile: 7.875180060863495
mean time: 3.614172649383545
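The percentile and mean figures for this second run can be reproduced from the ten logged timings. The sketch below uses linear interpolation between sorted samples, which is also the default method of `numpy.percentile`; whether the service itself uses NumPy is an assumption, and the helper names `percentile` and `summarize` are illustrative.

```python
def percentile(values, q):
    """Percentile by linear interpolation between the sorted samples."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() needs at least one value")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def summarize(values):
    """Return the same set of statistics that the log summary lists."""
    stats = {}
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        stats[f"{q}th percentile"] = percentile(values, q)
    stats["mean time"] = sum(values) / len(values)
    return stats


# Response times logged for the second run, in seconds.
latencies = [
    8.18202805519104,
    3.277771234512329,
    2.187404155731201,
    3.201869487762451,
    1.6884479522705078,
    4.772605895996094,
    3.6025431156158447,
    2.1439270973205566,
    4.508187770843506,
    2.576941728591919,
]

summary = summarize(latencies)
for name, value in summary.items():
    print(f"{name}: {value}")
```

With only ten samples, the upper percentiles are dominated by the single slowest request (8.18s here), so the 95th and 99th values are interpolations towards that one timing rather than stable tail estimates.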
Pipeline stage StressChecker completed in 88.67s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.57s
Shutdown handler de-registered
function_peges_2025-12-16 status is now deployed due to DeploymentManager action
function_peges_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_peges_2025-12-16 status is now torndown due to DeploymentManager action