developer_uid: chai_evaluation_service
submission_id: function_jilet_2025-12-13
model_name: richard
model_group:
status: inactive
timestamp: 2025-12-13T11:17:25+00:00
num_battles: 9891
num_wins: 4909
celo_rating: 1301.49
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-13
win_ratio: 0.49630977656455366
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
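For reference, win_ratio above is simply num_wins / num_battles: 4909 / 9891 ≈ 0.4963. The record does not show how the formatter and generation_params are consumed at serving time; the following is a minimal sketch of one plausible reading, assuming the chat history is a list of (speaker, message) pairs. The helper name build_prompt and the commented-out client.generate call are hypothetical, not part of the submission record; only the template strings and parameter values come from the fields above.

```python
# Illustrative sketch only -- not the service's actual code.
formatter = {
    'memory_template': '### Instruction:\n{memory}\n',
    'prompt_template': '### Input:\n{prompt}\n',
    'bot_template': '{bot_name}: {message}\n',
    'user_template': '{user_name}: {message}\n',
    'response_template': '### Response:\n{bot_name}:',
}

generation_params = {
    'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40,
    'presence_penalty': 0.0, 'frequency_penalty': 0.0,
    'stopping_words': ['\n'], 'max_input_tokens': 1024,
    'best_of': 8, 'max_output_tokens': 64,
}

def build_prompt(memory, prompt, chat_history, bot_name):
    """Assemble one prompt string from the template fields (hypothetical helper)."""
    parts = [
        formatter['memory_template'].format(memory=memory),
        formatter['prompt_template'].format(prompt=prompt),
    ]
    for speaker, message in chat_history:  # (speaker, text) pairs, oldest first
        template = formatter['bot_template'] if speaker == bot_name else formatter['user_template']
        # str.format ignores unused keyword arguments, so both templates can share one call.
        parts.append(template.format(bot_name=speaker, user_name=speaker, message=message))
    # response_template ends with '{bot_name}:' so the model completes the bot's turn;
    # generation then stops at the first '\n' per stopping_words above.
    parts.append(formatter['response_template'].format(bot_name=bot_name))
    return ''.join(parts)

# Example use with made-up conversation text (not from the record):
prompt = build_prompt(
    memory='Richard is a terse assistant.',
    prompt='A short greeting.',
    chat_history=[('User', 'Hi there'), ('richard', 'Hello.')],
    bot_name='richard',
)
# Hypothetical inference call; only the parameter values come from the record:
# completion = client.generate(prompt, **generation_params)
```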
Shutdown handler not registered because Python interpreter is not running in the main thread
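The line above follows from how CPython handles signals: signal.signal() may only be called from the main thread of the main interpreter and raises ValueError elsewhere, so a pipeline started in a worker thread has to skip handler registration. A minimal sketch of that guard, assuming SIGTERM/SIGINT handlers and print-based logging; the service's actual handler body is not shown in the log.

```python
# Sketch of a thread-aware shutdown-handler guard (assumed, not the service's code).
import signal
import threading

def _shutdown(signum, frame):
    # Hypothetical cleanup hook.
    print('shutting down...')

def register_shutdown_handler():
    # signal.signal() only works from the main thread of the main interpreter,
    # hence the explicit check before registering.
    if threading.current_thread() is threading.main_thread():
        signal.signal(signal.SIGTERM, _shutdown)
        signal.signal(signal.SIGINT, _shutdown)
    else:
        print('Shutdown handler not registered because Python interpreter '
              'is not running in the main thread')
```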
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.7387125492095947s
Received healthy response to inference request in 3.1636428833007812s
Received healthy response to inference request in 4.189877033233643s
Received healthy response to inference request in 5.024547338485718s
Received healthy response to inference request in 3.8009181022644043s
Received healthy response to inference request in 3.016054153442383s
Received healthy response to inference request in 2.891401767730713s
Received healthy response to inference request in 3.3911781311035156s
Received healthy response to inference request in 2.529752731323242s
10 requests
1 failed requests
5th percentile: 2.692494797706604
10th percentile: 2.855236864089966
20th percentile: 2.991123676300049
30th percentile: 3.119366264343262
40th percentile: 3.3001640319824217
50th percentile: 3.564945340156555
60th percentile: 3.7635947704315185
70th percentile: 3.9176057815551757
80th percentile: 4.356811094284057
90th percentile: 6.536383104324336
95th percentile: 13.339644050598128
99th percentile: 18.78225280761719
mean time: 5.188898968696594
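The tail of this first batch is dominated by the single failed request: the 90th/95th/99th percentiles and the 5.19 s mean all exceed every healthy response time, which is consistent with the timed-out request being counted at roughly its 20 s read timeout. A minimal sketch that reproduces these statistics under that assumption, using numpy's default linear-interpolation percentiles; the service's actual aggregation code is not shown in the log, and the 20.14 s value for the failed request is inferred, not observed.

```python
# Sketch only: reproduces the batch statistics above under stated assumptions.
import numpy as np

healthy = [3.7387, 3.1636, 4.1899, 5.0245, 3.8009,
           3.0161, 2.8914, 3.3912, 2.5298]      # nine healthy latencies (s), from the log
latencies = healthy + [20.14]                   # assumed latency for the timed-out request

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f'{q}th percentile: {np.percentile(latencies, q)}')
print(f'mean time: {np.mean(latencies)}')
```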
%s, retrying in %s seconds...
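The retry line above suggests the checker waits and reissues the failing request before the second batch below. A minimal sketch of that pattern, assuming a plain requests POST with the 20 s read timeout seen earlier and a fixed delay; the endpoint URL, payload shape, retry count, and delay are all assumptions, not values from the log.

```python
# Sketch of a retry loop around a single inference request (assumed behaviour).
import time
import requests

def inference_request_with_retry(url, payload, retries=3, delay=5, timeout=20):
    for attempt in range(retries):
        try:
            start = time.time()
            resp = requests.post(url, json=payload, timeout=timeout)
            resp.raise_for_status()
            print(f'Received healthy response to inference request in {time.time() - start}s')
            return resp.json()
        except requests.RequestException as exc:
            print(f'{exc}, retrying in {delay} seconds...')
            time.sleep(delay)
    raise RuntimeError('inference request failed after retries')
```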
Received healthy response to inference request in 3.568481922149658s
Received healthy response to inference request in 3.295552968978882s
Received healthy response to inference request in 2.5309104919433594s
Received healthy response to inference request in 2.8165395259857178s
Received healthy response to inference request in 2.7567100524902344s
Received healthy response to inference request in 3.1529877185821533s
Received healthy response to inference request in 2.806880474090576s
Received healthy response to inference request in 1.9748902320861816s
Received healthy response to inference request in 2.87849760055542s
Received healthy response to inference request in 3.3808844089508057s
10 requests
0 failed requests
5th percentile: 2.2250993490219115
10th percentile: 2.4753084659576414
20th percentile: 2.7115501403808593
30th percentile: 2.7918293476104736
40th percentile: 2.8126759052276613
50th percentile: 2.847518563270569
60th percentile: 2.9882936477661133
70th percentile: 3.195757293701172
80th percentile: 3.3126192569732664
90th percentile: 3.399644160270691
95th percentile: 3.4840630412101743
99th percentile: 3.5515981459617616
mean time: 2.916233539581299
Pipeline stage StressChecker completed in 84.84s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.77s
Shutdown handler de-registered
function_jilet_2025-12-13 status is now deployed due to DeploymentManager action
function_jilet_2025-12-13 status is now inactive due to auto deactivation removed underperforming models