developer_uid: chai_evaluation_service
submission_id: function_kapen_2025-12-14
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-17T16:41:15+00:00
num_battles: 6309
num_wins: 3181
celo_rating: 1297.21
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-14
win_ratio: 0.5042003487081946
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
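Note: a minimal sketch of how the formatter templates above could be assembled into a single model prompt. The template strings are copied from the `formatter` entry. Everything else is an assumption for illustration and is not taken from the evaluation service: the `build_prompt` helper, the assembly order (memory, prompt, conversation turns, response header), and the sample names and messages.

```python
# Template strings copied from the formatter entry in the metadata.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: concatenate the rendered templates in order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user". The ordering used here is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues
    # the line as the bot.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Illustrative values only; none of these appear in the log.
example = build_prompt(
    memory="M",
    prompt="P",
    turns=[("user", "hi")],
    bot_name="Bot",
    user_name="User",
)
```

Because the rendered prompt ends with the bot's name and a colon, a `stopping_words` value of `['\n']` would plausibly cut generation at the end of the bot's first line, although the log itself does not confirm how the service applies it.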
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.983026504516602s
Received healthy response to inference request in 3.5675108432769775s
Received healthy response to inference request in 3.4512877464294434s
Received healthy response to inference request in 4.188388109207153s
Received healthy response to inference request in 3.2052817344665527s
Received healthy response to inference request in 3.4984071254730225s
Received healthy response to inference request in 2.8039817810058594s
Received healthy response to inference request in 4.621071100234985s
Received healthy response to inference request in 6.976277589797974s
Received healthy response to inference request in 5.607697010040283s
10 requests
0 failed requests
5th percentile: 2.9845667600631716
10th percentile: 3.1651517391204833
20th percentile: 3.402086544036865
30th percentile: 3.4842713117599486
40th percentile: 3.5398693561553953
50th percentile: 3.8779494762420654
60th percentile: 4.361461305618286
70th percentile: 4.72965772151947
80th percentile: 5.107960605621338
90th percentile: 5.7445550680160515
95th percentile: 6.360416328907012
99th percentile: 6.853105337619782
mean time: 4.290292954444885
Pipeline stage StressChecker completed in 44.94s
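Note: the StressChecker summary above can be reproduced from the ten logged response times. The sketch below assumes percentiles are computed by linear interpolation between the two nearest ranks. That assumption is consistent with the reported values but is not confirmed by the log. The `percentile` helper is hypothetical and written with the standard library only.

```python
import math

# Response times in seconds, copied from the ten "healthy response" log lines.
latencies = [
    4.983026504516602,
    3.5675108432769775,
    3.4512877464294434,
    4.188388109207153,
    3.2052817344665527,
    3.4984071254730225,
    2.8039817810058594,
    4.621071100234985,
    6.976277589797974,
    5.607697010040283,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile with linear interpolation.

    The fractional rank (n - 1) * q / 100 is split into an integer index
    and a fraction, and the result is interpolated between neighbours.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)
median = percentile(latencies, 50)
p90 = percentile(latencies, 90)
```

With only 10 requests, the upper percentiles (95th, 99th) are interpolated almost entirely from the two slowest responses, so they are better read as a rough indication than as a stable tail-latency estimate.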
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
function_kapen_2025-12-14 status is now deployed due to DeploymentManager action
function_kapen_2025-12-14 status is now inactive due to auto deactivation removed underperforming models
function_kapen_2025-12-14 status is now torndown due to DeploymentManager action