developer_uid: chai_evaluation_service
submission_id: function_retub_2025-12-18
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-21T10:51:16+00:00
num_battles: 8118
num_wins: 4234
celo_rating: 1308.31
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.5215570337521557
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
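
The formatter above defines five string templates. A minimal sketch of how they might be joined into a single model input follows. The ordering (memory, then prompt, then chat turns, then the response header) and the `build_prompt` helper are assumptions for illustration and are not confirmed by this record. Truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the formatter entry in this record.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: concatenate the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("persona text", "scenario text",
                       [("user", "hi"), ("bot", "hello")], "richard", "User"))
```

Because `stopping_words` is `['\n']`, generation under this layout would end at the first newline, which corresponds to a single chat line from the bot.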
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.740915060043335s
Received healthy response to inference request in 1.8378150463104248s
Received healthy response to inference request in 2.507362127304077s
Received healthy response to inference request in 2.394176721572876s
Received healthy response to inference request in 2.411388874053955s
Received healthy response to inference request in 2.5805959701538086s
Received healthy response to inference request in 1.8172824382781982s
Received healthy response to inference request in 1.7364840507507324s
Received healthy response to inference request in 2.282632827758789s
Received healthy response to inference request in 2.253225564956665s
10 requests
0 failed requests
5th percentile: 1.7384780049324036
10th percentile: 1.7404719591140747
20th percentile: 1.8020089626312257
30th percentile: 1.8316552639007568
40th percentile: 2.087061357498169
50th percentile: 2.267929196357727
60th percentile: 2.3272503852844237
70th percentile: 2.3993403673172
80th percentile: 2.4305835247039793
90th percentile: 2.5146855115890503
95th percentile: 2.5476407408714294
99th percentile: 2.574004924297333
mean time: 2.156187868118286
Pipeline stage StressChecker completed in 22.80s
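
The StressChecker percentiles and mean above can be reproduced from the ten logged response times. The sketch below uses linear interpolation between the closest ranks, which matches the reported figures. Whether the service computes them this way is an assumption, and the `percentile` helper is hypothetical.

```python
import math

# Response times in seconds, copied from the ten "healthy response" log lines.
latencies = [
    1.740915060043335, 1.8378150463104248, 2.507362127304077,
    2.394176721572876, 2.411388874053955, 2.5805959701538086,
    1.8172824382781982, 1.7364840507507324, 2.282632827758789,
    2.253225564956665,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


mean_time = sum(latencies) / len(latencies)

if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are interpolated between the two slowest requests, so they indicate a rough range and are not a stable tail estimate.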
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_retub_2025-12-18 status is now deployed due to DeploymentManager action
function_retub_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_retub_2025-12-18 status is now torndown due to DeploymentManager action