developer_uid: chai_evaluation_service
submission_id: function_dilat_2025-12-15
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-18T20:21:20+00:00
num_battles: 11527
num_wins: 5667
celo_rating: 1287.28
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-18
win_ratio: 0.49162835082848966
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
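The formatter entry above lists five templates but not how they are joined. The sketch below is one plausible assembly order (memory, prompt, conversation turns, then the response header), written as an assumption rather than the service's actual code. The `build_prompt` function, the `turns` structure, and the example names are hypothetical; only the template strings come from the record.

```python
# Template strings copied from the formatter entry in the record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs where speaker is
    "bot" or "user". This layout is an illustration only.
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with the bot name so the model continues
    # the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    text = build_prompt(
        FORMATTER,
        memory="You are richard.",
        prompt="A chat.",
        turns=[("user", "hi"), ("bot", "hello")],
        bot_name="richard",
        user_name="Sam",
    )
    print(text)
```

Because `stopping_words` is `['\n']` in the generation parameters, a prompt ending in `richard:` would presumably yield a single-line reply. How `max_input_tokens` truncation interacts with `truncate_by_message: False` is not shown here.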
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.4636616706848145s
Received healthy response to inference request in 2.8387160301208496s
Received healthy response to inference request in 2.0035104751586914s
Received healthy response to inference request in 2.565948247909546s
Received healthy response to inference request in 2.441526412963867s
Received healthy response to inference request in 2.2576560974121094s
Received healthy response to inference request in 2.396533727645874s
Received healthy response to inference request in 1.8328142166137695s
Received healthy response to inference request in 3.0970661640167236s
Received healthy response to inference request in 2.2641220092773438s
10 requests
0 failed requests
5th percentile: 1.9096275329589845
10th percentile: 1.9864408493041992
20th percentile: 2.2068269729614256
30th percentile: 2.2621822357177734
40th percentile: 2.343569040298462
50th percentile: 2.4190300703048706
60th percentile: 2.4503805160522463
70th percentile: 2.494347643852234
80th percentile: 2.620501804351807
90th percentile: 2.864551043510437
95th percentile: 2.98080860376358
99th percentile: 3.073814651966095
mean time: 2.4161555051803587
Pipeline stage StressChecker completed in 25.58s
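The percentile and mean figures reported by StressChecker can be reproduced from the ten logged latencies using linear interpolation between order statistics. The sketch below assumes that method (it matches the logged values, but the record does not say which routine the stage uses); the `percentile` helper is hypothetical.

```python
# Latencies in seconds, copied from the ten "healthy response" log lines.
LATENCIES = [
    2.4636616706848145,
    2.8387160301208496,
    2.0035104751586914,
    2.565948247909546,
    2.441526412963867,
    2.2576560974121094,
    2.396533727645874,
    1.8328142166137695,
    3.0970661640167236,
    2.2641220092773438,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile via linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {sum(LATENCIES) / len(LATENCIES)}")
```

With only ten samples, the 95th and 99th percentiles are interpolated between the two slowest requests, so they are rough indications rather than stable tail estimates.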
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
function_dilat_2025-12-15 status is now deployed due to DeploymentManager action
function_dilat_2025-12-15 status is now inactive due to auto deactivation removed underperforming models
function_dilat_2025-12-15 status is now torndown due to DeploymentManager action