developer_uid: chai_evaluation_service
submission_id: function_defib_2025-12-18
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-21T08:21:20+00:00
num_battles: 7892
num_wins: 3985
celo_rating: 1296.76
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.5049417131272175
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
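The formatter entry above only lists templates; the record does not say how the service joins them. As a minimal sketch, assuming the pieces are concatenated in the order memory, prompt, chat turns, response header, a prompt could be assembled as below. The function `build_prompt`, the turn structure, and all sample strings are hypothetical and are not part of the evaluation service.

```python
# Templates copied from the formatter entry in this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an ASSUMED order.

    `turns` is a list of (speaker, message) pairs where speaker is
    "bot" or "user". The real service may order or truncate differently.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "<bot_name>:" so the model continues the line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample values are illustrative only.
example = build_prompt(
    memory="A calm ship doctor.",
    prompt="The user has just boarded.",
    turns=[("user", "Hello?"), ("bot", "Welcome aboard."), ("user", "Where are we?")],
    bot_name="richard",
    user_name="User",
)
print(example)
```

Because generation_params sets stopping_words to ['\n'], a completion of such a prompt would end at the first newline, so each reply is a single line of at most 64 output tokens.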
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.3917503356933594s
Received healthy response to inference request in 2.8107409477233887s
Received healthy response to inference request in 2.347750663757324s
Received healthy response to inference request in 1.8092522621154785s
Received healthy response to inference request in 2.256939172744751s
Received healthy response to inference request in 2.2442240715026855s
Received healthy response to inference request in 1.787036657333374s
Received healthy response to inference request in 2.2854256629943848s
Received healthy response to inference request in 2.5627622604370117s
Received healthy response to inference request in 1.670823097229004s
10 requests
0 failed requests
5th percentile: 1.7231191992759705
10th percentile: 1.775415301322937
20th percentile: 1.8048091411590577
30th percentile: 2.113732528686523
40th percentile: 2.2518531322479247
50th percentile: 2.271182417869568
60th percentile: 2.3103556632995605
70th percentile: 2.360950565338135
80th percentile: 2.4259527206420897
90th percentile: 2.587560129165649
95th percentile: 2.6991505384445187
99th percentile: 2.7884228658676147
mean time: 2.216670513153076
Pipeline stage StressChecker completed in 23.40s
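The StressChecker summary above can be re-derived from the ten logged response times. The log does not state which percentile method the service uses; the sketch below assumes linear interpolation between order statistics (the default behaviour of NumPy's `percentile`), which agrees with the reported figures. The `percentile` helper is written here for illustration only.

```python
import math

# Response times in seconds, copied from the StressChecker log lines.
latencies = [
    2.3917503356933594,
    2.8107409477233887,
    2.347750663757324,
    1.8092522621154785,
    2.256939172744751,
    2.2442240715026855,
    1.787036657333374,
    2.2854256629943848,
    2.5627622604370117,
    1.670823097229004,
]


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation (assumed method)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into sorted data
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_latency = sum(latencies) / len(latencies)
p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)

print(f"mean={mean_latency:.6f} p5={p5:.6f} p50={p50:.6f} p90={p90:.6f}")
```

With only ten samples, the upper percentiles are dominated by the single slowest request (about 2.81 s), so they are a coarse estimate of tail latency.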
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.93s
Shutdown handler de-registered
function_defib_2025-12-18 status is now deployed due to DeploymentManager action
function_defib_2025-12-18 status is now inactive due to auto deactivation (removed underperforming models)
function_defib_2025-12-18 status is now torndown due to DeploymentManager action