developer_uid: chai_evaluation_service
submission_id: function_fufef_2025-12-18
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-21T23:31:15+00:00
num_battles: 6418
num_wins: 3154
celo_rating: 1287.34
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.4914303521346214
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
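The formatter entry above only lists the templates. It does not say how they are combined. The sketch below is one plausible assembly, offered as an assumption rather than the service's actual code: memory first, then the prompt, then the chat turns, then the response header. The function `build_prompt`, the `turns` structure, and the ordering are all hypothetical. Only the template strings come from the record.

```python
# Template strings copied from the formatter entry above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical assembly order: memory, prompt, chat turns, response header.

    `turns` is an assumed structure: a list of (speaker, message) pairs,
    where speaker is either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    memory="M",
    prompt="P",
    turns=[("user", "hi"), ("bot", "hello")],
    bot_name="Bot",
    user_name="User",
)
print(example)
```

Because `stopping_words` in `generation_params` is `['\n']`, generation would presumably stop at the first newline, so a prompt ending in `{bot_name}:` should yield a single-line reply of at most 64 output tokens.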
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.5743086338043213s
Received healthy response to inference request in 1.948490858078003s
Received healthy response to inference request in 3.429121255874634s
Received healthy response to inference request in 2.922104835510254s
Received healthy response to inference request in 2.015834331512451s
Received healthy response to inference request in 3.299931287765503s
Received healthy response to inference request in 1.8357787132263184s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 4.616009712219238s
Received healthy response to inference request in 2.6414060592651367s
Received healthy response to inference request in 3.467498302459717s
10 requests
0 failed requests
5th percentile: 1.8864991784095764
10th percentile: 1.9372196435928344
20th percentile: 2.0023656368255613
30th percentile: 2.453734540939331
40th percentile: 2.809825325012207
50th percentile: 3.1110180616378784
60th percentile: 3.351607275009155
70th percentile: 3.4406343698501587
80th percentile: 3.4888603687286377
90th percentile: 3.6784787416458125
95th percentile: 4.147244226932525
99th percentile: 4.522256615161896
mean time: 2.9750483989715577
Pipeline stage StressChecker completed in 31.08s
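The StressChecker summary above can be reproduced from the ten logged response times. The sketch below assumes the percentiles use linear interpolation between order statistics (the default in common numerical libraries). The log does not state the method, and the helper `percentile` is written here for illustration only.

```python
from statistics import fmean

# Response times in seconds, copied from the ten "healthy response" log lines.
latencies = [
    3.5743086338043213, 1.948490858078003, 3.429121255874634,
    2.922104835510254, 2.015834331512451, 3.299931287765503,
    1.8357787132263184, 4.616009712219238, 2.6414060592651367,
    3.467498302459717,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) by linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {fmean(latencies)}")
```

Under that assumption the computed values agree with the logged percentiles and mean up to floating-point rounding, which suggests the checker interpolates rather than picking the nearest sample.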
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
function_fufef_2025-12-18 status is now deployed due to DeploymentManager action
function_fufef_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_fufef_2025-12-18 status is now torndown due to DeploymentManager action