developer_uid: chai_evaluation_service
submission_id: function_rofut_2025-12-17
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-20T20:41:09+00:00
num_battles: 8312
num_wins: 4020
celo_rating: 1281.73
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.4836381135707411
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
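The formatter above defines five templates. The log does not show how the service combines them, so the following is only a minimal sketch that assumes they are concatenated in the order memory, prompt, chat turns, response. The function `build_prompt` and its `turns` argument are hypothetical names introduced for illustration. Truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry in the metadata above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Assemble a model input string (hypothetical helper).

    Assumption: sections are joined in the order memory, prompt,
    chat turns, response. `turns` is a list of (speaker, message)
    pairs where speaker is "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues
    # the bot's line; with stopping_words ['\n'] generation would end at
    # the first newline.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("You are richard.", "A casual chat.",
                       [("user", "hi"), ("bot", "hello")], "richard", "User"))
```

Because the response template leaves the bot's line open and the stop word is a newline, each generation under this assumed layout would be a single line of at most `max_output_tokens` tokens.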
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.095500946044922s
Received healthy response to inference request in 1.8599817752838135s
Received healthy response to inference request in 2.5225443840026855s
Received healthy response to inference request in 2.270493268966675s
Received healthy response to inference request in 2.5069503784179688s
Received healthy response to inference request in 4.320828199386597s
Received healthy response to inference request in 2.4030542373657227s
Received healthy response to inference request in 3.143038749694824s
Received healthy response to inference request in 2.152514696121216s
Received healthy response to inference request in 3.7475838661193848s
10 requests
0 failed requests
5th percentile: 1.9916215896606446
10th percentile: 2.1232614040374758
20th percentile: 2.246897554397583
30th percentile: 2.363285946846008
40th percentile: 2.46539192199707
50th percentile: 2.514747381210327
60th percentile: 2.7517270088195795
70th percentile: 3.1097622871398927
80th percentile: 3.2639477729797366
90th percentile: 3.804908299446106
95th percentile: 4.062868249416351
99th percentile: 4.269236209392548
mean time: 2.802249050140381
Pipeline stage StressChecker completed in 29.62s
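The percentile and mean figures reported by StressChecker can be reproduced from the ten response times logged above. The sketch below assumes linear interpolation between ordered samples, which is consistent with the logged values. Whether the service itself computes them this way is not stated in the log, and the helper name `percentile` is illustrative.

```python
# Response times in seconds, copied from the StressChecker log lines.
latencies = [
    3.095500946044922, 1.8599817752838135, 2.5225443840026855,
    2.270493268966675, 2.5069503784179688, 4.320828199386597,
    2.4030542373657227, 3.143038749694824, 2.152514696121216,
    3.7475838661193848,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between the two nearest ordered samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are interpolated between the two slowest requests, so the 95th and 99th values say little about tail latency under real load.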
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.84s
Shutdown handler de-registered
function_rofut_2025-12-17 status is now deployed due to DeploymentManager action
function_rofut_2025-12-17 status is now inactive due to auto-deactivation (removal of underperforming models)
function_rofut_2025-12-17 status is now torndown due to DeploymentManager action