developer_uid: chai_evaluation_service
submission_id: function_rukem_2025-12-17
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-20T15:06:31+00:00
num_battles: 9130
num_wins: 4572
celo_rating: 1293.84
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.5007667031763418
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
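For illustration, the formatter templates above could be assembled into a single prompt string roughly as follows. This is a minimal sketch under stated assumptions, not the evaluation service's actual code: the function name `build_prompt`, the ordering of the sections (memory, prompt, conversation turns, response header), and the `(speaker, message)` turn format are all hypothetical. Truncation to `max_input_tokens` is not modelled.

```python
# Template strings copied from the formatter entry in this record.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, bot_name, user_name, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user" (an assumed convention for this sketch).
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    formatter,
    memory="You are a friendly assistant.",
    prompt="A casual chat.",
    bot_name="richard",
    user_name="User",
    turns=[("user", "hello"), ("bot", "hi there"), ("user", "how are you?")],
)
print(example)
```

Because `stopping_words` is `['\n']` in the generation parameters, a prompt ending in `richard:` would yield a single-line reply under this layout.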
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.6653714179992676s
Received healthy response to inference request in 2.1502439975738525s
Received healthy response to inference request in 1.715841293334961s
Received healthy response to inference request in 4.700356960296631s
Received healthy response to inference request in 2.552217483520508s
Received healthy response to inference request in 2.7216908931732178s
Received healthy response to inference request in 2.0261526107788086s
Received healthy response to inference request in 3.6799662113189697s
Received healthy response to inference request in 3.6484556198120117s
Received healthy response to inference request in 1.8574309349060059s
10 requests
0 failed requests
5th percentile: 1.7795566320419312
10th percentile: 1.8432719707489014
20th percentile: 1.992408275604248
30th percentile: 2.1130165815353394
40th percentile: 2.391428089141846
50th percentile: 2.636954188346863
60th percentile: 3.0923967838287347
70th percentile: 3.6535303592681885
80th percentile: 3.668290376663208
90th percentile: 3.7820052862167355
95th percentile: 4.241181123256682
99th percentile: 4.6085217928886415
mean time: 2.8717727422714234
Pipeline stage StressChecker completed in 30.07s
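The StressChecker percentiles and mean above are consistent with linear interpolation between sorted samples over the ten logged response times. The sketch below reproduces them from the logged values; the helper name `percentile` is hypothetical, and the assumption that the pipeline uses this interpolation rule is inferred from the numbers, not confirmed by the log.

```python
import math

# Response times (seconds) copied from the ten StressChecker log lines.
latencies = [
    3.6653714179992676,
    2.1502439975738525,
    1.715841293334961,
    4.700356960296631,
    2.552217483520508,
    2.7216908931732178,
    2.0261526107788086,
    3.6799662113189697,
    3.6484556198120117,
    1.8574309349060059,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted samples.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only ten samples, the upper percentiles (95th, 99th) are dominated by the single slowest request of about 4.70 s, so they are best read as rough indicators.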
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.66s
Shutdown handler de-registered
function_rukem_2025-12-17 status is now deployed due to DeploymentManager action
function_rukem_2025-12-17 status is now inactive due to auto deactivation (removal of underperforming models)
function_rukem_2025-12-17 status is now torndown due to DeploymentManager action