developer_uid: chai_evaluation_service
submission_id: function_nusab_2025-12-18
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-21T18:51:19+00:00
num_battles: 9910
num_wins: 4867
celo_rating: 1287.08
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.49112008072653884
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
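The formatter entry above is a set of string templates with named placeholders. As a rough illustration only, the sketch below shows one way such templates could be stitched into a single prompt. The assembly order (memory, prompt, chat turns, response header), the `build_prompt` helper, and the sample conversation are assumptions made for this example and are not taken from the record; truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry in the record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: fill each template and concatenate the pieces.

    `turns` is a list of (is_bot, message) pairs in chronological order.
    The concatenation order is an assumption, not something the record states.
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for is_bot, message in turns:
        if is_bot:
            parts.append(formatter["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=user_name, message=message))
    # The response header ends with "<bot_name>:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    FORMATTER,
    memory="mem",
    prompt="scene",
    turns=[(False, "hi"), (True, "hello")],
    bot_name="richard",
    user_name="Anna",
)
print(example)
```

Because `stopping_words` contains a newline, generation under these parameters would stop at the end of the bot's first line, which fits the one-message-per-line layout of the bot and user templates.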
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.538240432739258s
Received healthy response to inference request in 2.86802339553833s
Received healthy response to inference request in 2.369328260421753s
Received healthy response to inference request in 3.1634576320648193s
Received healthy response to inference request in 2.7873220443725586s
Received healthy response to inference request in 2.1828174591064453s
Received healthy response to inference request in 3.418689012527466s
Received healthy response to inference request in 3.6636757850646973s
Received healthy response to inference request in 2.9618923664093018s
Received healthy response to inference request in 1.8760056495666504s
10 requests
0 failed requests
5th percentile: 2.014070963859558
10th percentile: 2.152136278152466
20th percentile: 2.3320261001586915
30th percentile: 2.4875667810440065
40th percentile: 2.6876893997192384
50th percentile: 2.8276727199554443
60th percentile: 2.9055709838867188
70th percentile: 3.022361946105957
80th percentile: 3.2145039081573485
90th percentile: 3.443187689781189
95th percentile: 3.553431737422943
99th percentile: 3.6416269755363464
mean time: 2.7829452037811278
Pipeline stage StressChecker completed in 29.44s
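The percentile and mean figures reported by StressChecker are consistent with linear interpolation between the sorted request latencies. The sketch below recomputes them from the ten response times logged above. The `percentile` helper is written for this example and is an assumption about the method, not the pipeline's actual code.

```python
import math

# Response times (seconds) copied from the ten "healthy response" log lines.
latencies = [
    2.538240432739258, 2.86802339553833, 2.369328260421753,
    3.1634576320648193, 2.7873220443725586, 2.1828174591064453,
    3.418689012527466, 3.6636757850646973, 2.9618923664093018,
    1.8760056495666504,
]


def percentile(values, p):
    """Percentile with linear interpolation between closest ranks.

    Hypothetical helper: rank = (n - 1) * p / 100, then interpolate
    between the two neighbouring sorted values.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0
    lo = math.floor(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


mean_time = sum(latencies) / len(latencies)

for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{p}th percentile: {percentile(latencies, p)}")
print(f"mean time: {mean_time}")
```

With only 10 samples, the upper percentiles are driven almost entirely by the slowest two requests, so the 95th and 99th values are best read as rough indicators rather than stable tail estimates.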
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_nusab_2025-12-18 status is now deployed due to DeploymentManager action
function_nusab_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_nusab_2025-12-18 status is now torndown due to DeploymentManager action