developer_uid: chai_evaluation_service
submission_id: function_tusar_2025-12-18
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-21T13:06:24+00:00
num_battles: 8667
num_wins: 4369
celo_rating: 1296.14
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.5040959963078343
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
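The formatter entry above only lists templates; the record does not show how they are combined. The following is a minimal sketch of one plausible assembly, assuming the order memory, prompt, chat turns, response header. The function `build_prompt`, its arguments, and the sample names are hypothetical and are not part of the evaluation service.

```python
# Templates copied from the formatter entry in the record above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Assemble one model input (hypothetical helper).

    The ordering used here is an assumption; the record only
    provides the templates themselves.
    `turns` is a list of (speaker, message) pairs, where speaker
    is either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    text = build_prompt(
        memory="You are Richard.",
        prompt="A casual chat.",
        turns=[("user", "Hi"), ("bot", "Hello")],
        bot_name="Richard",
        user_name="Sam",
    )
    print(text)
```

Because generation_params sets stopping_words to ['\n'], generation would end at the first newline, so under this layout each reply is a single line following the `{bot_name}:` header.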
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.5895724296569824s
Received healthy response to inference request in 2.3999428749084473s
Received healthy response to inference request in 4.204482555389404s
Received healthy response to inference request in 4.003351449966431s
Received healthy response to inference request in 3.708404779434204s
Received healthy response to inference request in 3.3545243740081787s
Received healthy response to inference request in 2.233081817626953s
Received healthy response to inference request in 3.094273805618286s
Received healthy response to inference request in 4.327811002731323s
Received healthy response to inference request in 2.8005034923553467s
10 requests
0 failed requests
5th percentile: 2.3081692934036253
10th percentile: 2.383256769180298
20th percentile: 2.551646518707275
30th percentile: 2.737224173545837
40th percentile: 2.97676568031311
50th percentile: 3.2243990898132324
60th percentile: 3.4960765361785886
70th percentile: 3.796888780593872
80th percentile: 4.043577671051025
90th percentile: 4.216815400123596
95th percentile: 4.272313201427459
99th percentile: 4.316711442470551
mean time: 3.2715948581695558
Pipeline stage StressChecker completed in 34.05s
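The StressChecker percentiles above are consistent with linear interpolation between sorted samples (the default method in NumPy's `percentile`). The sketch below recomputes them from the ten logged latencies using only the standard library. The helper `percentile` is written for illustration and is an assumption about how the figures were derived, not the service's own code.

```python
# Latencies in seconds, copied from the ten "healthy response" log lines above.
LATENCIES = [
    2.5895724296569824,
    2.3999428749084473,
    4.204482555389404,
    4.003351449966431,
    3.708404779434204,
    3.3545243740081787,
    2.233081817626953,
    3.094273805618286,
    4.327811002731323,
    2.8005034923553467,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    The fractional rank (n - 1) * q / 100 is located in the sorted
    data, and the result is interpolated between the two neighbours.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def mean(values):
    """Arithmetic mean of the samples."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {mean(LATENCIES)}")
```

With only ten samples, the tail percentiles (95th, 99th) are interpolated between the two slowest requests, so they mostly reflect the single slowest response rather than a stable tail estimate.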
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_tusar_2025-12-18 status is now deployed due to DeploymentManager action
function_tusar_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_tusar_2025-12-18 status is now torndown due to DeploymentManager action