developer_uid: chai_evaluation_service
submission_id: function_tumak_2025-12-17
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-20T01:21:21+00:00
num_battles: 8454
num_wins: 4282
celo_rating: 1297.78
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-19
win_ratio: 0.5065057960728649
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
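The formatter entry above only lists template strings, so a minimal sketch of how they might be combined into one model input is given here. The assembly order (memory, prompt, chat turns, response header) and the names `build_prompt`, `turns`, and the sample persona text are assumptions for illustration, not the evaluation service's actual code; only the template strings themselves are taken from the log.

```python
# Template strings copied from the formatter line of this submission.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name, fmt=FORMATTER):
    """Hypothetical helper: join the templates in an assumed order."""
    parts = [
        fmt["memory_template"].format(memory=memory),
        fmt["prompt_template"].format(prompt=prompt),
    ]
    # Each turn is a (speaker, message) pair; speaker is "bot" or "user".
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(fmt["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(fmt["user_template"].format(user_name=user_name, message=message))
    # The response header ends with the bot name so the model continues the line.
    parts.append(fmt["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    # Sample values are invented for illustration only.
    print(build_prompt("A friendly pirate.", "On a ship.",
                       [("user", "Hi"), ("bot", "Ahoy")], "richard", "Sam"))
```

Because stopping_words is ['\n'] in generation_params, a completion under this layout would end at the first newline, which is consistent with one chat message per response.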
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.825716018676758s
Received healthy response to inference request in 3.2837777137756348s
Received healthy response to inference request in 2.614680528640747s
Received healthy response to inference request in 2.745476245880127s
Received healthy response to inference request in 2.8354475498199463s
Received healthy response to inference request in 1.6254487037658691s
Received healthy response to inference request in 2.050675868988037s
Received healthy response to inference request in 2.2890284061431885s
Received healthy response to inference request in 1.6457481384277344s
Received healthy response to inference request in 2.762120485305786s
10 requests
0 failed requests
5th percentile: 1.6345834493637086
10th percentile: 1.6437181949615478
20th percentile: 1.9696903228759766
30th percentile: 2.217522644996643
40th percentile: 2.4844196796417237
50th percentile: 2.680078387260437
60th percentile: 2.7521339416503907
70th percentile: 2.7811991453170775
80th percentile: 2.8276623249053956
90th percentile: 2.880280566215515
95th percentile: 3.0820291399955746
99th percentile: 3.243427999019623
mean time: 2.4678119659423827
Pipeline stage StressChecker completed in 26.02s
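The StressChecker summary above can be reproduced from the ten logged response times. The sketch below assumes the percentiles use linear interpolation between order statistics (the default behaviour of NumPy's percentile function); that assumption matches the logged figures, but the log does not state which method the pipeline uses. The helper name `percentile` is hypothetical.

```python
import math

# Response times in seconds, copied from the ten "healthy response" log lines.
latencies = [
    2.825716018676758, 3.2837777137756348, 2.614680528640747,
    2.745476245880127, 2.8354475498199463, 1.6254487037658691,
    2.050675868988037, 2.2890284061431885, 1.6457481384277344,
    2.762120485305786,
]


def percentile(values, q):
    """q-th percentile using linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only ten samples, the 95th and 99th percentiles are interpolated between the two slowest requests, so they say little about tail latency under real load.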
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.82s
Shutdown handler de-registered
function_tumak_2025-12-17 status is now deployed due to DeploymentManager action
function_tumak_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_tumak_2025-12-17 status is now torndown due to DeploymentManager action