developer_uid: chai_evaluation_service
submission_id: function_gagal_2025-12-15
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-18T16:21:16+00:00
num_battles: 8738
num_wins: 4373
celo_rating: 1293.56
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-18
win_ratio: 0.5004577706569009
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
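
The formatter entry above only lists templates; it does not show how they are combined. The following is a minimal sketch of one plausible assembly order (memory, prompt, chat turns, then the response header). The `build_prompt` helper, the turn representation, and the sample strings are assumptions made for illustration and are not taken from the evaluation service.

```python
# Templates copied from the formatter entry in the metadata above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in one assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues
    # the bot's line; the '\n' stopping word then ends generation after one line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    FORMATTER,
    memory="Richard is a friendly guide.",  # made-up sample text
    prompt="A short chat.",                  # made-up sample text
    turns=[("user", "Hi"), ("bot", "Hello!")],
    bot_name="richard",
    user_name="User",
)
print(example)
```

Truncation to max_input_tokens (1024) is not modeled here; with truncate_by_message set to False, the service presumably truncates at the token level rather than dropping whole messages, but that behavior is not shown in this log.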
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.5946247577667236s
Received healthy response to inference request in 1.8749067783355713s
Received healthy response to inference request in 2.4822463989257812s
Received healthy response to inference request in 2.615676164627075s
Received healthy response to inference request in 2.6649656295776367s
Received healthy response to inference request in 3.210023880004883s
Received healthy response to inference request in 2.891150951385498s
Received healthy response to inference request in 2.5681886672973633s
Received healthy response to inference request in 2.5037078857421875s
Received healthy response to inference request in 2.1376454830169678s
10 requests
0 failed requests
5th percentile: 1.9931391954421998
10th percentile: 2.111371612548828
20th percentile: 2.4133262157440187
30th percentile: 2.4972694396972654
40th percentile: 2.542396354675293
50th percentile: 2.5814067125320435
60th percentile: 2.6030453205108643
70th percentile: 2.6304630041122437
80th percentile: 2.710202693939209
90th percentile: 2.9230382442474365
95th percentile: 3.0665310621261592
99th percentile: 3.181325316429138
mean time: 2.5543136596679688
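
The percentile and mean figures above can be reproduced from the ten logged response times. The sketch below assumes percentiles are computed by linear interpolation between order statistics, which is consistent with the reported values; the `percentile` helper is written here for illustration and is not the StressChecker's own code.

```python
import math

# Response times (seconds) copied from the ten "healthy response" log lines.
latencies = [
    2.5946247577667236,
    1.8749067783355713,
    2.4822463989257812,
    2.615676164627075,
    2.6649656295776367,
    3.210023880004883,
    2.891150951385498,
    2.5681886672973633,
    2.5037078857421875,
    2.1376454830169678,
]


def percentile(values, q):
    """Percentile by linear interpolation, with q in the range [0, 100]."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted data
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles (95th, 99th) are interpolated between the two slowest requests, so they describe this run rather than the tail latency of the deployment.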
Pipeline stage StressChecker completed in 27.25s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_gagal_2025-12-15 status is now deployed due to DeploymentManager action
function_gagal_2025-12-15 status is now inactive due to auto deactivation removed underperforming models
function_gagal_2025-12-15 status is now torndown due to DeploymentManager action