developer_uid: chai_backend_admin
submission_id: function_rosef_2025-12-17
model_name: abtest
model_group:
status: torndown
timestamp: 2025-12-20T23:51:20+00:00
num_battles: 5874
num_wins: 3274
celo_rating: 1333.06
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: abtest
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.557371467483827
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
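For readers unfamiliar with the formatter fields, the following is a minimal sketch of how these templates could be combined into a single prompt string. The assembly order (memory, prompt, chat turns, response header) and the helper name `build_prompt` are assumptions for illustration, not the platform's confirmed implementation; only the template strings come from the metadata above.

```python
# Template strings copied from the formatter metadata above.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (role, name, message) tuples, role being "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for role, name, message in turns:
        if role == "bot":
            parts.append(formatter["bot_template"].format(bot_name=name, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=name, message=message))
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    memory="M",
    prompt="P",
    turns=[("user", "Ann", "Hi"), ("bot", "Bot", "Hello")],
    bot_name="Bot",
)
print(example)
```

Because `stopping_words` in generation_params is `['\n']`, a completion of such a prompt would presumably end at the first newline, giving a single-line reply.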
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9125609397888184s
Received healthy response to inference request in 1.9790318012237549s
Received healthy response to inference request in 2.2709708213806152s
Received healthy response to inference request in 1.730123519897461s
Received healthy response to inference request in 1.844486951828003s
Received healthy response to inference request in 1.595191478729248s
Received healthy response to inference request in 1.5384995937347412s
Received healthy response to inference request in 2.3935883045196533s
Received healthy response to inference request in 2.2449800968170166s
Received healthy response to inference request in 1.8627867698669434s
10 requests
0 failed requests
5th percentile: 1.5640109419822692
10th percentile: 1.5895222902297974
20th percentile: 1.7031371116638183
30th percentile: 1.8101779222488403
40th percentile: 1.8554668426513672
50th percentile: 1.8876738548278809
60th percentile: 1.9391492843627929
70th percentile: 2.0588162899017335
80th percentile: 2.2501782417297362
90th percentile: 2.283232569694519
95th percentile: 2.338410437107086
99th percentile: 2.3825527310371397
mean time: 1.9372220277786254
Pipeline stage StressChecker completed in 20.59s
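The percentile and mean figures reported by StressChecker can be reproduced from the ten latencies logged above. The sketch below assumes linear interpolation between closest ranks (the same scheme NumPy's `percentile` uses by default); the log does not state which method the pipeline uses, but this one matches the reported values. The helper name `percentile` is illustrative.

```python
import math

# Latencies in seconds, copied from the ten "healthy response" log lines above.
latencies = [
    1.9125609397888184, 1.9790318012237549, 2.2709708213806152,
    1.730123519897461, 1.844486951828003, 1.595191478729248,
    1.5384995937347412, 2.3935883045196533, 2.2449800968170166,
    1.8627867698669434,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) by linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```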
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.86s
Shutdown handler de-registered
function_rosef_2025-12-17 status is now deployed due to DeploymentManager action
function_rosef_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_rosef_2025-12-17 status is now torndown due to DeploymentManager action