developer_uid: chai_backend_admin
submission_id: function_magik_2025-12-17
model_name: abtest
model_group:
status: torndown
timestamp: 2025-12-20T23:41:33+00:00
num_battles: 6246
num_wins: 3441
celo_rating: 1328.8
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: abtest
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.5509125840537944
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
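The `formatter` entry above lists five templates but does not say how they are combined. The following is a minimal sketch of one plausible assembly order (memory, then prompt, then chat turns, then the response header). The function `build_prompt`, the `turns` structure, and the ordering are assumptions for illustration, not the platform's confirmed behaviour. Truncation (`max_input_tokens`, `truncate_by_message`) is not modelled.

```python
# Templates copied from the formatter entry in this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user". This layout is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("A friendly guide.", "Greet the user.",
                       [("user", "hi")], "Guide", "Sam"))
```

Because `stopping_words` is `['\n']` in `generation_params`, a prompt ending in `{bot_name}:` would yield a single-line reply under this layout.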
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.030857801437378s
Received healthy response to inference request in 2.0299575328826904s
Received healthy response to inference request in 2.01310396194458s
Received healthy response to inference request in 2.064347505569458s
Received healthy response to inference request in 1.9607787132263184s
Received healthy response to inference request in 2.0810842514038086s
Received healthy response to inference request in 2.0417628288269043s
Received healthy response to inference request in 2.2261507511138916s
Received healthy response to inference request in 2.0272107124328613s
Received healthy response to inference request in 2.6400296688079834s
10 requests
0 failed requests
5th percentile: 1.984325075149536
10th percentile: 2.0078714370727537
20th percentile: 2.024389362335205
30th percentile: 2.0291334867477415
40th percentile: 2.030497694015503
50th percentile: 2.036310315132141
60th percentile: 2.050796699523926
70th percentile: 2.069368529319763
80th percentile: 2.110097551345825
90th percentile: 2.2675386428833004
95th percentile: 2.4537841558456415
99th percentile: 2.602780566215515
mean time: 2.1115283727645875
Pipeline stage StressChecker completed in 22.51s
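The percentile and mean figures reported by the StressChecker stage can be reproduced from the ten logged response times. The sketch below assumes linear interpolation between order statistics, which matches the logged values. The helper name `percentile` is illustrative and is not taken from the pipeline's code.

```python
# Response times (seconds) copied from the ten "healthy response" log lines.
LATENCIES = [
    2.030857801437378, 2.0299575328826904, 2.01310396194458,
    2.064347505569458, 1.9607787132263184, 2.0810842514038086,
    2.0417628288269043, 2.2261507511138916, 2.0272107124328613,
    2.6400296688079834,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def mean(values):
    """Arithmetic mean of the samples."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {mean(LATENCIES)}")
```

With only ten samples, the upper percentiles are dominated by the single 2.64 s response, so the 95th and 99th values are interpolations toward that one outlier rather than independent measurements.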
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_magik_2025-12-17 status is now deployed due to DeploymentManager action
function_magik_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_magik_2025-12-17 status is now torndown due to DeploymentManager action