developer_uid: chai_backend_admin
submission_id: function_junut_2025-12-23
model_name: abtest_blend
model_group:
status: torndown
timestamp: 2025-12-26T09:21:36+00:00
num_battles: 6956
num_wins: 3984
celo_rating: 1344.33
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: abtest_blend
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-26
win_ratio: 0.5727429557216791
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
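The `formatter` field lists five templates but not how they are joined. The following is a minimal sketch of one plausible assembly, assuming the order memory, prompt, chat turns, response header. The function `build_prompt`, the ordering, and the sample persona and messages are hypothetical and not taken from the record. Only the template strings are copied from the field above.

```python
# Template strings copied from the `formatter` field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Join the templates into one model input (ordering is an assumption)."""
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "<bot_name>:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Hypothetical conversation, for illustration only.
example = build_prompt(
    memory="Ava is a cheerful tour guide.",
    prompt="Ava greets a visitor.",
    turns=[("bot", "Welcome!"), ("user", "Hi, where do we start?")],
    bot_name="Ava",
    user_name="Sam",
)
print(example)
```

Because `stopping_words` in `generation_params` is `['\n']` and the input ends at the bot's name, a generation under this layout would stop at the end of a single reply line.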
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.8090946674346924s
Received healthy response to inference request in 1.5501725673675537s
Received healthy response to inference request in 1.7292191982269287s
Received healthy response to inference request in 1.8089675903320312s
Received healthy response to inference request in 1.7633495330810547s
Received healthy response to inference request in 1.820559024810791s
Received healthy response to inference request in 2.1026995182037354s
Received healthy response to inference request in 2.0368402004241943s
Received healthy response to inference request in 1.740856647491455s
Received healthy response to inference request in 1.6221034526824951s
10 requests
0 failed requests
5th percentile: 1.5825414657592773
10th percentile: 1.614910364151001
20th percentile: 1.707796049118042
30th percentile: 1.7373654127120972
40th percentile: 1.7543523788452149
50th percentile: 1.786158561706543
60th percentile: 1.8090184211730957
70th percentile: 1.812533974647522
80th percentile: 1.8638152599334716
90th percentile: 2.0434261322021485
95th percentile: 2.0730628252029417
99th percentile: 2.0967721796035765
mean time: 1.798386240005493
Pipeline stage StressChecker completed in 19.42s
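The percentile and mean figures above can be recomputed from the ten logged latencies. The sketch below assumes linear interpolation between order statistics. The log does not say which method StressChecker uses, but this method is consistent with the reported values. The helper `percentile` is a hypothetical name introduced here.

```python
# Latencies in seconds, copied from the ten "healthy response" log lines.
samples = [
    1.8090946674346924,
    1.5501725673675537,
    1.7292191982269287,
    1.8089675903320312,
    1.7633495330810547,
    1.820559024810791,
    2.1026995182037354,
    2.0368402004241943,
    1.740856647491455,
    1.6221034526824951,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(samples, q):.6f}")
print(f"mean time: {sum(samples) / len(samples):.6f}")
```

With only 10 requests, the upper percentiles are interpolated between the two slowest samples, so the 95th and 99th values say little about tail latency.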
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_junut_2025-12-23 status is now deployed due to DeploymentManager action
function_junut_2025-12-23 status is now inactive due to auto deactivation removed underperforming models
function_junut_2025-12-23 status is now torndown due to DeploymentManager action
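The three status lines above share one shape, so the submission's lifecycle can be pulled out with a regular expression. This is a minimal sketch. The pattern and the helper `parse_transitions` are assumptions based only on these three lines, not a documented log format.

```python
import re

# Status lines copied from the end of this log.
LOG_LINES = [
    "function_junut_2025-12-23 status is now deployed due to DeploymentManager action",
    "function_junut_2025-12-23 status is now inactive due to auto deactivation removed underperforming models",
    "function_junut_2025-12-23 status is now torndown due to DeploymentManager action",
]

# Assumed shape: "<submission_id> status is now <status> due to <reason>".
PATTERN = re.compile(
    r"^(?P<submission>\S+) status is now (?P<status>\w+) due to (?P<reason>.+)$"
)


def parse_transitions(lines):
    """Return (submission, status, reason) tuples for lines that match."""
    transitions = []
    for line in lines:
        match = PATTERN.match(line)
        if match:
            transitions.append(
                (match["submission"], match["status"], match["reason"])
            )
    return transitions


statuses = [status for _, status, _ in parse_transitions(LOG_LINES)]
print(" -> ".join(statuses))  # deployed -> inactive -> torndown
```

The last parsed status, `torndown`, agrees with the `status` field at the top of the record.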