developer_uid: chai_backend_admin
submission_id: function_judob_2025-12-23
model_name: abtest_blend
model_group:
status: torndown
timestamp: 2025-12-26T09:21:35+00:00
num_battles: 6895
num_wins: 4149
celo_rating: 1364.74
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: abtest_blend
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-26
win_ratio: 0.6017403915881073
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
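The formatter entry above can be read as a set of string templates that are filled in and concatenated into one model input. Below is a minimal sketch of that assembly, assuming the order memory, prompt, chat turns, response header. The `build_prompt` helper, the turn representation, and the sample names and messages are hypothetical illustrations, not the platform's actual code. Truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry in the record above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the filled templates in an assumed order."""
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    # Each turn is a (speaker, message) pair; speaker is "bot" or "user".
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "<bot_name>:" so the model continues
    # the bot's next line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample values are invented for illustration only.
text = build_prompt(
    memory="Ava is a cheerful guide.",
    prompt="Ava greets the traveler.",
    turns=[("bot", "Hello!"), ("user", "Hi, who are you?")],
    bot_name="Ava",
    user_name="Sam",
)
print(text)
```

Because the assembled input ends with the bot's name and a colon, and `stopping_words` in `generation_params` is `['\n']`, generation would be expected to stop at the end of a single line of bot dialogue.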
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.4089293479919434s
Received healthy response to inference request in 2.195490598678589s
Received healthy response to inference request in 2.3775007724761963s
Received healthy response to inference request in 2.860002279281616s
Received healthy response to inference request in 2.1173553466796875s
Received healthy response to inference request in 2.6168289184570312s
Received healthy response to inference request in 2.0922203063964844s
Received healthy response to inference request in 2.208156108856201s
Received healthy response to inference request in 2.114102363586426s
Received healthy response to inference request in 2.932015895843506s
10 requests
0 failed requests
5th percentile: 2.102067232131958
10th percentile: 2.1119141578674316
20th percentile: 2.116704750061035
30th percentile: 2.1720500230789184
40th percentile: 2.2030899047851564
50th percentile: 2.2928284406661987
60th percentile: 2.390072202682495
70th percentile: 2.4712992191314695
80th percentile: 2.665463590621948
90th percentile: 2.867203640937805
95th percentile: 2.8996097683906554
99th percentile: 2.9255346703529357
mean time: 2.392260193824768
Pipeline stage StressChecker completed in 32.24s
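The percentile and mean figures reported by the StressChecker stage can be reproduced from the ten logged response times. The sketch below assumes linear interpolation between order statistics, which agrees with the logged values. The `percentile` helper is a hypothetical name written for this illustration and is not taken from the pipeline's code.

```python
import math

# Response times in seconds, copied from the StressChecker log lines above.
latencies = [
    2.4089293479919434, 2.195490598678589, 2.3775007724761963,
    2.860002279281616, 2.1173553466796875, 2.6168289184570312,
    2.0922203063964844, 2.208156108856201, 2.114102363586426,
    2.932015895843506,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile with linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)
mean_time = sum(latencies) / len(latencies)

print(f"5th percentile: {p5}")
print(f"50th percentile: {p50}")
print(f"90th percentile: {p90}")
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are interpolated between the two or three slowest requests, so they are best read as rough indicators rather than stable tail-latency estimates.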
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_judob_2025-12-23 status is now deployed due to DeploymentManager action
function_judob_2025-12-23 status is now inactive due to auto deactivation removed underperforming models
function_judob_2025-12-23 status is now torndown due to DeploymentManager action