developer_uid: chai_backend_admin
submission_id: function_tupok_2025-12-23
model_name: abtest_blend
model_group:
status: torndown
timestamp: 2025-12-26T07:31:46+00:00
num_battles: 6650
num_wins: 3810
celo_rating: 1344.34
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: abtest_blend
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-25
win_ratio: 0.5729323308270676
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
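The formatter entry above lists the templates but not how they are combined. The following is a minimal sketch of one plausible assembly. The ordering (memory, prompt, chat turns, response header) and the names `build_prompt` and `turns` are assumptions made for illustration and are not taken from the log. Truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry in the metadata above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates into one model input string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". The assembly order is an assumption.
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues
    # the bot's next line; generation stops at the first "\n" stopping word.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt(FORMATTER, "Bot", "User", "persona text", "scene text",
                       [("user", "hi"), ("bot", "hello")]))
```

Because `stopping_words` is `['\n']` and the response template leaves the bot's line open, each generation under this layout would be a single chat line of at most 64 output tokens.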
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.938722610473633s
Received healthy response to inference request in 2.1560940742492676s
Received healthy response to inference request in 2.1148104667663574s
Received healthy response to inference request in 2.8626997470855713s
Received healthy response to inference request in 2.4543120861053467s
Received healthy response to inference request in 2.0133392810821533s
Received healthy response to inference request in 2.7059741020202637s
Received healthy response to inference request in 2.7150936126708984s
Received healthy response to inference request in 2.0629782676696777s
Received healthy response to inference request in 2.28938364982605s
10 requests
0 failed requests
5th percentile: 2.035676825046539
10th percentile: 2.0580143690109254
20th percentile: 2.1044440269470215
30th percentile: 2.1437089920043944
40th percentile: 2.2360678195953367
50th percentile: 2.3718478679656982
60th percentile: 2.554976892471313
70th percentile: 2.708709955215454
80th percentile: 2.744614839553833
90th percentile: 2.8703020334243776
95th percentile: 2.904512321949005
99th percentile: 2.9318805527687073
mean time: 2.431340789794922
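The StressChecker summary above can be reproduced from the ten logged latencies. The reported percentiles are consistent with linear interpolation between sorted samples. Whether the pipeline computes them this way internally is an assumption, and the helper name `percentile` is hypothetical.

```python
import math

# Latencies (seconds) copied from the ten "healthy response" log lines.
LATENCIES = [
    2.938722610473633,
    2.1560940742492676,
    2.1148104667663574,
    2.8626997470855713,
    2.4543120861053467,
    2.0133392810821533,
    2.7059741020202637,
    2.7150936126708984,
    2.0629782676696777,
    2.28938364982605,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(samples)
    # Fractional position of the percentile within the sorted samples.
    rank = (q / 100) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {sum(LATENCIES) / len(LATENCIES)}")
```

With only ten samples, the upper percentiles are interpolated almost entirely between the two slowest requests, so they say little about tail latency under real load.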
Pipeline stage StressChecker completed in 25.58s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
function_tupok_2025-12-23 status is now deployed due to DeploymentManager action
function_tupok_2025-12-23 status is now inactive due to auto deactivation removed underperforming models
function_tupok_2025-12-23 status is now torndown due to DeploymentManager action