developer_uid: chai_backend_admin
submission_id: function_riles_2025-12-17
model_name: abtest
model_group:
status: torndown
timestamp: 2025-12-20T23:41:34+00:00
num_battles: 6323
num_wins: 3465
celo_rating: 1326.93
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: abtest
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.5479993673888977
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
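
The `formatter` templates above can be read as a recipe for assembling the text sent to the model. The following is a minimal sketch of one plausible assembly order (memory, then prompt, then chat turns, then the response header). The function `build_prompt`, its arguments, and the ordering are assumptions for illustration, not the platform's confirmed behavior; `truncate_by_message` and `max_input_tokens` truncation are not modeled.

```python
# Templates copied from the formatter entry in the metadata above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the rendered templates into one string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user". The ordering used here is an assumption.
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues
    # the bot's next line; generation then stops at the "\n" stopping word.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    FORMATTER,
    bot_name="Bot",
    user_name="User",
    memory="mem",
    prompt="scenario",
    turns=[("user", "hi"), ("bot", "hello")],
)
```

Because `stopping_words` is `['\n']` and `max_output_tokens` is 64, each completion under this layout would be a single short line following the trailing `{bot_name}:`.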
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9298663139343262s
Received healthy response to inference request in 2.0125701427459717s
Received healthy response to inference request in 1.9126396179199219s
Received healthy response to inference request in 1.930166482925415s
Received healthy response to inference request in 1.6876966953277588s
Received healthy response to inference request in 1.9830284118652344s
Received healthy response to inference request in 1.7582688331604004s
Received healthy response to inference request in 1.702627420425415s
Received healthy response to inference request in 2.1041204929351807s
Received healthy response to inference request in 1.9162750244140625s
10 requests
0 failed requests
5th percentile: 1.694415521621704
10th percentile: 1.7011343479156493
20th percentile: 1.7471405506134032
30th percentile: 1.8663283824920653
40th percentile: 1.9148208618164062
50th percentile: 1.9230706691741943
60th percentile: 1.9299863815307616
70th percentile: 1.9460250616073609
80th percentile: 1.988936758041382
90th percentile: 2.0217251777648926
95th percentile: 2.0629228353500366
99th percentile: 2.095880961418152
mean time: 1.8937259435653686
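
The percentile figures above are consistent with linear interpolation between the sorted latency samples. The sketch below recomputes them from the ten logged response times; the helper name `percentile` is hypothetical, and the interpolation rule is inferred from the numbers rather than taken from the StressChecker source.

```python
# The ten response times logged by the StressChecker stage, in seconds.
latencies = [
    1.9298663139343262,
    2.0125701427459717,
    1.9126396179199219,
    1.930166482925415,
    1.6876966953277588,
    1.9830284118652344,
    1.7582688331604004,
    1.702627420425415,
    2.1041204929351807,
    1.9162750244140625,
]


def percentile(values, q):
    """Return the q-th percentile (0 to 100) using linear interpolation.

    The fractional rank (n - 1) * q / 100 is split into an integer index
    and a fraction, and the result is interpolated between the two
    neighbouring sorted samples.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
mean_time = sum(latencies) / len(latencies)
```

With only ten samples, the upper percentiles are driven almost entirely by the single slowest request, so the 95th and 99th values here say little about tail latency under real load.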
Pipeline stage StressChecker completed in 20.26s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.58s
Shutdown handler de-registered
function_riles_2025-12-17 status is now deployed due to DeploymentManager action
function_riles_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_riles_2025-12-17 status is now torndown due to DeploymentManager action