developer_uid: chai_backend_admin
submission_id: function_sojak_2026-01-24
model_name: abtest_tai
model_group:
status: inactive
timestamp: 2026-01-24T11:27:15+00:00
num_battles: 10637
num_wins: 5287
celo_rating: 1300.92
family_friendly_score: 0.534
family_friendly_standard_error: 0.00705470056061914
submission_type: function
display_name: abtest_tai
is_internal_developer: True
ranking_group: single
us_pacific_date: 2026-01-24
win_ratio: 0.4970386387139231
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': True}
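The derived fields in the record above can be sanity-checked from the raw counts. The sketch below recomputes `win_ratio` as `num_wins / num_battles`, and reproduces `family_friendly_standard_error` under the assumption that it is a binomial standard error over 5000 scored samples. The sample size is not stated anywhere in the record; it is inferred here only because it matches the reported value, and `assumed_samples` is a hypothetical name.

```python
import math

# Values copied from the record above.
num_battles = 10637
num_wins = 5287
family_friendly_score = 0.534

# win_ratio is the fraction of battles won.
win_ratio = num_wins / num_battles

# ASSUMPTION: the standard error is sqrt(p * (1 - p) / n) with n = 5000.
# The record does not state n; 5000 is inferred because it reproduces
# the reported family_friendly_standard_error.
assumed_samples = 5000
standard_error = math.sqrt(
    family_friendly_score * (1 - family_friendly_score) / assumed_samples
)

print(f"win_ratio      = {win_ratio}")
print(f"standard_error = {standard_error}")
```

If the scorer uses a different estimator or sample count, the second figure would need to be recomputed accordingly.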
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.974836826324463s
Received healthy response to inference request in 2.01928973197937s
Received healthy response to inference request in 1.8500895500183105s
Received healthy response to inference request in 3.3731446266174316s
Received healthy response to inference request in 2.5460846424102783s
Received healthy response to inference request in 2.5439436435699463s
Received healthy response to inference request in 3.874260902404785s
Received healthy response to inference request in 2.843456983566284s
Received healthy response to inference request in 3.2600975036621094s
Received healthy response to inference request in 3.8513524532318115s
10 requests
0 failed requests
5th percentile: 1.906225824356079
10th percentile: 1.9623620986938477
20th percentile: 2.0103991508483885
30th percentile: 2.3865474700927733
40th percentile: 2.5452282428741455
50th percentile: 2.6947708129882812
60th percentile: 3.010113191604614
70th percentile: 3.294011640548706
80th percentile: 3.468786191940308
90th percentile: 3.853643298149109
95th percentile: 3.863952100276947
99th percentile: 3.8721991419792174
mean time: 2.813655686378479
Pipeline stage StressChecker completed in 29.83s
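The percentile and mean figures reported by StressChecker can be reproduced from the ten response times logged above. The log does not say how the percentiles were computed; the sketch below assumes linear interpolation between order statistics, which matches the logged values. The `percentile` helper is illustrative and is not taken from the pipeline's code.

```python
# Response times (seconds) copied from the StressChecker log lines.
latencies = [
    1.974836826324463, 2.01928973197937, 1.8500895500183105,
    3.3731446266174316, 2.5460846424102783, 2.5439436435699463,
    3.874260902404785, 2.843456983566284, 3.2600975036621094,
    3.8513524532318115,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    ASSUMPTION: interpolation between the two nearest order statistics;
    the log does not state which method the pipeline uses.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100  # fractional index into ordered
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only ten requests, the upper percentiles are determined almost entirely by the two slowest responses, so they are best read as rough indicators.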
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
function_sojak_2026-01-24 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 3813.24s
Shutdown handler de-registered
function_sojak_2026-01-24 status is now inactive due to auto deactivation removed underperforming models