developer_uid: chai_backend_admin
submission_id: function_kijon_2026-01-24
model_name: abtest_tai
model_group:
status: inactive
timestamp: 2026-01-24T14:56:43+00:00
num_battles: 10405
num_wins: 5351
celo_rating: 1315.28
family_friendly_score: 0.5332
family_friendly_standard_error: 0.00705546256456655
submission_type: function
display_name: abtest_tai
is_internal_developer: True
ranking_group: single
us_pacific_date: 2026-01-24
win_ratio: 0.5142719846227775
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': True}
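The derived fields in the metadata above can be cross-checked from the raw counts. The sketch below recomputes `win_ratio` from `num_wins` and `num_battles`. It also estimates the sample size implied by `family_friendly_standard_error`, assuming a binomial standard error of the form sqrt(p * (1 - p) / n). The log states neither the formula nor the sample count, so `implied_n` is an inference and not a recorded value.

```python
import math

# Values copied from the metadata fields above.
num_battles = 10405
num_wins = 5351
family_friendly_score = 0.5332
family_friendly_standard_error = 0.00705546256456655

# win_ratio is simply wins divided by battles.
win_ratio = num_wins / num_battles

# ASSUMPTION: the standard error follows the binomial form sqrt(p * (1 - p) / n).
# Solving for n gives the sample count that would produce the logged value.
p = family_friendly_score
implied_n = p * (1 - p) / family_friendly_standard_error ** 2

# Recompute the standard error from the rounded sample count as a sanity check.
recomputed_se = math.sqrt(p * (1 - p) / round(implied_n))

print(f"win_ratio: {win_ratio}")
print(f"implied sample count (assumed binomial SE): {implied_n:.1f}")
print(f"standard error recomputed from rounded n: {recomputed_se}")
```

If the binomial assumption holds, the implied sample count lands very close to a round number, which suggests the family friendly score was estimated over a fixed-size evaluation set.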
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.4340882301330566s
Received healthy response to inference request in 2.171488046646118s
Received healthy response to inference request in 1.6480355262756348s
Received healthy response to inference request in 2.6193950176239014s
Received healthy response to inference request in 1.5295445919036865s
Received healthy response to inference request in 1.504744291305542s
Received healthy response to inference request in 1.6459224224090576s
Received healthy response to inference request in 2.4610869884490967s
Received healthy response to inference request in 2.435253143310547s
Received healthy response to inference request in 2.8078970909118652s
10 requests
0 failed requests
5th percentile: 1.515904426574707
10th percentile: 1.527064561843872
20th percentile: 1.6226468563079834
30th percentile: 1.6474015951156615
40th percentile: 1.962107038497925
50th percentile: 2.3027881383895874
60th percentile: 2.434554195404053
70th percentile: 2.443003296852112
80th percentile: 2.4927485942840577
90th percentile: 2.6382452249526978
95th percentile: 2.7230711579322815
99th percentile: 2.7909319043159484
mean time: 2.1257455348968506
Pipeline stage StressChecker completed in 22.94s
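The percentile and mean figures reported by StressChecker are consistent with linear interpolation between order statistics over the ten latencies. The sketch below reproduces them with the standard library only. The `percentile` helper is hypothetical and written for illustration; the log does not show what the pipeline actually calls.

```python
import math

# Latencies in seconds, copied from the ten "Received healthy response" lines.
latencies = [
    2.4340882301330566, 2.171488046646118, 1.6480355262756348,
    2.6193950176239014, 1.5295445919036865, 1.504744291305542,
    1.6459224224090576, 2.4610869884490967, 2.435253143310547,
    2.8078970909118652,
]

def percentile(values, q):
    """Return the q-th percentile (0..100) using linear interpolation.

    Hypothetical helper: the fractional rank (n - 1) * q / 100 is split into
    its two neighbouring order statistics, which are then blended linearly.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are interpolated between the two slowest requests, so the 95th and 99th values say little about tail latency under real load.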
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.58s
Shutdown handler de-registered
function_kijon_2026-01-24 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2735.95s
Shutdown handler de-registered
function_kijon_2026-01-24 status is now inactive due to auto deactivation removed underperforming models