function_kijon_2026-01-24

developer_uid: chai_backend_admin

submission_id: function_kijon_2026-01-24

model_name: abtest_tai

model_group:

status: torndown

timestamp: 2026-01-27T14:58:30+00:00

num_battles: 10405

num_wins: 5351

celo_rating: 1315.28

family_friendly_score: 0.5332

family_friendly_standard_error: 0.00705546256456655

submission_type: function

display_name: abtest_tai

is_internal_developer: True

ranking_group: single

us_pacific_date: 2026-01-24

win_ratio: 0.5142719846227775

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': True}
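The formatter templates above can be illustrated with a small sketch of how a prompt might be assembled. The order of concatenation (memory, prompt, chat turns, response header), the `build_prompt` helper, and the sample names and messages are all assumptions made for illustration; the record does not show the actual assembly code, and truncation (`truncate_by_message`, `max_input_tokens`) is not modelled here.

```python
# Templates copied from the `formatter` field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues
    # the bot's line; generation stops at the first newline ('\n' is the
    # only entry in `stopping_words`).
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up sample values, purely for illustration.
example = build_prompt(
    memory="Ada is a patient tutor.",
    prompt="A student asks for help.",
    turns=[("user", "Hi!"), ("bot", "Hello, how can I help?")],
    bot_name="Ada",
    user_name="Sam",
)
print(example)
```

Because the response template leaves the bot's line open and generation halts at a newline, each completion would be a single chat line of at most 64 output tokens under these settings.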

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.4340882301330566s
Received healthy response to inference request in 2.171488046646118s
Received healthy response to inference request in 1.6480355262756348s
Received healthy response to inference request in 2.6193950176239014s
Received healthy response to inference request in 1.5295445919036865s
Received healthy response to inference request in 1.504744291305542s
Received healthy response to inference request in 1.6459224224090576s
Received healthy response to inference request in 2.4610869884490967s
Received healthy response to inference request in 2.435253143310547s
Received healthy response to inference request in 2.8078970909118652s
10 requests
0 failed requests
5th percentile: 1.515904426574707
10th percentile: 1.527064561843872
20th percentile: 1.6226468563079834
30th percentile: 1.6474015951156615
40th percentile: 1.962107038497925
50th percentile: 2.3027881383895874
60th percentile: 2.434554195404053
70th percentile: 2.443003296852112
80th percentile: 2.4927485942840577
90th percentile: 2.6382452249526978
95th percentile: 2.7230711579322815
99th percentile: 2.7909319043159484
mean time: 2.1257455348968506
Pipeline stage StressChecker completed in 22.94s
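The StressChecker summary above can be reproduced from the ten logged latencies. The sketch below assumes the percentiles use linear interpolation between order statistics (the default in many numerical libraries); the log does not say which method was used, but the values agree under that assumption. The `percentile` helper is illustrative, not taken from the pipeline.

```python
import math

# The ten response times logged by StressChecker, in seconds.
latencies = [
    2.4340882301330566, 2.171488046646118, 1.6480355262756348,
    2.6193950176239014, 1.5295445919036865, 1.504744291305542,
    1.6459224224090576, 2.4610869884490967, 2.435253143310547,
    2.8078970909118652,
]


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation (assumed method)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)
p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)

print(f"mean={mean_time:.6f} p5={p5:.6f} p50={p50:.6f} p95={p95:.6f}")
```

With only ten samples, the tail percentiles (95th, 99th) are interpolated between the two slowest requests, so they say little about real tail latency.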
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.58s
Shutdown handler de-registered
function_kijon_2026-01-24 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2735.95s
Shutdown handler de-registered
function_kijon_2026-01-24 status is now inactive due to auto deactivation removed underperforming models
function_kijon_2026-01-24 status is now torndown due to DeploymentManager action
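
The summary fields in the header of this record can be cross-checked with a few lines of arithmetic. `win_ratio` is simply `num_wins / num_battles`. The second check is more speculative: it assumes `family_friendly_standard_error` is a binomial standard error, and the sample size of 5000 is a guess that happens to match the reported value; the record does not state either.

```python
import math

# Values copied from the record header.
num_battles = 10405
num_wins = 5351
family_friendly_score = 0.5332

win_ratio = num_wins / num_battles


def binomial_standard_error(p, n):
    """Standard error of a proportion p estimated from n samples."""
    return math.sqrt(p * (1.0 - p) / n)


# ASSUMPTION: the sample size is not given in the record; 5000 is
# inferred only because it reproduces the reported standard error.
ASSUMED_SAMPLE_SIZE = 5000
ff_standard_error = binomial_standard_error(
    family_friendly_score, ASSUMED_SAMPLE_SIZE
)

print(f"win_ratio={win_ratio:.6f} ff_standard_error={ff_standard_error:.8f}")
```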