function_mitun_2026-01-22

developer_uid: chai_backend_admin

submission_id: function_mitun_2026-01-22

model_name: function_mitun_2026-01-22

model_group:

status: torndown

timestamp: 2026-01-26T04:58:28+00:00

num_battles: 10091

num_wins: 5402

celo_rating: 1328.23

family_friendly_score: 0.6412

family_friendly_standard_error: 0.006783252317288514

submission_type: function

display_name: function_mitun_2026-01-22

is_internal_developer: True

ranking_group: single

us_pacific_date: 2026-01-22

win_ratio: 0.535328510553959

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': 'CUSTOM', 'prompt_template': 'CUSTOM', 'bot_template': 'CUSTOM', 'user_template': 'CUSTOM', 'response_template': 'CUSTOM', 'truncate_by_message': False}
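
The win_ratio field above can be reproduced from num_wins and num_battles. The sketch below does that, and also checks the reported family_friendly_standard_error against a plain binomial standard error. The sample count of 5000 is an assumption: it is not stated anywhere in this listing, and was chosen only because it reproduces the reported value.

```python
import math

# Values copied from the metadata fields above.
num_battles = 10091
num_wins = 5402
family_friendly_score = 0.6412

# win_ratio is the fraction of battles won.
win_ratio = num_wins / num_battles
print(f"win_ratio: {win_ratio:.6f}")

# ASSUMPTION: the number of scored samples is not given in the listing.
# 5000 is a hypothetical value that happens to match the reported error.
ASSUMED_SAMPLE_COUNT = 5000

# Binomial standard error of a proportion: sqrt(p * (1 - p) / n).
standard_error = math.sqrt(
    family_friendly_score * (1.0 - family_friendly_score) / ASSUMED_SAMPLE_COUNT
)
print(f"family_friendly_standard_error (assumed n): {standard_error:.6f}")
```

If the real sample count differs, only the standard error check changes; the win ratio depends solely on the two battle counters.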

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.2089953422546387s
Received healthy response to inference request in 3.0040318965911865s
Received healthy response to inference request in 3.4660089015960693s
Received healthy response to inference request in 4.036807537078857s
Received healthy response to inference request in 3.4607481956481934s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.711310625076294s
Received healthy response to inference request in 4.3631086349487305s
Received healthy response to inference request in 12.463688135147095s
Received healthy response to inference request in 3.8899924755096436s
Received healthy response to inference request in 4.22598671913147s
10 requests
0 failed requests
5th percentile: 2.8430351972579957
10th percentile: 2.9747597694396974
20th percentile: 3.168002653121948
30th percentile: 3.385222339630127
40th percentile: 3.463904619216919
50th percentile: 3.6780006885528564
60th percentile: 3.948718500137329
70th percentile: 4.093561291694641
80th percentile: 4.253411102294922
90th percentile: 5.173166584968564
95th percentile: 8.818427360057822
99th percentile: 11.734635980129243
mean time: 4.483067846298217
Pipeline stage StressChecker completed in 46.28s
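
The StressChecker percentiles above are consistent with linear interpolation between sorted samples. The following is a minimal sketch of that calculation over the ten logged latencies. The `percentile` helper is hypothetical and is not taken from the pipeline's own code, which may compute these figures differently.

```python
import math

# Latencies in seconds, copied from the ten "healthy response" log lines.
latencies = [
    3.2089953422546387,
    3.0040318965911865,
    3.4660089015960693,
    4.036807537078857,
    3.4607481956481934,
    2.711310625076294,
    4.3631086349487305,
    12.463688135147095,
    3.8899924755096436,
    4.22598671913147,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) by linear interpolation between ranks.

    Hypothetical helper; assumes the same interpolation the log appears to use.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 50, 95):
    print(f"{q}th percentile: {percentile(latencies, q):.4f}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time:.4f}")
```

With only ten samples, the single 12.46 s response dominates the upper percentiles and pulls the mean (about 4.48 s) above the median (about 3.68 s).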
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.79s
Shutdown handler de-registered
function_mitun_2026-01-22 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 3814.00s
Shutdown handler de-registered
function_mitun_2026-01-22 status is now inactive due to auto deactivation removed underperforming models
function_mitun_2026-01-22 status is now torndown due to DeploymentManager action