function_sedut_2025-12-07

developer_uid: chai_backend_admin

submission_id: function_sedut_2025-12-07

model_name: function_sedut_2025-12-07

model_group:

status: torndown

timestamp: 2025-12-12T18:30:21+00:00

num_battles: 5094

num_wins: 2567

celo_rating: 1293.95

family_friendly_score: 0.5936

family_friendly_standard_error: 0.006946064209320268

submission_type: function

display_name: function_sedut_2025-12-07

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-07

win_ratio: 0.5039261876717707
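The win ratio and the family-friendly standard error can be cross-checked from the other fields in this record. The sketch below recomputes win_ratio as num_wins / num_battles. It then assumes (this record does not state it) that the standard error is the usual binomial one, sqrt(p * (1 - p) / n), and solves for the sample size n that assumption would imply. The variable names are illustrative only.

```python
# Values copied from the metadata fields in this record.
num_battles = 5094
num_wins = 2567
family_friendly_score = 0.5936
family_friendly_standard_error = 0.006946064209320268

# win_ratio is simply wins divided by battles.
win_ratio = num_wins / num_battles
print(f"win_ratio = {win_ratio}")

# ASSUMPTION: the standard error follows the binomial formula
# se = sqrt(p * (1 - p) / n). Solving for n gives the sample size
# that would be consistent with the reported score and error.
p = family_friendly_score
implied_n = p * (1 - p) / family_friendly_standard_error ** 2
print(f"implied sample size = {implied_n:.1f}")
```

Under that assumption the reported error corresponds to roughly 5000 scored samples, although the actual scoring procedure is not described in this record.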

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
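To illustrate what the formatter templates above produce, here is a minimal sketch that fills them in for a short conversation. The assembly order (memory, prompt, chat turns, then the response header), the `build_prompt` helper, and the example names and messages are all assumptions made for illustration. The real pipeline's truncation behaviour (max_input_tokens, truncate_by_message) is not modelled.

```python
# Templates copied from the formatter field in this record.
formatter = {
    'memory_template': '### Instruction:\n{memory}\n',
    'prompt_template': '### Input:\n{prompt}\n',
    'bot_template': '{bot_name}: {message}\n',
    'user_template': '{user_name}: {message}\n',
    'response_template': '### Response:\n{bot_name}:',
}


def build_prompt(fmt, bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order."""
    parts = [
        fmt['memory_template'].format(memory=memory),
        fmt['prompt_template'].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == 'bot':
            parts.append(fmt['bot_template'].format(bot_name=bot_name, message=message))
        else:
            parts.append(fmt['user_template'].format(user_name=user_name, message=message))
    # The response template ends with "{bot_name}:" so the model
    # continues the line as the bot.
    parts.append(fmt['response_template'].format(bot_name=bot_name))
    return ''.join(parts)


# Made-up example conversation.
text = build_prompt(
    formatter,
    bot_name='Ava',
    user_name='Sam',
    memory='Ava is a friendly guide.',
    prompt='A chat in a museum.',
    turns=[('user', 'Hi there'), ('bot', 'Welcome!'), ('user', 'What is this?')],
)
print(text)
```

Because stopping_words is ['\n'] and max_output_tokens is 64, a completion after the final "Ava:" would end at the first newline, so each generation is a single short line.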

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9632532596588135s
Received healthy response to inference request in 2.4463393688201904s
Received healthy response to inference request in 2.4105336666107178s
Received healthy response to inference request in 1.6975915431976318s
Received healthy response to inference request in 2.671581983566284s
Received healthy response to inference request in 2.272465705871582s
Received healthy response to inference request in 1.6390912532806396s
Received healthy response to inference request in 1.850383996963501s
Received healthy response to inference request in 2.5887386798858643s
Received healthy response to inference request in 1.7500488758087158s
10 requests
0 failed requests
5th percentile: 1.665416383743286
10th percentile: 1.6917415142059327
20th percentile: 1.739557409286499
30th percentile: 1.8202834606170655
40th percentile: 1.9181055545806884
50th percentile: 2.1178594827651978
60th percentile: 2.3276928901672362
70th percentile: 2.4212753772735596
80th percentile: 2.474819231033325
90th percentile: 2.5970230102539062
95th percentile: 2.634302496910095
99th percentile: 2.664126086235046
mean time: 2.129002833366394
Pipeline stage StressChecker completed in 23.57s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.86s
Shutdown handler de-registered
function_sedut_2025-12-07 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2377.26s
Shutdown handler de-registered
function_sedut_2025-12-07 status is now inactive due to auto deactivation removed underperforming models
function_sedut_2025-12-07 status is now torndown due to DeploymentManager action
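The StressChecker percentiles and mean in the log above can be reproduced from the ten logged response times. The sketch below uses linear interpolation between sorted samples, which matches the logged figures. Which library the pipeline actually uses is not shown in the log, so the `percentile` helper here is only an illustrative stand-in.

```python
# Response times in seconds, copied from the StressChecker log lines.
latencies = [
    1.9632532596588135, 2.4463393688201904, 2.4105336666107178,
    1.6975915431976318, 2.671581983566284, 2.272465705871582,
    1.6390912532806396, 1.850383996963501, 2.5887386798858643,
    1.7500488758087158,
]


def percentile(values, q):
    """Percentile with linear interpolation between closest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only ten samples the upper percentiles are interpolated between the two slowest requests, so they say little about tail latency.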