function_sotos_2026-01-14

developer_uid: chai_backend_admin

submission_id: function_sotos_2026-01-14

model_name: function_sotos_2026-01-14

model_group:

status: torndown

timestamp: 2026-01-17T20:18:33+00:00

num_battles: 10862

num_wins: 5504

celo_rating: 1301.0

family_friendly_score: 0.5618000000000001

family_friendly_standard_error: 0.007016847725296595

submission_type: function

display_name: function_sotos_2026-01-14

is_internal_developer: True

ranking_group: single

us_pacific_date: 2026-01-14

win_ratio: 0.5067206775916038

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
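
The formatter templates above can be read as a recipe for building the model input. The following is a minimal sketch, assuming the pieces are concatenated in the order memory, prompt, conversation turns, response header; that ordering and the helper name `build_prompt` are illustrative assumptions, not something this record confirms. Truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the formatter field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: assemble one input string from the templates.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". The concatenation order is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("A friendly guide.", "A chat in a cafe.",
                       [("user", "Hello!")], "Bot", "Sam"))
```

Because `stopping_words` is `['\n']` in `generation_params`, a completion generated from such a prompt would end at the first newline, which yields a single bot message per request.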

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.1648495197296143s
Received healthy response to inference request in 1.5637178421020508s
Received healthy response to inference request in 3.3639190196990967s
Received healthy response to inference request in 2.05355167388916s
Received healthy response to inference request in 1.881746768951416s
Received healthy response to inference request in 1.8817005157470703s
Received healthy response to inference request in 2.431284189224243s
Received healthy response to inference request in 1.9161169528961182s
Received healthy response to inference request in 1.606856107711792s
Received healthy response to inference request in 1.9424057006835938s
10 requests
0 failed requests
5th percentile: 1.5831300616264343
10th percentile: 1.6025422811508179
20th percentile: 1.8267316341400146
30th percentile: 1.8817328929901123
40th percentile: 1.9023688793182374
50th percentile: 1.929261326789856
60th percentile: 1.9868640899658203
70th percentile: 2.0869410276412963
80th percentile: 2.21813645362854
90th percentile: 2.524547672271728
95th percentile: 2.9442333459854115
99th percentile: 3.27998188495636
mean time: 2.0806148290634154
Pipeline stage StressChecker completed in 24.04s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
function_sotos_2026-01-14 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 3046.06s
Shutdown handler de-registered
function_sotos_2026-01-14 status is now inactive due to auto deactivation removed underperforming models
Falling back to EndpointApi.from_submission implementation
function_sotos_2026-01-14 status is now torndown due to DeploymentManager action
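
The StressChecker percentiles and mean reported in the log can be reproduced from the ten logged latencies. The sketch below assumes linear interpolation between order statistics; this is an inference from the numbers matching, not a statement about how the pipeline computes them. The helper name `percentile` is illustrative.

```python
import statistics

# Latencies in seconds, copied from the "Received healthy response" log lines.
LATENCIES = [
    2.1648495197296143, 1.5637178421020508, 3.3639190196990967,
    2.05355167388916, 1.881746768951416, 1.8817005157470703,
    2.431284189224243, 1.9161169528961182, 1.606856107711792,
    1.9424057006835938,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {statistics.mean(LATENCIES)}")
```

With only 10 requests, the upper percentiles (95th, 99th) are interpolated almost entirely from the single slowest response, so they are best read as rough indicators.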