function_temom_2025-12-09

developer_uid: chai_backend_admin

submission_id: function_temom_2025-12-09

model_name: function_temom_2025-12-09

model_group:

status: torndown

timestamp: 2025-12-12T21:31:36+00:00

num_battles: 5418

num_wins: 2868

celo_rating: 1314.03

family_friendly_score: 0.5252

family_friendly_standard_error: 0.007062081279622884
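
The reported standard error is numerically consistent with the binomial formula sqrt(p * (1 - p) / n) applied to the family_friendly_score above. The sample size is not stated anywhere in this record; n = 5000 is an assumption inferred from the two reported values, and the scorer may compute the figure differently. A minimal sketch of that check:

```python
import math

# Value copied from the family_friendly_score field.
p = 0.5252

# ASSUMPTION: the number of scored samples is not given in the record;
# 5000 is the value that reproduces the reported standard error.
n = 5000

# Binomial standard error of a proportion.
se = math.sqrt(p * (1 - p) / n)
print(se)
```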

submission_type: function

display_name: function_temom_2025-12-09

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-09

win_ratio: 0.5293466223698782
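
The win_ratio field matches num_wins divided by num_battles. A short check using the values from this record:

```python
# Values copied from the num_battles and num_wins fields.
num_battles = 5418
num_wins = 2868

# Fraction of battles won.
win_ratio = num_wins / num_battles
print(win_ratio)
```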

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
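
As an illustration of how the formatter templates could combine into a single prompt, here is a minimal sketch. The assembly order (memory, prompt, chat turns, response header) is an assumption, not something this record confirms. The function `build_prompt`, the speaker labels, and the sample names and messages are hypothetical. Truncation to max_input_tokens is not modelled.

```python
# Templates copied from the formatter field (truncate_by_message omitted).
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order."""
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(formatter["bot_template"].format(
                bot_name=bot_name, message=message))
        else:
            parts.append(formatter["user_template"].format(
                user_name=user_name, message=message))
    # The response header ends with the bot name so the model continues the turn.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Hypothetical sample conversation.
text = build_prompt(
    memory="Aria is a friendly guide.",
    prompt="Aria greets the traveler.",
    turns=[("user", "Hi!"), ("bot", "Hello there.")],
    bot_name="Aria",
    user_name="Sam",
)
print(text)
```

With stopping_words set to ['\n'], generation would end at the first newline, which fits a single-line reply after the trailing bot name.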

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9864506721496582s
Received healthy response to inference request in 1.6435706615447998s
Received healthy response to inference request in 1.9529802799224854s
Received healthy response to inference request in 1.550757646560669s
Received healthy response to inference request in 1.634056806564331s
Received healthy response to inference request in 1.5429654121398926s
Received healthy response to inference request in 1.938908576965332s
Received healthy response to inference request in 1.6810417175292969s
Received healthy response to inference request in 1.8805594444274902s
Received healthy response to inference request in 1.9370591640472412s
10 requests
0 failed requests
5th percentile: 1.5464719176292419
10th percentile: 1.5499784231185914
20th percentile: 1.6173969745635985
30th percentile: 1.6407165050506591
40th percentile: 1.666053295135498
50th percentile: 1.7808005809783936
60th percentile: 1.9031593322753906
70th percentile: 1.9376139879226684
80th percentile: 1.9417229175567627
90th percentile: 1.9563273191452026
95th percentile: 1.9713889956474304
99th percentile: 1.9834383368492126
mean time: 1.7748350381851197
Pipeline stage StressChecker completed in 20.44s
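
The StressChecker percentiles and mean above can be reproduced from the ten logged response times using linear interpolation between the closest ranks. The helper `percentile` below is a hypothetical stand-in written for this sketch. How StressChecker computes these figures internally is not shown in the log.

```python
# Response times in seconds, copied from the StressChecker log lines.
latencies = [
    1.9864506721496582, 1.6435706615447998, 1.9529802799224854,
    1.550757646560669, 1.634056806564331, 1.5429654121398926,
    1.938908576965332, 1.6810417175292969, 1.8805594444274902,
    1.9370591640472412,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile by linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```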
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.66s
Shutdown handler de-registered
function_temom_2025-12-09 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2733.61s
Shutdown handler de-registered
function_temom_2025-12-09 status is now inactive due to auto deactivation removed underperforming models
function_temom_2025-12-09 status is now torndown due to DeploymentManager action