developer_uid: chai_backend_admin
submission_id: function_dugur_2025-12-05
model_name: function_dugur_2025-12-05
model_group:
status: torndown
timestamp: 2025-12-12T18:28:48+00:00
num_battles: 19073
num_wins: 11481
celo_rating: 1365.39
family_friendly_score: 0.5342
family_friendly_standard_error: 0.007054507211705152
submission_type: function
display_name: function_dugur_2025-12-05
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-05
win_ratio: 0.6019504010905469
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
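The `formatter` templates above can be read as a recipe for building the text sent to the model. The sketch below is one plausible way to combine them, not the platform's confirmed implementation: the ordering (memory, then prompt, then chat turns, then the response header), the `build_prompt` helper, and the sample names and messages are all assumptions made for illustration.

```python
# Templates copied from the `formatter` field above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Assemble one model input (hypothetical helper; ordering is assumed)."""
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up sample conversation, for illustration only.
example = build_prompt(
    memory="Aria is a cheerful guide.",
    prompt="Aria greets a traveller.",
    turns=[("bot", "Welcome!"), ("user", "Hi, where am I?")],
    bot_name="Aria",
    user_name="Sam",
)
print(example)
```

Because `generation_params` sets `stopping_words` to `['\n']` and `max_output_tokens` to 64, each generated reply would presumably be a single short line that continues the trailing `{bot_name}:` header.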
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.9691600799560547s
Received healthy response to inference request in 4.1773951053619385s
Received healthy response to inference request in 4.394996881484985s
Received healthy response to inference request in 3.7622077465057373s
Received healthy response to inference request in 3.6477229595184326s
Received healthy response to inference request in 3.9949240684509277s
Received healthy response to inference request in 2.523724317550659s
Received healthy response to inference request in 3.0512235164642334s
Received healthy response to inference request in 5.6857099533081055s
Received healthy response to inference request in 0.5214865207672119s
10 requests
0 failed requests
5th percentile: 1.4224935293197634
10th percentile: 2.3235005378723144
20th percentile: 2.9457236766815185
30th percentile: 3.468773126602173
40th percentile: 3.7164138317108155
50th percentile: 3.865683913230896
60th percentile: 3.979465675354004
70th percentile: 4.049665379524231
80th percentile: 4.220915460586548
90th percentile: 4.524068188667297
95th percentile: 5.1048890709877
99th percentile: 5.569545776844024
mean time: 3.5728551149368286
Pipeline stage StressChecker completed in 37.28s
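The StressChecker percentiles and mean above can be reproduced from the ten logged latencies. The sketch below assumes linear interpolation between the closest ranks, which matches the logged values. The `percentile` helper is written here for illustration and is not taken from the pipeline's code.

```python
import math

# Latencies in seconds, copied from the ten "healthy response" lines above.
latencies = [
    3.9691600799560547, 4.1773951053619385, 4.394996881484985,
    3.7622077465057373, 3.6477229595184326, 3.9949240684509277,
    2.523724317550659, 3.0512235164642334, 5.6857099533081055,
    0.5214865207672119,
]


def percentile(values, q):
    """Return the q-th percentile using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)
p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
print(p5, p50, p99, mean_time)
```

With only ten samples, the tail percentiles are interpolated almost entirely from the two slowest requests, so the 95th and 99th percentile figures are rough indications at best.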
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.68s
Shutdown handler de-registered
function_dugur_2025-12-05 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 3950.07s
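The log does not say how `family_friendly_standard_error` is computed. The check below is an inference, not a documented fact. It assumes the score is a proportion with a binomial standard error, sqrt(p(1 - p) / n), and solves for the sample count n that the reported values imply.

```python
import math

score = 0.5342                      # family_friendly_score from the metadata
reported_se = 0.007054507211705152  # family_friendly_standard_error from the metadata


def binomial_se(p, n):
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1.0 - p) / n)


# Sample count implied by the reported SE, if the binomial assumption holds.
implied_n = score * (1.0 - score) / reported_se ** 2
se_at_5000 = binomial_se(score, 5000)
print(round(implied_n), se_at_5000)
```

Under that assumption, the reported values are consistent with an evaluation over about 5000 samples. The log itself never states the sample size.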
Shutdown handler de-registered
function_dugur_2025-12-05 status is now inactive due to auto deactivation removed underperforming models
function_dugur_2025-12-05 status is now torndown due to DeploymentManager action
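As a cross-check on the metadata at the top of this record, `win_ratio` follows directly from `num_wins` and `num_battles`. A minimal sketch of that arithmetic:

```python
num_battles = 19073  # num_battles from the metadata
num_wins = 11481     # num_wins from the metadata

# Fraction of battles won; this should agree with the logged win_ratio field.
win_ratio = num_wins / num_battles
print(win_ratio)
```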