developer_uid: chai_backend_admin
submission_id: function_sehaf_2025-12-05
model_name: function_sehaf_2025-12-05
model_group:
status: torndown
timestamp: 2025-12-12T18:30:22+00:00
num_battles: 28921
num_wins: 15453
celo_rating: 1317.11
family_friendly_score: 0.5833999999999999
family_friendly_standard_error: 0.006972007458401059
submission_type: function
display_name: function_sehaf_2025-12-05
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-05
win_ratio: 0.5343176238719269
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
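The formatter entry above defines five string templates. As a minimal sketch of how they might combine into one model input, the snippet below concatenates them in the order memory, prompt, chat turns, response header. That order, the `build_prompt` helper, and the `turns` structure are assumptions for illustration; the record does not show the actual assembly code, and truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the formatter entry in the record above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, bot_name, user_name, turns):
    """Hypothetical helper: join the templates into one prompt string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". The ordering used here is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt("M", "P", "Bot", "User", [("user", "hi")])
print(example)
```

Because `stopping_words` is `['\n']` in the generation parameters, a prompt ending in `{bot_name}:` would yield a single-line reply under this layout.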
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.549048900604248s
Received healthy response to inference request in 2.6810197830200195s
Received healthy response to inference request in 3.390791893005371s
Received healthy response to inference request in 0.7097418308258057s
Received healthy response to inference request in 0.43393588066101074s
Received healthy response to inference request in 0.48586368560791016s
Received healthy response to inference request in 0.48466038703918457s
Received healthy response to inference request in 0.6101217269897461s
Received healthy response to inference request in 0.611884355545044s
Received healthy response to inference request in 0.38704967498779297s
10 requests
0 failed requests
5th percentile: 0.40814846754074097
10th percentile: 0.42924726009368896
20th percentile: 0.4745154857635498
30th percentile: 0.48550269603729246
40th percentile: 0.5604185104370117
50th percentile: 0.611003041267395
60th percentile: 0.6510273456573485
70th percentile: 1.3011252164840694
80th percentile: 2.82297420501709
90th percentile: 3.506617593765258
95th percentile: 4.027833247184752
99th percentile: 4.444805769920349
mean time: 1.4344118118286133
Pipeline stage StressChecker completed in 16.86s
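The StressChecker percentiles and mean above can be reproduced from the ten logged response times. The sketch below assumes linear interpolation between order statistics (the default behavior of common percentile routines); the `percentile` helper is hypothetical and is not taken from the pipeline's code.

```python
# Response times (seconds) copied from the ten "healthy response" log lines.
latencies = [
    4.549048900604248, 2.6810197830200195, 3.390791893005371,
    0.7097418308258057, 0.43393588066101074, 0.48586368560791016,
    0.48466038703918457, 0.6101217269897461, 0.611884355545044,
    0.38704967498779297,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile with linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


mean_time = sum(latencies) / len(latencies)
for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

The mean (about 1.43 s) sits well above the median (about 0.61 s) because the first three requests took 2.7 s to 4.5 s while the other seven finished in under 0.75 s.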
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.75s
Shutdown handler de-registered
function_sehaf_2025-12-05 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 3339.41s
Shutdown handler de-registered
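Two derived fields in the metadata header can be cross-checked against the raw values. `win_ratio` is `num_wins / num_battles`. The reported `family_friendly_standard_error` is numerically consistent with a binomial standard error, sqrt(p(1-p)/n), at n = 5000. The sample count of 5000 is an inference from the numbers, not something the log states, and the scorer may compute its error differently.

```python
import math

# Values copied from the metadata header of this record.
num_battles = 28921
num_wins = 15453
family_friendly_score = 0.5833999999999999

# ASSUMPTION: the sample count is not in the log; 5000 is the value that
# makes the binomial formula match the reported standard error.
assumed_samples = 5000

win_ratio = num_wins / num_battles
standard_error = math.sqrt(
    family_friendly_score * (1.0 - family_friendly_score) / assumed_samples
)

print(f"win_ratio: {win_ratio}")
print(f"binomial standard error at n={assumed_samples}: {standard_error}")
```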
function_sehaf_2025-12-05 status is now inactive due to auto deactivation removed underperforming models
function_sehaf_2025-12-05 status is now torndown due to DeploymentManager action