function_juhar_2025-12-05

developer_uid: chai_backend_admin

submission_id: function_juhar_2025-12-05

model_name: function_juhar_2025-12-05

model_group:

status: torndown

timestamp: 2025-12-12T18:29:24+00:00

num_battles: 18663

num_wins: 11177

celo_rating: 1363.16

family_friendly_score: 0.538

family_friendly_standard_error: 0.007050616994277877
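
The page does not say how family_friendly_standard_error is derived. One plausible reading, offered here only as an assumption, is the usual binomial standard error sqrt(p * (1 - p) / n) for a score p averaged over n independent yes/no judgments. Under that assumption, the short sketch below inverts the formula to estimate the implied sample size from the two reported values; the variable names are illustrative and not part of the source.

```python
# Reported values copied from the metadata above.
family_friendly_score = 0.538
family_friendly_standard_error = 0.007050616994277877

# ASSUMPTION: the standard error is the binomial one, sqrt(p * (1 - p) / n).
# Solving for n gives n = p * (1 - p) / se**2.
implied_sample_size = (
    family_friendly_score * (1.0 - family_friendly_score)
) / family_friendly_standard_error ** 2

print(round(implied_sample_size))
```

If the assumption holds, the result suggests the score was estimated from a few thousand samples; if the scorer uses a different estimator, this number has no meaning.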

submission_type: function

display_name: function_juhar_2025-12-05

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-05

win_ratio: 0.598885495365161
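
The reported win_ratio is consistent with num_wins divided by num_battles. A minimal check using only the values listed above:

```python
# Values copied from the metadata above.
num_battles = 18663
num_wins = 11177

# Fraction of battles won.
win_ratio = num_wins / num_battles

print(win_ratio)
```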

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
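
To make the formatter easier to read, the sketch below shows one way its templates could be combined into a single prompt string. The template strings are taken from the formatter entry above; the function name build_prompt, the assembly order (memory, prompt, chat turns, response header), and the sample conversation are assumptions for illustration. Truncation (truncate_by_message, max_input_tokens) is not modeled.

```python
# Template strings copied from the formatter entry above.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: concatenate the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues
    # with the bot's next message.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    memory="A friendly guide.",
    prompt="A chat in a library.",
    turns=[("user", "Hello"), ("bot", "Hi there")],
    bot_name="Guide",
    user_name="Visitor",
)
print(example)
```

Because stopping_words is ['\n'] in generation_params, generation would stop at the first newline, which fits a layout where each chat message occupies one line.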

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 5.366772890090942s
Received healthy response to inference request in 4.691319704055786s
Received healthy response to inference request in 3.935786485671997s
Received healthy response to inference request in 2.8981034755706787s
Received healthy response to inference request in 0.34157538414001465s
Received healthy response to inference request in 3.377861499786377s
Received healthy response to inference request in 3.3407514095306396s
Received healthy response to inference request in 4.021898031234741s
Received healthy response to inference request in 0.4861457347869873s
Received healthy response to inference request in 4.182969331741333s
10 requests
0 failed requests
5th percentile: 0.4066320419311523
10th percentile: 0.47168869972229005
20th percentile: 2.4157119274139407
30th percentile: 3.2079570293426514
40th percentile: 3.363017463684082
50th percentile: 3.656823992729187
60th percentile: 3.9702311038970945
70th percentile: 4.070219421386719
80th percentile: 4.2846394062042235
90th percentile: 4.758865022659301
95th percentile: 5.062818956375121
99th percentile: 5.3059821033477785
mean time: 3.2643183946609495
Pipeline stage StressChecker completed in 34.28s
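
The percentile and mean figures in the StressChecker summary can be reproduced from the ten logged response times using linear interpolation between the sorted samples. The sketch below is a plain-Python reconstruction; the helper name percentile is illustrative, and the source does not say which library the pipeline actually uses.

```python
from statistics import mean

# Response times in seconds, copied from the ten log lines above.
latencies = [
    5.366772890090942, 4.691319704055786, 3.935786485671997,
    2.8981034755706787, 0.34157538414001465, 3.377861499786377,
    3.3407514095306396, 4.021898031234741, 0.4861457347869873,
    4.182969331741333,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean(latencies)}")
```

With only ten samples, the high percentiles are interpolations between the two slowest requests, so they are rough indicators of tail latency at best.
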
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
function_juhar_2025-12-05 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 4103.90s
Shutdown handler de-registered
function_juhar_2025-12-05 status is now inactive due to auto deactivation removed underperforming models
function_juhar_2025-12-05 status is now torndown due to DeploymentManager action