developer_uid: chai_backend_admin
submission_id: function_bihit_2025-12-08
model_name: function_bihit_2025-12-08
model_group:
status: torndown
timestamp: 2025-12-12T18:28:37+00:00
num_battles: 5721
num_wins: 3155
celo_rating: 1329.95
family_friendly_score: 0.5212
family_friendly_standard_error: 0.007064708911200801
submission_type: function
display_name: function_bihit_2025-12-08
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-08
win_ratio: 0.5514770145079532
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
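The formatter templates above can be read as pieces of one prompt string. The following is a minimal sketch of how they might be combined. The `render_prompt` helper, the assembly order (memory, then prompt, then conversation turns, then the response header), and the sample conversation are assumptions for illustration, not the platform's actual implementation. Truncation to `max_input_tokens` is not modeled.

```python
# Template strings copied from the formatter entry above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def render_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates into one prompt string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user". The ordering used here is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues
    # the bot's next line; generation stops at the first "\n"
    # (see stopping_words in generation_params).
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = render_prompt(
    memory="A friendly guide.",
    prompt="Greet the user.",
    turns=[("user", "hi"), ("bot", "hello")],
    bot_name="Bot",
    user_name="User",
)
print(example)
```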
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.19016432762146s
Received healthy response to inference request in 2.0206594467163086s
Received healthy response to inference request in 1.8146030902862549s
Received healthy response to inference request in 2.4332849979400635s
Received healthy response to inference request in 1.843590259552002s
Received healthy response to inference request in 2.2990264892578125s
Received healthy response to inference request in 1.6919825077056885s
Received healthy response to inference request in 2.573627233505249s
Received healthy response to inference request in 2.0680339336395264s
Received healthy response to inference request in 2.8375637531280518s
10 requests
0 failed requests
5th percentile: 1.7471617698669433
10th percentile: 1.8023410320281983
20th percentile: 1.8377928256988525
30th percentile: 1.9675386905670165
40th percentile: 2.0490841388702394
50th percentile: 2.129099130630493
60th percentile: 2.233709192276001
70th percentile: 2.339304041862488
80th percentile: 2.4613534450531005
90th percentile: 2.6000208854675293
95th percentile: 2.71879231929779
99th percentile: 2.8138094663619997
mean time: 2.1772536039352417
Pipeline stage StressChecker completed in 23.44s
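The StressChecker percentiles and mean above can be recomputed from the ten logged response times. The sketch below uses linear interpolation between sorted values, which matches the reported figures. Which library the pipeline itself uses is not stated in the log, and the `percentile` helper is written here for illustration only.

```python
import math

# Response times in seconds, copied from the ten "healthy response" lines.
latencies = [
    2.19016432762146,
    2.0206594467163086,
    1.8146030902862549,
    2.4332849979400635,
    1.843590259552002,
    2.2990264892578125,
    1.6919825077056885,
    2.573627233505249,
    2.0680339336395264,
    2.8375637531280518,
]


def percentile(values, q):
    """Percentile q (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```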
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_bihit_2025-12-08 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2542.98s
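The derived fields in the metadata can be cross-checked with simple arithmetic. `win_ratio` is `num_wins / num_battles`. The reported `family_friendly_standard_error` is numerically consistent with a binomial standard error, sqrt(p * (1 - p) / n), if n = 5000. The sample count of 5000 and the binomial formula are assumptions inferred from the numbers; neither appears in the log.

```python
import math

# Values copied from the metadata block.
num_battles = 5721
num_wins = 3155
family_friendly_score = 0.5212

# win_ratio is the fraction of battles won.
win_ratio = num_wins / num_battles

# Assumption: the scorer evaluated 5000 samples. This number is not in
# the log; it is chosen because it reproduces the reported standard error.
assumed_samples = 5000
standard_error = math.sqrt(
    family_friendly_score * (1 - family_friendly_score) / assumed_samples
)

print(f"win_ratio: {win_ratio}")
print(f"standard_error (assuming n={assumed_samples}): {standard_error}")
```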
Shutdown handler de-registered
function_bihit_2025-12-08 status is now inactive due to auto deactivation removed underperforming models
function_bihit_2025-12-08 status is now torndown due to DeploymentManager action