developer_uid: chai_backend_admin
submission_id: function_gefir_2025-12-24
model_name: function_gefir_2025-12-24
model_group:
status: torndown
timestamp: 2025-12-28T13:01:50+00:00
num_battles: 1758
num_wins: 946
celo_rating: 1319.37
family_friendly_score: 0.5312
family_friendly_standard_error: 0.007057287864328619
submission_type: function
display_name: function_gefir_2025-12-24
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-25
win_ratio: 0.5381114903299203
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
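The formatter above is easier to read once its templates are filled in. The following is a minimal sketch of how such templates could be joined into one prompt string. The assembly order (memory, prompt, conversation turns, response header) and the helper name `build_prompt` are assumptions for illustration, not the pipeline's confirmed behaviour. Truncation to `max_input_tokens` is deliberately left out.

```python
# Template values copied from the formatter entry in this record.
FORMATTER = {
    'memory_template': '### Instruction:\n{memory}\n',
    'prompt_template': '### Input:\n{prompt}\n',
    'bot_template': '{bot_name}: {message}\n',
    'user_template': '{user_name}: {message}\n',
    'response_template': '### Response:\n{bot_name}:',
    'truncate_by_message': False,
}


def build_prompt(formatter, memory, prompt, bot_name, user_name, turns):
    """Hypothetical helper: join the templates into a single prompt string.

    `turns` is a list of (speaker, message) pairs where speaker is
    'bot' or 'user'. The ordering of the sections is an assumption.
    """
    parts = [
        formatter['memory_template'].format(memory=memory),
        formatter['prompt_template'].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == 'bot':
            parts.append(formatter['bot_template'].format(
                bot_name=bot_name, message=message))
        else:
            parts.append(formatter['user_template'].format(
                user_name=user_name, message=message))
    # The response header ends with "{bot_name}:" so the model continues
    # the line as the bot.
    parts.append(formatter['response_template'].format(bot_name=bot_name))
    return ''.join(parts)


example = build_prompt(
    FORMATTER,
    memory='M',
    prompt='P',
    bot_name='Bot',
    user_name='User',
    turns=[('user', 'hi'), ('bot', 'hello')],
)
```

Because every turn template ends in a newline and `stopping_words` is `['\n']`, generation under this layout would stop at the end of the bot's first line, which fits the single-line reply format the templates imply.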
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.906348943710327s
Received healthy response to inference request in 3.4169297218322754s
Received healthy response to inference request in 2.7034733295440674s
Received healthy response to inference request in 3.1373658180236816s
Received healthy response to inference request in 3.2196428775787354s
Received healthy response to inference request in 2.9600210189819336s
Received healthy response to inference request in 3.0617892742156982s
Received healthy response to inference request in 3.544813394546509s
Received healthy response to inference request in 2.8453314304351807s
Received healthy response to inference request in 2.6920251846313477s
10 requests
0 failed requests
5th percentile: 2.6971768498420716
10th percentile: 2.7023285150527956
20th percentile: 2.816959810256958
30th percentile: 2.888043689727783
40th percentile: 2.938552188873291
50th percentile: 3.010905146598816
60th percentile: 3.0920198917388917
70th percentile: 3.1620489358901978
80th percentile: 3.259100246429443
90th percentile: 3.4297180891036985
95th percentile: 3.4872657418251034
99th percentile: 3.5333038640022276
mean time: 3.0487740993499757
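The percentile and mean figures above can be recomputed from the ten logged response times. The sketch below assumes linear interpolation between order statistics (the default method of NumPy's `percentile`), which appears consistent with the logged values. The helper name `percentile` and the constant `LATENCIES_S` are illustrative and not taken from the StressChecker source.

```python
import math

# Response times in seconds, copied from the ten "healthy response" lines.
LATENCIES_S = [
    2.906348943710327,
    3.4169297218322754,
    2.7034733295440674,
    3.1373658180236816,
    3.2196428775787354,
    2.9600210189819336,
    3.0617892742156982,
    3.544813394546509,
    2.8453314304351807,
    2.6920251846313477,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence is undefined")
    # Fractional position of the requested percentile in the sorted data.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_latency = sum(LATENCIES_S) / len(LATENCIES_S)
median_latency = percentile(LATENCIES_S, 50)
p90_latency = percentile(LATENCIES_S, 90)
```

With only ten samples, the high percentiles are interpolated between the two slowest requests, so the 95th and 99th percentile figures say little about tail latency under real load.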
Pipeline stage StressChecker completed in 31.89s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_gefir_2025-12-24 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 3119.58s
Shutdown handler de-registered
function_gefir_2025-12-24 status is now inactive due to admin request
function_gefir_2025-12-24 status is now torndown due to DeploymentManager action