developer_uid: chai_backend_admin
submission_id: function_difis_2025-12-09
model_name: function_difis_2025-12-09
model_group:
status: torndown
timestamp: 2025-12-12T21:11:10+00:00
num_battles: 5737
num_wins: 2982
celo_rating: 1306.41
family_friendly_score: 0.5226
family_friendly_standard_error: 0.007063840881560116
submission_type: function
display_name: function_difis_2025-12-09
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-09
win_ratio: 0.5197838591598396
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
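
The formatter entry above defines five string templates. As a minimal sketch of how they could be combined into one model input, the Python below fills them in the order memory, prompt, conversation turns, response cue. The function name `build_prompt`, the `turns` structure, and the assembly order are assumptions made for illustration; the record only gives the templates themselves, and truncation (`max_input_tokens`, `truncate_by_message`) is not modeled here.

```python
# Templates copied from the formatter entry in this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical assembly: memory, prompt, chat turns, then the response cue.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". This ordering is an assumption, not taken from the log.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    # Made-up values, for illustration only.
    print(build_prompt("You are a guide.", "A quiet library.",
                       [("user", "hi"), ("bot", "hello")], "Bot", "User"))
```

Because `stopping_words` in `generation_params` is `['\n']`, a completion produced from such an input would end at the first newline, which fits the one-message-per-line layout of the bot and user templates.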
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.092149257659912s
Received healthy response to inference request in 1.517120122909546s
Received healthy response to inference request in 1.721153974533081s
Received healthy response to inference request in 1.529689073562622s
Received healthy response to inference request in 2.067460536956787s
Received healthy response to inference request in 2.0753207206726074s
Received healthy response to inference request in 1.5368077754974365s
Received healthy response to inference request in 1.8844001293182373s
Received healthy response to inference request in 3.173553466796875s
Received healthy response to inference request in 2.0402586460113525s
10 requests
0 failed requests
5th percentile: 1.52277615070343
10th percentile: 1.5284321784973145
20th percentile: 1.5353840351104737
30th percentile: 1.6658501148223877
40th percentile: 1.819101667404175
50th percentile: 1.962329387664795
60th percentile: 2.051139402389526
70th percentile: 2.0698185920715333
80th percentile: 2.0786864280700685
90th percentile: 2.200289678573608
95th percentile: 2.6869215726852405
99th percentile: 3.0762270879745484
mean time: 1.9637913703918457
Pipeline stage StressChecker completed in 20.94s
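
The StressChecker percentiles and mean above can be reproduced from the ten logged latencies. The sketch below uses linear interpolation between order statistics, which matches the reported figures; whether StressChecker itself computes them this way is an assumption, and the helper name `percentile` is illustrative.

```python
import math

# Latencies in seconds, copied from the ten "healthy response" lines of this log.
LATENCIES = [
    2.092149257659912,
    1.517120122909546,
    1.721153974533081,
    1.529689073562622,
    2.067460536956787,
    2.0753207206726074,
    1.5368077754974365,
    1.8844001293182373,
    3.173553466796875,
    2.0402586460113525,
]


def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted sample.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def mean(values):
    """Arithmetic mean of the sample."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {mean(LATENCIES)}")
```

With only ten samples, the upper percentiles are driven almost entirely by the single 3.17 s response, so the 95th and 99th values are interpolations toward that one outlier, not stable tail estimates.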
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.63s
Shutdown handler de-registered
function_difis_2025-12-09 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2964.23s
Shutdown handler de-registered
function_difis_2025-12-09 status is now torndown due to DeploymentManager action