developer_uid: chai_backend_admin
submission_id: function_nobeb_2025-12-04
model_name: function_nobeb_2025-12-04
model_group:
status: inactive
timestamp: 2025-12-04T19:42:07+00:00
num_battles: 6229
num_wins: 3678
celo_rating: 1358.17
family_friendly_score: 0.5794
family_friendly_standard_error: 0.0069813414183808545
submission_type: function
display_name: function_nobeb_2025-12-04
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-04
win_ratio: 0.5904639589019104
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
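The derived fields in the record above can be cross-checked against the raw counts. The sketch below recomputes win_ratio from num_wins and num_battles, and shows that family_friendly_standard_error is consistent with a binomial standard error sqrt(p * (1 - p) / n). The sample size n = 5000 is an assumption inferred from the numbers; it is not stated anywhere in the record.

```python
import math

# Values copied from the record above.
num_battles = 6229
num_wins = 3678
family_friendly_score = 0.5794

# win_ratio is the fraction of battles won.
win_ratio = num_wins / num_battles

# ASSUMPTION: the score is a proportion over n_assumed samples.
# n_assumed = 5000 is inferred, not given in the record.
n_assumed = 5000
standard_error = math.sqrt(
    family_friendly_score * (1 - family_friendly_score) / n_assumed
)

print(f"win_ratio: {win_ratio}")
print(f"standard_error: {standard_error}")
```

Both values agree with the reported win_ratio and family_friendly_standard_error to within floating-point rounding.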
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 6.964078664779663s
Received healthy response to inference request in 5.041116714477539s
Received healthy response to inference request in 15.404323101043701s
Received healthy response to inference request in 3.6291663646698s
Received healthy response to inference request in 5.923581600189209s
Received healthy response to inference request in 6.2444469928741455s
Received healthy response to inference request in 1.0863478183746338s
Received healthy response to inference request in 4.410624980926514s
Received healthy response to inference request in 0.44206809997558594s
Received healthy response to inference request in 0.5929651260375977s
10 requests
0 failed requests
5th percentile: 0.5099717617034912
10th percentile: 0.5778754234313965
20th percentile: 0.9876712799072266
30th percentile: 2.8663208007812493
40th percentile: 4.098041534423828
50th percentile: 4.725870847702026
60th percentile: 5.394102668762207
70th percentile: 6.01984121799469
80th percentile: 6.388373327255249
90th percentile: 7.808103108406064
95th percentile: 11.606213104724874
99th percentile: 14.644701101779939
mean time: 4.973871946334839
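The percentile and mean figures above can be reproduced from the ten response times logged by StressChecker. This is a minimal sketch that assumes linear interpolation between the sorted samples; the log does not say which method the pipeline uses, but this one matches the reported numbers. The percentile helper is hypothetical and is not part of the pipeline.

```python
# Response times in seconds, copied from the StressChecker log lines.
latencies = [
    6.964078664779663, 5.041116714477539, 15.404323101043701,
    3.6291663646698, 5.923581600189209, 6.2444469928741455,
    1.0863478183746338, 4.410624980926514, 0.44206809997558594,
    0.5929651260375977,
]


def percentile(values, q):
    """Return the q-th percentile (0-100), interpolating linearly
    between the two nearest sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100
    lower = int(rank)  # index of the sample at or below the rank
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are dominated by the single 15.4 s response, so the 95th and 99th percentile values are interpolations toward that one outlier rather than stable tail estimates.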
Pipeline stage StressChecker completed in 51.04s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.66s
Shutdown handler de-registered
function_nobeb_2025-12-04 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2968.21s
Shutdown handler de-registered