function_fihar_2025-08-07

developer_uid: chai_backend_admin

submission_id: function_fihar_2025-08-07

model_name: function_fihar_2025-08-07

model_group:

status: torndown

timestamp: 2025-08-07T18:10:13+00:00

num_battles: 6152

num_wins: 3294

celo_rating: 1290.12

family_friendly_score: 0.5284

family_friendly_standard_error: 0.007059652116074842

submission_type: function

display_name: function_fihar_2025-08-07

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-08-07

win_ratio: 0.5354356306892067
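The win_ratio above is num_wins divided by num_battles, which can be checked directly. The sketch below also tests one guess about family_friendly_standard_error: that it is a binomial standard error, sqrt(p * (1 - p) / n). The sample size n is not stated anywhere on this page, so the implied value printed here is an inference from the reported numbers, not a documented figure.

```python
import math

# Values copied from the metadata fields above.
num_battles = 6152
num_wins = 3294
family_friendly_score = 0.5284
reported_standard_error = 0.007059652116074842

# win_ratio is the fraction of battles won.
win_ratio = num_wins / num_battles


def binomial_standard_error(p: float, n: float) -> float:
    """Standard error of a proportion p estimated from n samples."""
    return math.sqrt(p * (1.0 - p) / n)


# ASSUMPTION: the reported standard error follows the binomial formula.
# Solving that formula for n gives the sample size it would imply.
implied_n = (
    family_friendly_score * (1.0 - family_friendly_score)
    / reported_standard_error ** 2
)

print("win_ratio:", win_ratio)
print("implied sample size:", implied_n)
print("SE recomputed at that size:",
      binomial_standard_error(family_friendly_score, round(implied_n)))
```

If the implied sample size comes out close to a round number, that supports the binomial reading but does not confirm how the scorer actually computes the field.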

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
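The formatter templates above are ordinary Python format strings, so a prompt can be illustrated with str.format. This is a minimal sketch under stated assumptions: the concatenation order (memory, prompt, chat turns, response header) is a guess, the function name build_prompt and the sample names and messages are hypothetical, and truncation (max_input_tokens, truncate_by_message) is not modeled.

```python
# Templates copied from the formatter field above.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(fmt, bot_name, user_name, memory, prompt, turns):
    """Assemble one model input from the templates (hypothetical helper).

    turns is a list of (speaker, message) pairs, speaker being "bot" or "user".
    ASSUMPTION: sections are joined in the order memory, prompt, turns, response.
    """
    parts = [
        fmt["memory_template"].format(memory=memory),
        fmt["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(fmt["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(fmt["user_template"].format(user_name=user_name, message=message))
    # The response header ends with "{bot_name}:" so the model completes the bot's line.
    parts.append(fmt["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Hypothetical conversation, for illustration only.
example = build_prompt(
    formatter,
    bot_name="Aria",
    user_name="Sam",
    memory="Aria is a cheerful guide.",
    prompt="A chat in a mountain village.",
    turns=[("user", "Hi there!"), ("bot", "Welcome to the village!"), ("user", "What is nearby?")],
)
print(example)
```

Every message template ends in a newline and stopping_words is ['\n'], so generation would stop at the end of a single bot line, which fits the 64-token max_output_tokens limit.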


Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.176832914352417s
Received healthy response to inference request in 3.3439812660217285s
Received healthy response to inference request in 5.492914199829102s
Received healthy response to inference request in 3.2062487602233887s
Received healthy response to inference request in 2.5852746963500977s
5 requests
0 failed requests
5th percentile: 2.709469509124756
10th percentile: 2.833664321899414
20th percentile: 3.0820539474487303
30th percentile: 3.2337952613830567
40th percentile: 3.2888882637023924
50th percentile: 3.3439812660217285
60th percentile: 3.6771219253540037
70th percentile: 4.010262584686279
80th percentile: 4.440049171447754
90th percentile: 4.966481685638428
95th percentile: 5.229697942733765
99th percentile: 5.440270948410034
mean time: 3.7610503673553466
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9007234573364258s
Received healthy response to inference request in 3.2356021404266357s
Received healthy response to inference request in 2.942605495452881s
Received healthy response to inference request in 2.0772409439086914s
Received healthy response to inference request in 4.464784860610962s
5 requests
0 failed requests
5th percentile: 1.9360269546508788
10th percentile: 1.971330451965332
20th percentile: 2.0419374465942384
30th percentile: 2.2503138542175294
40th percentile: 2.5964596748352053
50th percentile: 2.942605495452881
60th percentile: 3.059804153442383
70th percentile: 3.1770028114318847
80th percentile: 3.4814386844635012
90th percentile: 3.973111772537232
95th percentile: 4.218948316574097
99th percentile: 4.415617551803589
mean time: 2.924191379547119
Pipeline stage StressChecker completed in 36.75s
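The percentile lines reported by StressChecker are consistent with linear interpolation between the sorted latencies, which is the default behaviour of numpy.percentile. Whether the pipeline actually uses NumPy is an assumption; the sketch below reimplements the interpolation with the standard library and applies it to the five latencies from the first batch of requests.

```python
def percentile(samples, q):
    """Percentile q (0-100) using linear interpolation between order statistics."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Latencies in seconds, copied from the first five "healthy response" log lines.
first_batch = [
    4.176832914352417,
    3.3439812660217285,
    5.492914199829102,
    3.2062487602233887,
    2.5852746963500977,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(first_batch, q)}")
print("mean time:", sum(first_batch) / len(first_batch))
```

With only five samples per batch, the high percentiles are interpolations between the two slowest requests rather than measured tail latencies.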
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.69s
Shutdown handler de-registered
function_fihar_2025-08-07 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3418.48s
Shutdown handler de-registered
function_fihar_2025-08-07 status is now inactive due to auto deactivation removed underperforming models
function_fihar_2025-08-07 status is now torndown due to DeploymentManager action