developer_uid: chai_backend_admin
submission_id: function_rurat_2025-06-26
model_name: function_rurat_2025-06-26
model_group:
status: torndown
timestamp: 2025-06-26T21:40:39+00:00
num_battles: 9700
num_wins: 5075
celo_rating: 1303.99
family_friendly_score: 0.5236000000000001
family_friendly_standard_error: 0.007063186816161669
submission_type: function
display_name: function_rurat_2025-06-26
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-06-26
win_ratio: 0.5231958762886598
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
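The `formatter` entry above only lists templates, so the following is a minimal sketch of how they might be combined into one model input. The assembly order (memory, then prompt, then chat turns, then the response stub), the `build_prompt` helper, and the example persona are assumptions for illustration, not the platform's confirmed behaviour. Truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the formatter entry above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response stub ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    memory="Mara is a ship captain.",
    prompt="Mara greets a new sailor.",
    turns=[("user", "Hello"), ("bot", "Welcome aboard")],
    bot_name="Mara",
    user_name="Sam",
)
```

Because each chat template ends in `\n` and `stopping_words` is `['\n']`, generation under this layout would stop at the end of a single bot line.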
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.027920961380005s
Received healthy response to inference request in 5.017756223678589s
Received healthy response to inference request in 4.230998516082764s
Received healthy response to inference request in 4.562787294387817s
Received healthy response to inference request in 3.840380907058716s
5 requests
0 failed requests
5th percentile: 3.1904129505157472
10th percentile: 3.352904939651489
20th percentile: 3.6778889179229735
30th percentile: 3.9185044288635256
40th percentile: 4.074751472473144
50th percentile: 4.230998516082764
60th percentile: 4.363714027404785
70th percentile: 4.496429538726806
80th percentile: 4.653781080245972
90th percentile: 4.8357686519622805
95th percentile: 4.926762437820434
99th percentile: 4.999557466506958
mean time: 4.135968780517578
%s, retrying in %s seconds...
Received healthy response to inference request in 7.229426383972168s
Received healthy response to inference request in 4.186354875564575s
Received healthy response to inference request in 4.396915435791016s
Received healthy response to inference request in 6.0808165073394775s
Received healthy response to inference request in 3.0662782192230225s
5 requests
0 failed requests
5th percentile: 3.290293550491333
10th percentile: 3.5143088817596437
20th percentile: 3.9623395442962646
30th percentile: 4.228466987609863
40th percentile: 4.3126912117004395
50th percentile: 4.396915435791016
60th percentile: 5.0704758644104
70th percentile: 5.744036293029785
80th percentile: 6.310538482666016
90th percentile: 6.769982433319091
95th percentile: 6.99970440864563
99th percentile: 7.18348198890686
mean time: 4.991958284378052
%s, retrying in %s seconds...
Received healthy response to inference request in 2.4861345291137695s
Received healthy response to inference request in 2.654405355453491s
Received healthy response to inference request in 4.229931116104126s
Received healthy response to inference request in 3.0543479919433594s
Received healthy response to inference request in 2.853980302810669s
5 requests
0 failed requests
5th percentile: 2.519788694381714
10th percentile: 2.553442859649658
20th percentile: 2.6207511901855467
30th percentile: 2.6943203449249267
40th percentile: 2.774150323867798
50th percentile: 2.853980302810669
60th percentile: 2.934127378463745
70th percentile: 3.0142744541168214
80th percentile: 3.289464616775513
90th percentile: 3.7596978664398195
95th percentile: 3.9948144912719723
99th percentile: 4.182907791137695
mean time: 3.055759859085083
Pipeline stage StressChecker completed in 64.92s
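The percentile and mean figures reported by the StressChecker runs above are consistent with linear interpolation between the sorted latencies. The sketch below reproduces the third run's figures from its five logged response times. The `percentile` helper is written here for illustration, and that the pipeline uses this exact method is an assumption inferred from the numbers.

```python
import math

# Response times (seconds) from the third StressChecker run in the log.
latencies = [
    2.4861345291137695,
    2.654405355453491,
    4.229931116104126,
    3.0543479919433594,
    2.853980302810669,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
mean_time = sum(latencies) / len(latencies)
```

With only five samples per run, the high percentiles are interpolated between the two slowest requests, so they indicate the spread of the run rather than estimating tail latency.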
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_rurat_2025-06-26 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3902.13s
Shutdown handler de-registered
function_rurat_2025-06-26 status is now inactive due to auto deactivation removed underperforming models
function_rurat_2025-06-26 status is now torndown due to DeploymentManager action