function_bukel_2025-07-29

developer_uid: chai_backend_admin

submission_id: function_bukel_2025-07-29

model_name: function_bukel_2025-07-29

model_group:

status: torndown

timestamp: 2025-07-29T20:25:28+00:00

num_battles: 5289

num_wins: 2734

celo_rating: 1289.62

family_friendly_score: 0.5194

family_friendly_standard_error: 0.00706574327300391
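The reported standard error is numerically consistent with a binomial standard error, sqrt(p * (1 - p) / n), at n = 5000 scored samples. The sample count does not appear anywhere on this page, so `assumed_samples` below is an assumption inferred from the numbers, not a documented value. A minimal sketch:

```python
import math

# Value reported above.
family_friendly_score = 0.5194

# ASSUMPTION: the number of scored samples is not stated in the source;
# 5000 is inferred because it reproduces the reported standard error.
assumed_samples = 5000

# Binomial standard error of a proportion.
standard_error = math.sqrt(
    family_friendly_score * (1 - family_friendly_score) / assumed_samples
)
print(standard_error)
```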

submission_type: function

display_name: function_bukel_2025-07-29

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-07-29

win_ratio: 0.5169219134051806
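The win ratio above is num_wins divided by num_battles, which can be checked directly from the two reported counts:

```python
# Counts reported above.
num_battles = 5289
num_wins = 2734

# Fraction of battles won.
win_ratio = num_wins / num_battles
print(win_ratio)
```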

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
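As a rough illustration of how the formatter templates above might be combined into a single model input, here is a minimal sketch. The assembly order (memory, prompt, chat turns, response header), the `build_prompt` helper, and the sample conversation are all assumptions made for illustration; the source shows only the templates, not the code that applies them, and truncation to max_input_tokens is not modelled.

```python
# Templates copied from the formatter entry above.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order."""
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up sample conversation.
text = build_prompt(
    memory="A friendly guide.",
    prompt="Ava greets the traveller.",
    turns=[("bot", "Hello there!"), ("user", "Hi, who are you?")],
    bot_name="Ava",
    user_name="Sam",
)
print(text)
```

Because stopping_words is ['\n'] in the generation parameters, a completion following this prompt would end at the first newline, which fits the one-message-per-line layout of the bot and user templates.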

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 5.13249135017395s
Received healthy response to inference request in 3.3864147663116455s
Received healthy response to inference request in 4.691150903701782s
Received healthy response to inference request in 4.082300186157227s
Received healthy response to inference request in 4.57013463973999s
5 requests
0 failed requests
5th percentile: 3.5255918502807617
10th percentile: 3.664768934249878
20th percentile: 3.9431231021881104
30th percentile: 4.179867076873779
40th percentile: 4.375000858306885
50th percentile: 4.57013463973999
60th percentile: 4.618541145324707
70th percentile: 4.666947650909424
80th percentile: 4.779418992996216
90th percentile: 4.955955171585083
95th percentile: 5.044223260879517
99th percentile: 5.114837732315063
mean time: 4.372498369216919
%s, retrying in %s seconds...
Received healthy response to inference request in 2.4103193283081055s
Received healthy response to inference request in 2.889756917953491s
Received healthy response to inference request in 2.723507881164551s
Received healthy response to inference request in 2.8949742317199707s
Received healthy response to inference request in 3.507498264312744s
5 requests
0 failed requests
5th percentile: 2.4729570388793944
10th percentile: 2.5355947494506834
20th percentile: 2.660870170593262
30th percentile: 2.756757688522339
40th percentile: 2.823257303237915
50th percentile: 2.889756917953491
60th percentile: 2.891843843460083
70th percentile: 2.893930768966675
80th percentile: 3.0174790382385255
90th percentile: 3.2624886512756346
95th percentile: 3.3849934577941894
99th percentile: 3.482997303009033
mean time: 2.8852113246917725
Pipeline stage StressChecker completed in 38.54s
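The percentile and mean figures in the StressChecker output match linear interpolation between sorted samples. The sketch below recomputes them for the second batch of five requests; the `percentile` helper is an illustrative reimplementation, not the pipeline's actual code.

```python
# Latencies (seconds) from the second batch of five requests above.
latencies = [
    2.4103193283081055,
    2.889756917953491,
    2.723507881164551,
    2.8949742317199707,
    3.507498264312744,
]


def percentile(values, q):
    """Percentile with linear interpolation between the closest ranks."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
mean_time = sum(latencies) / len(latencies)
print(p5, p50, p99, mean_time)
```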
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.41s
Shutdown handler de-registered
function_bukel_2025-07-29 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 4353.65s
Shutdown handler de-registered
function_bukel_2025-07-29 status is now inactive due to auto deactivation removed underperforming models
function_bukel_2025-07-29 status is now torndown due to DeploymentManager action