function_tukab_2025-12-20

developer_uid: chai_backend_admin

submission_id: function_tukab_2025-12-20

model_name: function_tukab_2025-12-20

model_group:

status: torndown

timestamp: 2025-12-23T19:41:21+00:00

num_battles: 7606

num_wins: 3904

celo_rating: 1302.46

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: function_tukab_2025-12-20

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-23

win_ratio: 0.5132789902708388

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
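
A minimal sketch of how the formatter templates above could be combined into a single prompt string. The assembly order (memory, then prompt, then chat turns, then the response header) and the helper name `build_prompt` are assumptions made for illustration, not the platform's confirmed behaviour. Truncation to `max_input_tokens` is not modelled here.

```python
# Template strings copied from the formatter entry above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    memory="Ava is a friendly guide.",
    prompt="A chat in a museum.",
    turns=[("user", "Hi!"), ("bot", "Welcome.")],
    bot_name="Ava",
    user_name="Sam",
)
```

Because `stopping_words` is `['\n']` and the response template ends on the bot's name, generation under these settings would stop at the end of a single bot line.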

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 8.664310216903687s
Received healthy response to inference request in 5.376091718673706s
Received healthy response to inference request in 3.295527696609497s
Received healthy response to inference request in 2.8084185123443604s
Received healthy response to inference request in 5.749917030334473s
Received healthy response to inference request in 6.081613779067993s
Received healthy response to inference request in 4.873844861984253s
Received healthy response to inference request in 5.684683322906494s
Received healthy response to inference request in 6.0176842212677s
Received healthy response to inference request in 4.498592138290405s
10 requests
0 failed requests
5th percentile: 3.027617645263672
10th percentile: 3.2468167781829833
20th percentile: 4.257979249954223
30th percentile: 4.761269044876099
40th percentile: 5.175192975997925
50th percentile: 5.5303875207901
60th percentile: 5.7107768058776855
70th percentile: 5.830247187614441
80th percentile: 6.030470132827759
90th percentile: 6.339883422851561
95th percentile: 7.502096819877622
99th percentile: 8.431867537498475
mean time: 5.3050683498382565
Pipeline stage StressChecker completed in 54.42s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_tukab_2025-12-20 status is now deployed due to DeploymentManager action
function_tukab_2025-12-20 status is now inactive due to auto deactivation removed underperforming models
function_tukab_2025-12-20 status is now torndown due to DeploymentManager action
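
The StressChecker percentiles and mean reported in the log can be reproduced from the ten logged response times. The sketch below assumes linear interpolation between order statistics (the same rule NumPy's `percentile` uses by default); the log does not say which method the pipeline itself uses, and the helper name `percentile` is illustrative.

```python
import math

# Response times in seconds, copied from the "Received healthy response" log lines.
latencies = [
    8.664310216903687,
    5.376091718673706,
    3.295527696609497,
    2.8084185123443604,
    5.749917030334473,
    6.081613779067993,
    4.873844861984253,
    5.684683322906494,
    6.0176842212677,
    4.498592138290405,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)
p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)
p99 = percentile(latencies, 99)
```

With only 10 requests, the upper percentiles are dominated by the single slowest response (about 8.66 s), so they are a rough indication rather than a stable tail estimate.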