function_lugim_2025-12-18

developer_uid: chai_backend_admin

submission_id: function_lugim_2025-12-18

model_name: function_lugim_2025-12-18

model_group:

status: torndown

timestamp: 2026-01-14T16:59:55+00:00

num_battles: 3225

num_wins: 1708

celo_rating: 1313.84

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: function_lugim_2025-12-18

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-17

win_ratio: 0.5296124031007752

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
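
The `formatter` templates above can be read as pieces of one prompt string. The sketch below is only an illustration of how they might be joined: the order (memory, prompt, chat turns, response header) and the `build_prompt` helper with its arguments are assumptions, not the platform's confirmed implementation. Truncation (`truncate_by_message`, `max_input_tokens`) is not modeled.

```python
# Template strings copied from the `formatter` field above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates into one prompt string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user". The ordering used here is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up example values, for illustration only.
example = build_prompt(
    memory="Ada is a friendly guide.",
    prompt="A short chat.",
    turns=[("bot", "Hello!"), ("user", "Hi there.")],
    bot_name="Ada",
    user_name="Sam",
)
print(example)
```

With `stopping_words` set to `['\n']` in `generation_params`, generation would stop at the first newline, so under this reading each reply is a single line that continues after the trailing `Ada:`.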

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.297135591506958s
Received healthy response to inference request in 1.342853307723999s
Received healthy response to inference request in 1.9162523746490479s
Received healthy response to inference request in 1.8009881973266602s
Received healthy response to inference request in 1.0848894119262695s
Received healthy response to inference request in 3.832841634750366s
Received healthy response to inference request in 1.1124753952026367s
Received healthy response to inference request in 1.27972412109375s
Received healthy response to inference request in 3.6518442630767822s
10 requests
1 failed requests
5th percentile: 1.0973031044006347
10th percentile: 1.109716796875
20th percentile: 1.2462743759155273
30th percentile: 1.3239145517349242
40th percentile: 1.6177342414855957
50th percentile: 1.858620285987854
60th percentile: 2.068605661392212
70th percentile: 2.7035481929779053
80th percentile: 3.688043737411499
90th percentile: 5.459974288940424
95th percentile: 12.782071232795698
99th percentile: 18.639748787879945
mean time: 3.842317247390747
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9271128177642822s
Received healthy response to inference request in 1.8030564785003662s
Received healthy response to inference request in 1.2418055534362793s
Received healthy response to inference request in 2.0288467407226562s
Received healthy response to inference request in 1.7468411922454834s
Received healthy response to inference request in 2.2939867973327637s
Received healthy response to inference request in 2.3000073432922363s
Received healthy response to inference request in 1.7711842060089111s
Received healthy response to inference request in 2.701240062713623s
Received healthy response to inference request in 3.0235605239868164s
10 requests
0 failed requests
5th percentile: 1.469071590900421
10th percentile: 1.696337628364563
20th percentile: 1.7663156032562255
30th percentile: 1.7934947967529298
40th percentile: 1.8774902820587158
50th percentile: 1.9779797792434692
60th percentile: 2.134902763366699
70th percentile: 2.2957929611206054
80th percentile: 2.3802538871765138
90th percentile: 2.7334721088409424
95th percentile: 2.878516316413879
99th percentile: 2.994551682472229
mean time: 2.083764171600342
Pipeline stage StressChecker completed in 62.15s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.57s
Shutdown handler de-registered
function_lugim_2025-12-18 status is now deployed due to DeploymentManager action
function_lugim_2025-12-18 status is now protected due to ABTestQueueItem
function_lugim_2025-12-18 status is now inactive due to ABTestQueueItem
function_lugim_2025-12-18 status is now torndown due to DeploymentManager action
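
The percentile and mean lines printed by the StressChecker stage for the second batch of 10 requests are consistent with linear interpolation between sorted samples. The sketch below recomputes them from the logged response times. The `percentile` helper is written here for illustration and is an assumption about the method, not the checker's actual code. The first batch cannot be recomputed the same way because the duration of its failed request is not logged.

```python
# Response times (seconds) copied from the second batch of healthy responses in the log.
times = [
    1.9271128177642822,
    1.8030564785003662,
    1.2418055534362793,
    2.0288467407226562,
    1.7468411922454834,
    2.2939867973327637,
    2.3000073432922363,
    1.7711842060089111,
    2.701240062713623,
    3.0235605239868164,
]


def percentile(values, p):
    """Percentile with linear interpolation between the two nearest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(times) / len(times)

for p in (5, 50, 90, 99):
    print(f"{p}th percentile: {percentile(times, p)}")
print(f"mean time: {mean_time}")
```

The printed values should agree with the logged `5th percentile`, `50th percentile`, `90th percentile`, `99th percentile` and `mean time` lines for that batch, up to floating-point rounding.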