developer_uid: chai_backend_admin
submission_id: function_seben_2026-04-01
model_name: function_seben_2026-04-01
model_group:
status: inactive
timestamp: 2026-04-01T06:36:59+00:00
num_battles: 10621
num_wins: 5219
celo_rating: 8404.12
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: function_seben_2026-04-01
is_internal_developer: True
ranking_group: single
us_pacific_date: 2026-03-31
win_ratio: 0.4913849919969871
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
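The formatter templates above can be combined into a single inference prompt. The sketch below is an assumption about assembly order (memory, then prompt, then conversation turns, then the response header); the templates themselves are copied verbatim from the log, while `build_prompt` and its arguments are hypothetical helpers for illustration.

```python
# Hypothetical sketch of prompt assembly from the logged formatter templates.
# The template strings are from the log; the assembly order is an assumption.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}

def build_prompt(memory, prompt, turns, bot_name):
    # turns: list of (speaker, message) pairs in conversation order
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        tmpl = formatter["bot_template"] if speaker == bot_name else formatter["user_template"]
        # Extra keyword arguments to str.format are ignored, so we pass both names.
        parts.append(tmpl.format(bot_name=speaker, user_name=speaker, message=message))
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)

example = build_prompt("Be helpful.", "A chat.", [("User", "Hi"), ("Bot", "Hello")], "Bot")
```

Under these assumptions the result ends with `### Response:\nBot:`, which leaves the model to complete the bot's next message; the `'\n'` stopping word in `generation_params` then cuts generation at the end of that single line.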

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.0083634853363037s
Received healthy response to inference request in 1.9343395233154297s
Received healthy response to inference request in 5.085489988327026s
Received healthy response to inference request in 1.9718716144561768s
Received healthy response to inference request in 4.376709461212158s
Received healthy response to inference request in 1.9438376426696777s
Received healthy response to inference request in 3.887385368347168s
Received healthy response to inference request in 4.639249801635742s
Received healthy response to inference request in 2.994988203048706s
10 requests
1 failed request
5th percentile: 1.9386136770248412
10th percentile: 1.942887830734253
20th percentile: 1.966264820098877
30th percentile: 2.688053226470947
40th percentile: 3.0030133724212646
50th percentile: 3.447874426841736
60th percentile: 4.0831150054931635
70th percentile: 4.455471563339233
80th percentile: 4.728497838973999
90th percentile: 6.589292073249812
95th percentile: 13.35640145540236
99th percentile: 18.770088961124422
mean time: 4.996574592590332
%s, retrying in %s seconds...
Received healthy response to inference request in 2.544293165206909s
Received healthy response to inference request in 1.9131278991699219s
Received healthy response to inference request in 2.4615914821624756s
Received healthy response to inference request in 1.7350163459777832s
Received healthy response to inference request in 1.9008989334106445s
Received healthy response to inference request in 1.9289875030517578s
readfrom tcp 127.0.0.1:41268->127.0.0.1:8080: write tcp 127.0.0.1:41268->127.0.0.1:8080: use of closed network connection
Received unhealthy response to inference request!
Received healthy response to inference request in 5.536893367767334s
Received healthy response to inference request in 3.0305254459381104s
Received healthy response to inference request in 3.6507678031921387s
10 requests
1 failed request
5th percentile: 0.9475700736045838
10th percentile: 1.5918442964553834
20th percentile: 1.8677224159240722
30th percentile: 1.9094592094421388
40th percentile: 1.9226436614990234
50th percentile: 2.1952894926071167
60th percentile: 2.494672155380249
70th percentile: 2.6901628494262693
80th percentile: 3.154573917388916
90th percentile: 3.8393803596496574
95th percentile: 4.688136863708494
99th percentile: 5.367142066955567
mean time: 2.500539779663086
%s, retrying in %s seconds...
Received healthy response to inference request in 2.566366672515869s
Received healthy response to inference request in 4.316020727157593s
Received healthy response to inference request in 3.849745512008667s
Received healthy response to inference request in 3.7139945030212402s
Received healthy response to inference request in 3.694094657897949s
Received healthy response to inference request in 3.1256630420684814s
Received healthy response to inference request in 3.107029676437378s
Received healthy response to inference request in 2.120342493057251s
Received healthy response to inference request in 2.9969794750213623s
Received healthy response to inference request in 2.9926459789276123s
10 requests
0 failed requests
5th percentile: 2.321053373813629
10th percentile: 2.5217642545700074
20th percentile: 2.9073901176452637
30th percentile: 2.995679426193237
40th percentile: 3.0630095958709718
50th percentile: 3.1163463592529297
60th percentile: 3.353035688400268
70th percentile: 3.7000646114349367
80th percentile: 3.7411447048187254
90th percentile: 3.8963730335235596
95th percentile: 4.106196880340575
99th percentile: 4.2740559577941895
mean time: 3.2482882738113403
Pipeline stage StressChecker completed in 114.18s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
Shutdown handler de-registered
function_seben_2026-04-01 status is now deployed due to DeploymentManager action
function_seben_2026-04-01 status is now inactive due to auto-deactivation of underperforming models