developer_uid: chai_backend_admin
submission_id: function_sureb_2025-12-17
model_name: function_sureb_2025-12-17
model_group:
status: torndown
timestamp: 2025-12-20T16:21:21+00:00
num_battles: 6621
num_wins: 3356
celo_rating: 1297.99
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: function_sureb_2025-12-17
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.5068720737048784
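The win_ratio above is simply num_wins / num_battles. A quick check in Python (assuming no draws or other weighting enter the ratio) reproduces the logged value:

```python
num_battles = 6621
num_wins = 3356

# 3356 / 6621 ≈ 0.5068720737048784, matching the win_ratio field above.
win_ratio = num_wins / num_battles
print(win_ratio)
```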
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
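The generation_params map onto standard sampling options. As a minimal sketch, assuming a vLLM-style backend (the actual serving stack is not shown in this log, and availability of min_p and best_of depends on the vLLM version), the settings could be wired up like this; max_input_tokens is not a sampling parameter and would be enforced separately by truncating the prompt:

```python
from vllm import SamplingParams  # assumption: vLLM-style serving, not confirmed by this log

params = SamplingParams(
    temperature=1.0,
    top_p=1.0,
    min_p=0.0,
    top_k=40,
    presence_penalty=0.0,
    frequency_penalty=0.0,
    stop=["\n"],    # stopping_words: generation ends at the first newline
    best_of=8,      # sample 8 candidates per request, keep the best-scoring one
    max_tokens=64,  # max_output_tokens
)
# max_input_tokens=1024 would be handled before generation, e.g. by dropping the
# oldest chat messages until the formatted prompt fits within 1024 tokens.
```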
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
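The formatter templates describe an Alpaca-style layout (### Instruction / ### Input / ### Response). A small sketch of how the final prompt string could be assembled from them; the function name and argument shapes are illustrative, not taken from the platform's code:

```python
def build_prompt(formatter: dict, memory: str, prompt: str,
                 history: list[tuple[str, str, bool]], bot_name: str) -> str:
    """history is a list of (speaker_name, message, is_bot) tuples, oldest first."""
    text = formatter["memory_template"].format(memory=memory)
    text += formatter["prompt_template"].format(prompt=prompt)
    for name, message, is_bot in history:
        template = formatter["bot_template"] if is_bot else formatter["user_template"]
        key = "bot_name" if is_bot else "user_name"
        text += template.format(**{key: name, "message": message})
    # The response template ends with "{bot_name}:", so the model continues the bot's turn.
    text += formatter["response_template"].format(bot_name=bot_name)
    return text
```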
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 19.25852060317993s
Received healthy response to inference request in 6.881927728652954s
Received healthy response to inference request in 14.481434106826782s
Received healthy response to inference request in 13.763047933578491s
Received healthy response to inference request in 11.344304323196411s
Received healthy response to inference request in 19.198338508605957s
Received healthy response to inference request in 18.631489515304565s
Received healthy response to inference request in 7.566990613937378s
Received healthy response to inference request in 7.645893573760986s
10 requests
1 failed request
5th percentile: 7.1902060270309445
10th percentile: 7.498484325408936
20th percentile: 7.630112981796264
30th percentile: 10.234781098365783
40th percentile: 12.79555048942566
50th percentile: 14.122241020202637
60th percentile: 16.141456270217894
70th percentile: 18.80154421329498
80th percentile: 19.210374927520753
90th percentile: 19.344756746292113
95th percentile: 19.732819390296935
99th percentile: 20.043269505500792
mean time: 13.88928289413452
%s, retrying in %s seconds...
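The retry message above has its placeholders unsubstituted, but the log shows three batches of 10 requests separated by this message, so the StressChecker appears to re-run the batch after a delay; the exact trigger and timing are not visible here. A minimal sketch of a generic retry-with-delay pattern of that shape, with the attempt count and delay as assumptions:

```python
import time

def run_with_retries(run_batch, attempts=3, delay=5):
    """Re-run a stress-check batch, waiting `delay` seconds between attempts."""
    for attempt in range(attempts):
        try:
            return run_batch()
        except Exception as error:
            if attempt == attempts - 1:
                raise
            print(f"{error}, retrying in {delay} seconds...")
            time.sleep(delay)
```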
Received healthy response to inference request in 8.124426126480103s
Received healthy response to inference request in 8.84844422340393s
Received healthy response to inference request in 9.617968797683716s
Received healthy response to inference request in 10.821465492248535s
Received healthy response to inference request in 12.949156999588013s
Received healthy response to inference request in 6.0609893798828125s
Received healthy response to inference request in 1.3372924327850342s
Received healthy response to inference request in 9.053922414779663s
Received healthy response to inference request in 5.896776914596558s
Received healthy response to inference request in 7.036293983459473s
10 requests
0 failed requests
5th percentile: 3.38906044960022
10th percentile: 5.440828466415406
20th percentile: 6.028146886825562
30th percentile: 6.743702602386474
40th percentile: 7.68917326927185
50th percentile: 8.486435174942017
60th percentile: 8.930635499954224
70th percentile: 9.22313632965088
80th percentile: 9.85866813659668
90th percentile: 11.034234642982483
95th percentile: 11.991695821285246
99th percentile: 12.75766476392746
mean time: 7.9746736764907835
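The percentiles logged for this batch are consistent with linear interpolation over the ten raw latencies (numpy's default); the platform's actual implementation is not shown. A sketch that reproduces the figures above:

```python
import numpy as np

# Raw latencies (seconds) of the ten healthy responses in the batch above.
latencies = [
    8.124426126480103, 8.84844422340393, 9.617968797683716, 10.821465492248535,
    12.949156999588013, 6.0609893798828125, 1.3372924327850342, 9.053922414779663,
    5.896776914596558, 7.036293983459473,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {np.percentile(latencies, q)}")  # default linear interpolation
print(f"mean time: {np.mean(latencies)}")
```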
%s, retrying in %s seconds...
Received healthy response to inference request in 7.186742782592773s
Received healthy response to inference request in 8.81411623954773s
Received healthy response to inference request in 9.949222564697266s
Received healthy response to inference request in 7.577803373336792s
Received healthy response to inference request in 8.138960361480713s
Received healthy response to inference request in 4.7984702587127686s
Received healthy response to inference request in 6.074326276779175s
Received healthy response to inference request in 6.168026447296143s
Received healthy response to inference request in 7.0889739990234375s
Received healthy response to inference request in 4.365386724472046s
10 requests
0 failed requests
5th percentile: 4.560274314880371
10th percentile: 4.755161905288697
20th percentile: 5.819155073165893
30th percentile: 6.139916396141052
40th percentile: 6.720594978332519
50th percentile: 7.1378583908081055
60th percentile: 7.343167018890381
70th percentile: 7.746150469779968
80th percentile: 8.273991537094116
90th percentile: 8.927626872062683
95th percentile: 9.438424718379974
99th percentile: 9.847062995433808
mean time: 7.016202902793884
Pipeline stage StressChecker completed in 292.88s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
function_sureb_2025-12-17 status is now deployed due to DeploymentManager action
function_sureb_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_sureb_2025-12-17 status is now torndown due to DeploymentManager action
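The last three lines trace the submission's lifecycle: deployed by the DeploymentManager, auto-deactivated as an underperforming model, then torn down, which matches the status: torndown field at the top. A hypothetical sketch of that state progression; the enum and transition list are illustrative, not the platform's code:

```python
from enum import Enum

class SubmissionStatus(Enum):
    DEPLOYED = "deployed"
    INACTIVE = "inactive"
    TORNDOWN = "torndown"

# Transitions observed in this log, in order:
#   deployed -> inactive  (auto deactivation: underperforming models removed)
#   inactive -> torndown  (DeploymentManager tears the deployment down)
lifecycle = [SubmissionStatus.DEPLOYED, SubmissionStatus.INACTIVE, SubmissionStatus.TORNDOWN]
```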