developer_uid: chai_backend_admin
submission_id: function_jinob_2025-12-19
model_name: abtest_blend
model_group:
status: torndown
timestamp: 2025-12-22T02:01:17+00:00
num_battles: 7635
num_wins: 3847
celo_rating: 1332.31
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: abtest_blend
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.5038637851997381
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
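The formatter entry above lists the templates but not how they are combined. The following is a minimal sketch of one plausible assembly order (memory, then prompt, then the chat turns, then the response header). The function name `build_prompt`, the `turns` structure, and the ordering itself are assumptions for illustration, not the platform's confirmed behavior; only the template strings are taken from the record.

```python
# Template strings copied from the formatter entry in this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, bot_name, turns):
    """Hypothetical helper: concatenate the templates in an assumed order.

    `turns` is a list of (role, speaker_name, message) tuples, where role
    is either "user" or "bot".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for role, name, message in turns:
        if role == "bot":
            parts.append(formatter["bot_template"].format(bot_name=name, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=name, message=message))
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    FORMATTER,
    memory="Ava is a cheerful guide.",
    prompt="Ava greets the user.",
    bot_name="Ava",
    turns=[("user", "Sam", "Hi"), ("bot", "Ava", "Hello!")],
)
print(example)
```

Because generation_params sets stopping_words to ['\n'], a prompt ending in the bot's name and a colon would yield a single-line reply under this reading.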
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.9180920124053955s
Received healthy response to inference request in 2.992612838745117s
Received healthy response to inference request in 2.9025096893310547s
Received healthy response to inference request in 3.7675790786743164s
Received healthy response to inference request in 5.162407159805298s
Received healthy response to inference request in 3.8698952198028564s
Received healthy response to inference request in 3.7854106426239014s
Received healthy response to inference request in 3.6892192363739014s
Received healthy response to inference request in 4.669875144958496s
10 requests
1 failed requests
5th percentile: 2.943056106567383
10th percentile: 2.983602523803711
20th percentile: 3.5498979568481444
30th percentile: 3.744071125984192
40th percentile: 3.7782780170440673
50th percentile: 3.827652931213379
60th percentile: 3.8891739368438722
70th percentile: 4.143626952171325
80th percentile: 4.768381547927857
90th percentile: 6.658191299438471
95th percentile: 13.389219927787764
99th percentile: 18.774042830467227
mean time: 5.487784957885742
%s, retrying in %s seconds...
Received healthy response to inference request in 5.048281908035278s
Received healthy response to inference request in 3.1244986057281494s
Received healthy response to inference request in 4.803505897521973s
Received healthy response to inference request in 3.2030906677246094s
Received healthy response to inference request in 4.111941814422607s
Received healthy response to inference request in 5.315987586975098s
Received healthy response to inference request in 6.8297436237335205s
Received healthy response to inference request in 4.386173486709595s
Received healthy response to inference request in 4.613597869873047s
Received healthy response to inference request in 5.611617565155029s
10 requests
0 failed requests
5th percentile: 3.1598650336265566
10th percentile: 3.1952314615249633
20th percentile: 3.930171585083008
30th percentile: 4.303903985023498
40th percentile: 4.522628116607666
50th percentile: 4.70855188369751
60th percentile: 4.901416301727295
70th percentile: 5.128593611717224
80th percentile: 5.375113582611084
90th percentile: 5.733430171012878
95th percentile: 6.281586897373198
99th percentile: 6.720112278461457
mean time: 4.704843902587891
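The percentile and mean figures in the report above are consistent with linear interpolation between sorted samples. The sketch below recomputes them from the ten latencies logged for the second run. The helper names `percentile` and `summarize` are illustrative, and the interpolation method is inferred from the numbers rather than confirmed by the log.

```python
import math

# Latencies (seconds) of the ten healthy responses in the second StressChecker run.
LATENCIES = [
    5.048281908035278, 3.1244986057281494, 4.803505897521973,
    3.2030906677246094, 4.111941814422607, 5.315987586975098,
    6.8297436237335205, 4.386173486709595, 4.613597869873047,
    5.611617565155029,
]


def percentile(samples, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def summarize(samples, quantiles=(5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)):
    """Collect the same statistics the report prints: percentiles and mean."""
    return {
        "percentiles": {q: percentile(samples, q) for q in quantiles},
        "mean": sum(samples) / len(samples),
    }


summary = summarize(LATENCIES)
for q, value in summary["percentiles"].items():
    print(f"{q}th percentile: {value}")
print(f"mean time: {summary['mean']}")
```

The same calculation explains the first run's long tail: with only ten samples, a single request that hits the 20-second read timeout pulls the 95th and 99th percentiles far above the median.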
Pipeline stage StressChecker completed in 104.94s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.66s
Shutdown handler de-registered
function_jinob_2025-12-19 status is now deployed due to DeploymentManager action
function_jinob_2025-12-19 status is now inactive due to auto deactivation removed underperforming models
function_jinob_2025-12-19 status is now torndown due to DeploymentManager action