developer_uid: chai_evaluation_service
submission_id: function_sufar_2025-12-15
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-18T21:21:16+00:00
num_battles: 15353
num_wins: 7531
celo_rating: 1286.54
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-18
win_ratio: 0.49052302481599686
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.60541033744812s
Received healthy response to inference request in 2.312163829803467s
Received healthy response to inference request in 3.2633156776428223s
Received healthy response to inference request in 3.225264310836792s
Received healthy response to inference request in 1.7420976161956787s
Received healthy response to inference request in 1.9390785694122314s
Received healthy response to inference request in 1.940202236175537s
Received healthy response to inference request in 2.010664224624634s
Received healthy response to inference request in 2.8292455673217773s
10 requests
1 failed requests
5th percentile: 1.8307390451431274
10th percentile: 1.9193804740905762
20th percentile: 1.939977502822876
30th percentile: 1.9895256280899047
40th percentile: 2.1915639877319335
50th percentile: 2.4587870836257935
60th percentile: 2.694944429397583
70th percentile: 2.948051190376282
80th percentile: 3.2328745841979982
90th percentile: 4.947782111167902
95th percentile: 12.527881062030774
99th percentile: 18.591960222721102
mean time: 4.1975422382354735
%s, retrying in %s seconds...
Received healthy response to inference request in 2.43623948097229s
Received healthy response to inference request in 2.6174094676971436s
Received healthy response to inference request in 2.135817289352417s
Received healthy response to inference request in 3.4751837253570557s
Received healthy response to inference request in 2.224520444869995s
Received healthy response to inference request in 2.8301966190338135s
Received healthy response to inference request in 2.251300811767578s
Received healthy response to inference request in 1.924288034439087s
Received healthy response to inference request in 2.1549954414367676s
Received healthy response to inference request in 2.301215887069702s
10 requests
0 failed requests
5th percentile: 2.0194761991500854
10th percentile: 2.114664363861084
20th percentile: 2.1511598110198973
30th percentile: 2.203662943840027
40th percentile: 2.240588665008545
50th percentile: 2.27625834941864
60th percentile: 2.3552253246307373
70th percentile: 2.490590476989746
80th percentile: 2.6599668979644777
90th percentile: 2.8946953296661375
95th percentile: 3.184939527511596
99th percentile: 3.4171348857879638
mean time: 2.435116720199585
Pipeline stage StressChecker completed in 70.14s
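The percentile and mean figures in the second StressChecker batch can be reproduced from its ten logged latencies. The log does not say how StressChecker computes them, so the sketch below is an assumption: it uses linear interpolation between closest ranks, which happens to match the reported values. The `percentile` helper and the variable names are hypothetical and are not taken from the pipeline's code.

```python
import math


def percentile(values, q):
    """Return the q-th percentile (0-100) of values.

    Assumption: linear interpolation between the two closest ranks.
    This is a hypothetical helper, not StressChecker's implementation.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Latencies in seconds, copied from the second batch of healthy responses.
latencies = [
    2.43623948097229,
    2.6174094676971436,
    2.135817289352417,
    3.4751837253570557,
    2.224520444869995,
    2.8301966190338135,
    2.251300811767578,
    1.924288034439087,
    2.1549954414367676,
    2.301215887069702,
]

mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only ten samples, the 95th and 99th percentiles are interpolated between the two slowest requests, so a single slow or timed-out request (as in the first batch) dominates the tail figures.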
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_sufar_2025-12-15 status is now deployed due to DeploymentManager action
function_sufar_2025-12-15 status is now inactive due to auto deactivation removed underperforming models
function_sufar_2025-12-15 status is now torndown due to DeploymentManager action