developer_uid: chai_evaluation_service
submission_id: function_tilak_2025-12-17
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-20T13:01:10+00:00
num_battles: 8346
num_wins: 4207
celo_rating: 1296.24
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.5040738078121255
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
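The formatter field above can be read as a set of `str.format` templates. The following is a minimal sketch of how they might be combined into one prompt. The helper `build_prompt`, the assembly order (memory, prompt, chat turns, response header), and the example names are assumptions for illustration and are not taken from the service itself.

```python
# Templates copied from the formatter field above (truncate_by_message omitted).
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: render the templates in an assumed order."""
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    # Each turn is a (speaker, message) pair; speaker is "bot" or "user".
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    memory="You are Richard.",
    prompt="A chat.",
    turns=[("user", "hi"), ("bot", "hello")],
    bot_name="Richard",
    user_name="Sam",
)
```

Because every template ends in a newline except the response header, a stopping word of `'\n'` (as in generation_params) would cut generation at the end of a single bot line.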
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.0916855335235596s
Received healthy response to inference request in 2.5081722736358643s
Received healthy response to inference request in 1.878333330154419s
Received healthy response to inference request in 3.668623208999634s
Received healthy response to inference request in 1.7057011127471924s
Received healthy response to inference request in 2.615746259689331s
Received healthy response to inference request in 2.1776297092437744s
Received healthy response to inference request in 2.1146836280822754s
Received healthy response to inference request in 2.4204206466674805s
10 requests
1 failed requests
5th percentile: 1.7833856105804444
10th percentile: 1.8610701084136962
20th percentile: 2.0490150928497313
30th percentile: 2.1077841997146605
40th percentile: 2.152451276779175
50th percentile: 2.2990251779556274
60th percentile: 2.455521297454834
70th percentile: 2.5404444694519044
80th percentile: 2.826321649551392
90th percentile: 5.312222743034357
95th percentile: 12.708420646190625
99th percentile: 18.62537896871567
mean time: 4.128561425209045
%s, retrying in %s seconds...
Received healthy response to inference request in 2.149698495864868s
Received healthy response to inference request in 2.2836198806762695s
Received healthy response to inference request in 1.7457396984100342s
Received healthy response to inference request in 2.670577049255371s
Received healthy response to inference request in 2.1600089073181152s
Received healthy response to inference request in 2.211562395095825s
Received healthy response to inference request in 2.0625131130218506s
Received healthy response to inference request in 1.9967477321624756s
Received healthy response to inference request in 2.3055579662323s
Received healthy response to inference request in 2.2702174186706543s
10 requests
0 failed requests
5th percentile: 1.8586933135986328
10th percentile: 1.9716469287872314
20th percentile: 2.0493600368499756
30th percentile: 2.123542881011963
40th percentile: 2.1558847427368164
50th percentile: 2.18578565120697
60th percentile: 2.235024404525757
70th percentile: 2.2742381572723387
80th percentile: 2.2880074977874756
90th percentile: 2.342059874534607
95th percentile: 2.5063184618949887
99th percentile: 2.6377253317832947
mean time: 2.1856242656707763
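The percentile and mean figures in the second StressChecker run are consistent with linear interpolation between sorted latencies. The sketch below reproduces a few of them from the ten logged response times. The function `percentile` is a hypothetical helper written for this check, and the interpolation method is an assumption inferred from the numbers, not confirmed from the StressChecker source.

```python
import math

# Response times (seconds) from the ten healthy requests of the second run.
latencies = [
    2.149698495864868,
    2.2836198806762695,
    1.7457396984100342,
    2.670577049255371,
    2.1600089073181152,
    2.211562395095825,
    2.0625131130218506,
    1.9967477321624756,
    2.3055579662323,
    2.2702174186706543,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile with linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile in the sorted list.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


mean_time = sum(latencies) / len(latencies)
p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)
```

The same calculation matches the first run only if the failed request is counted at roughly its 20-second read timeout, which would explain why that run's mean and upper percentiles are so much higher.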
Pipeline stage StressChecker completed in 65.76s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_tilak_2025-12-17 status is now deployed due to DeploymentManager action
function_tilak_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_tilak_2025-12-17 status is now torndown due to DeploymentManager action