developer_uid: chai_evaluation_service
submission_id: function_hesub_2025-12-14
model_name: richard
model_group:
status: inactive
timestamp: 2025-12-14T12:56:36+00:00
num_battles: 8245
num_wins: 4051
celo_rating: 1256.39
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-14
win_ratio: 0.4913280776228017
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
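For illustration, the formatter templates above could be assembled into a single prompt string roughly as follows. This is a minimal sketch, not the service's actual implementation: the function name `build_prompt`, the `turns` structure, and the ordering (memory, prompt, chat turns, response header) are assumptions, and truncation to `max_input_tokens` is not modeled.

```python
# Template strings copied from the `formatter` field above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, bot_name, user_name, turns):
    """Hypothetical helper: join the templates into one prompt string.

    `turns` is an assumed structure: a list of (speaker, message) pairs,
    where speaker is either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("A friendly bot.", "Greet the user.", "richard", "User",
                       [("user", "hi"), ("bot", "hello")]))
```

Because `stopping_words` is `['\n']`, a prompt built this way would end generation at the first newline, which is consistent with single-line chat replies.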
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
('Connection aborted.', ConnectionResetError(104, 'Connection reset by peer'))
Received unhealthy response to inference request!
Received healthy response to inference request in 2.204301118850708s
Received healthy response to inference request in 2.5087902545928955s
Received healthy response to inference request in 1.856391429901123s
Received healthy response to inference request in 1.971372365951538s
Received healthy response to inference request in 2.685096025466919s
Received healthy response to inference request in 2.2539448738098145s
Received healthy response to inference request in 2.372450828552246s
Received healthy response to inference request in 2.449158191680908s
Received healthy response to inference request in 2.376797914505005s
10 requests
1 failed requests
5th percentile: 1.096322524547577
10th percentile: 1.7181970834732057
20th percentile: 1.948376178741455
30th percentile: 2.1344224929809568
40th percentile: 2.234087371826172
50th percentile: 2.3131978511810303
60th percentile: 2.3741896629333494
70th percentile: 2.398505997657776
80th percentile: 2.4610846042633057
90th percentile: 2.526420831680298
95th percentile: 2.6057584285736084
99th percentile: 2.669228506088257
mean time: 2.1152750968933107
%s, retrying in %s seconds...
Received healthy response to inference request in 3.5430490970611572s
Received healthy response to inference request in 2.4559733867645264s
Received healthy response to inference request in 2.747145891189575s
Received healthy response to inference request in 2.1772754192352295s
Received healthy response to inference request in 2.5593326091766357s
Received healthy response to inference request in 2.4905948638916016s
Received healthy response to inference request in 2.15912127494812s
Received healthy response to inference request in 1.7052085399627686s
Received healthy response to inference request in 2.135045051574707s
Received healthy response to inference request in 2.1694984436035156s
10 requests
0 failed requests
5th percentile: 1.8986349701881409
10th percentile: 2.092061400413513
20th percentile: 2.1543060302734376
30th percentile: 2.166385293006897
40th percentile: 2.174164628982544
50th percentile: 2.316624402999878
60th percentile: 2.4698219776153563
70th percentile: 2.5112161874771117
80th percentile: 2.5968952655792235
90th percentile: 2.826736211776733
95th percentile: 3.1848926544189444
99th percentile: 3.4714178085327148
mean time: 2.414224457740784
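The percentile and mean figures above appear consistent with linear interpolation between sorted samples (the default behaviour of `numpy.percentile`), though the log does not state how StressChecker computes them. The sketch below recomputes the statistics from the ten latencies of the second run using only the standard library; the helper name `percentile` and the list name `SECOND_RUN` are illustrative assumptions.

```python
import math

# Latencies (seconds) copied from the ten healthy responses of the second run.
SECOND_RUN = [
    3.5430490970611572, 2.4559733867645264, 2.747145891189575,
    2.1772754192352295, 2.5593326091766357, 2.4905948638916016,
    2.15912127494812, 1.7052085399627686, 2.135045051574707,
    2.1694984436035156,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumption: the same interpolation rule as numpy.percentile's default;
    the actual StressChecker code is not shown in the log.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into sorted data
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(SECOND_RUN, q)}")
    print(f"mean time: {sum(SECOND_RUN) / len(SECOND_RUN)}")
```

Applying the same rule to the first run reproduces its reported mean only if the failed request's duration is included as a tenth sample, which suggests failed requests are counted in the statistics; this is an inference from the numbers, not something the log states.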
Pipeline stage StressChecker completed in 48.10s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
function_hesub_2025-12-14 status is now deployed due to DeploymentManager action
function_hesub_2025-12-14 status is now inactive due to auto deactivation removed underperforming models