developer_uid: chai_evaluation_service
submission_id: function_ponib_2025-12-16
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-19T20:21:22+00:00
num_battles: 11270
num_wins: 5526
celo_rating: 1286.59
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-19
win_ratio: 0.49032830523513754
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
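The formatter above can be read as a set of string templates that are concatenated into one model prompt. The following is a minimal sketch of that assembly, assuming the order memory, prompt, chat turns, response header; the function `build_prompt`, the turn representation, and the sample names and messages are hypothetical and not taken from the service.

```python
# Templates copied from the formatter entry above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, memory, prompt, turns):
    """Concatenate the templates in an assumed order (hypothetical helper).

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends without a newline so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Illustrative values only.
example = build_prompt(
    FORMATTER,
    bot_name="Richard",
    user_name="Alex",
    memory="Richard is a friendly guide.",
    prompt="A conversation in a library.",
    turns=[("user", "Hi there."), ("bot", "Hello, Alex.")],
)
print(example)
```

Because `stopping_words` is `['\n']`, generation under this layout would stop at the end of the bot's first line, which is consistent with the single-line `bot_template`.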
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.0390398502349854s
Received healthy response to inference request in 2.2638089656829834s
Received healthy response to inference request in 2.8651130199432373s
Received healthy response to inference request in 2.1222100257873535s
Received healthy response to inference request in 3.2990708351135254s
Received healthy response to inference request in 2.015947103500366s
Received healthy response to inference request in 2.882280111312866s
Received healthy response to inference request in 2.3046348094940186s
Received healthy response to inference request in 2.5141329765319824s
10 requests
1 failed requests
5th percentile: 2.0637654185295107
10th percentile: 2.1115837335586547
20th percentile: 2.2354891777038572
30th percentile: 2.292387056350708
40th percentile: 2.430333709716797
50th percentile: 2.68962299823761
60th percentile: 2.871979856491089
70th percentile: 2.929308032989502
80th percentile: 3.0910460472106935
90th percentile: 4.981118535995478
95th percentile: 12.550333189964277
99th percentile: 18.605704913139345
mean time: 4.342578554153443
%s, retrying in %s seconds...
Received healthy response to inference request in 2.3063526153564453s
Received healthy response to inference request in 2.053607225418091s
Received healthy response to inference request in 2.208676815032959s
Received healthy response to inference request in 2.2411603927612305s
Received healthy response to inference request in 1.823728084564209s
Received healthy response to inference request in 1.943274736404419s
Received healthy response to inference request in 3.092310905456543s
Received healthy response to inference request in 1.9753968715667725s
Received healthy response to inference request in 1.9559409618377686s
Received healthy response to inference request in 2.541980028152466s
10 requests
0 failed requests
5th percentile: 1.8775240778923035
10th percentile: 1.931320071220398
20th percentile: 1.9534077167510986
30th percentile: 1.9695600986480712
40th percentile: 2.0223230838775637
50th percentile: 2.131142020225525
60th percentile: 2.2216702461242677
70th percentile: 2.260718059539795
80th percentile: 2.3534780979156493
90th percentile: 2.5970131158828735
95th percentile: 2.844662010669708
99th percentile: 3.042781126499176
mean time: 2.21424286365509
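The percentile and mean figures for this second run can be recomputed from the ten response times logged just before them. The sketch below uses linearly interpolated percentiles (the default method of `numpy.percentile`), written in plain Python; whether the StressChecker computes them this way is an assumption, although the 50th, 90th, and 99th percentiles and the mean agree with the logged values. The function name `percentile` is illustrative.

```python
def percentile(samples, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted data
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Response times (seconds) of the ten healthy requests in the second run.
latencies = [
    2.3063526153564453,
    2.053607225418091,
    2.208676815032959,
    2.2411603927612305,
    1.823728084564209,
    1.943274736404419,
    3.092310905456543,
    1.9753968715667725,
    1.9559409618377686,
    2.541980028152466,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only ten samples, the 95th and 99th percentiles are interpolated between the two slowest requests, so a single slow or timed-out request dominates the tail, as in the first run.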
Pipeline stage StressChecker completed in 68.13s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.67s
Shutdown handler de-registered
function_ponib_2025-12-16 status is now deployed due to DeploymentManager action
function_ponib_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_ponib_2025-12-16 status is now torndown due to DeploymentManager action