developer_uid: chai_evaluation_service
submission_id: function_berer_2025-12-15
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-18T17:26:34+00:00
num_battles: 11571
num_wins: 5641
celo_rating: 1284.58
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-18
win_ratio: 0.4875118831561663
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
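Note on the formatter above: the sketch below illustrates how these five templates might be joined into a single prompt string. The assembly order (memory, then prompt, then chat turns, then the response stub), the function name `build_prompt`, and the `turns` structure are assumptions made for illustration; they are not taken from the evaluation service's code. Truncation (`max_input_tokens`, `truncate_by_message`) is not modelled.

```python
# Hypothetical sketch: one plausible way to combine the formatter templates
# from the record above. The ordering of the parts is an assumption.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Join the templates into one string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user" (an illustrative convention, not the service's).
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model completes
    # the bot's next line; generation stops at the first "\n" per stopping_words.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("A friendly guide.", "A chat in a cafe.",
                       [("user", "hi"), ("bot", "hello there")],
                       bot_name="richard", user_name="User"))
```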
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.3154449462890625s
Received healthy response to inference request in 1.7234523296356201s
Received healthy response to inference request in 3.0092861652374268s
Received healthy response to inference request in 3.493104934692383s
Received healthy response to inference request in 2.7314369678497314s
Received healthy response to inference request in 2.3161230087280273s
Received healthy response to inference request in 2.3078973293304443s
Received healthy response to inference request in 1.7143051624298096s
Received healthy response to inference request in 2.0263476371765137s
10 requests
1 failed requests
5th percentile: 1.7184213876724244
10th percentile: 1.7225376129150392
20th percentile: 1.965768575668335
30th percentile: 2.223432421684265
40th percentile: 2.312425899505615
50th percentile: 2.315783977508545
60th percentile: 2.482248592376709
70th percentile: 2.81479172706604
80th percentile: 3.106049919128418
90th percentile: 5.157872653007502
95th percentile: 12.64932738542555
99th percentile: 18.642491171360017
mean time: 4.177818059921265
%s, retrying in %s seconds...
Received healthy response to inference request in 2.4218051433563232s
Received healthy response to inference request in 1.7747077941894531s
Received healthy response to inference request in 2.3923940658569336s
Received healthy response to inference request in 1.9313814640045166s
Received healthy response to inference request in 2.550084114074707s
Received healthy response to inference request in 2.627389907836914s
Received healthy response to inference request in 2.6128737926483154s
Received healthy response to inference request in 3.2377891540527344s
Received healthy response to inference request in 2.268812656402588s
Received healthy response to inference request in 2.893137216567993s
10 requests
0 failed requests
5th percentile: 1.8452109456062318
10th percentile: 1.9157140970230102
20th percentile: 2.2013264179229735
30th percentile: 2.35531964302063
40th percentile: 2.4100407123565675
50th percentile: 2.485944628715515
60th percentile: 2.5751999855041503
70th percentile: 2.617228627204895
80th percentile: 2.68053936958313
90th percentile: 2.9276024103164673
95th percentile: 3.0826957821846004
99th percentile: 3.2067704796791077
mean time: 2.4710375308990478
Pipeline stage StressChecker completed in 70.01s
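Note on the StressChecker statistics above: the logged percentiles appear consistent with linear interpolation between order statistics. The sketch below recomputes a few of them from the ten healthy latencies of the second run. The helper name `percentile` and the interpolation rule are assumptions; the log does not state how StressChecker computes these values.

```python
# Minimal sketch, assuming percentiles are linearly interpolated between
# sorted samples. Not taken from the StressChecker source.

def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Latencies in seconds, copied from the second StressChecker run in the log.
SECOND_RUN = [
    2.4218051433563232, 1.7747077941894531, 2.3923940658569336,
    1.9313814640045166, 2.550084114074707, 2.627389907836914,
    2.6128737926483154, 3.2377891540527344, 2.268812656402588,
    2.893137216567993,
]

if __name__ == "__main__":
    for q in (5, 50, 90, 99):
        print(f"{q}th percentile: {percentile(SECOND_RUN, q)}")
    print(f"mean time: {sum(SECOND_RUN) / len(SECOND_RUN)}")
```

Under this assumption the recomputed 5th, 50th and 90th percentiles and the mean agree with the logged values for the second run up to floating-point rounding. In the first run, the logged mean of about 4.18s over 10 requests implies the timed-out request was counted at roughly 20s, which is what pulls the 90th to 99th percentiles upward there.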
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
function_berer_2025-12-15 status is now deployed due to DeploymentManager action
function_berer_2025-12-15 status is now inactive due to auto deactivation removed underperforming models
function_berer_2025-12-15 status is now torndown due to DeploymentManager action