developer_uid: chai_evaluation_service
submission_id: function_bibuf_2025-12-17
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-20T18:21:26+00:00
num_battles: 10226
num_wins: 5150
celo_rating: 1295.64
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.5036182280461569
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
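A minimal sketch of how the `formatter` templates above might be combined into one prompt string. The assembly order (memory, prompt, chat turns, then the response stub) and the helper name `build_prompt` are assumptions for illustration and are not confirmed by this log; only the template strings are taken from the `formatter` entry.

```python
# Template strings copied from the `formatter` entry in the metadata.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(fmt, memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        fmt["memory_template"].format(memory=memory),
        fmt["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(fmt["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(fmt["user_template"].format(user_name=user_name, message=message))
    # The response stub ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(fmt["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    formatter,
    memory="M",
    prompt="P",
    turns=[("user", "hi")],
    bot_name="richard",
    user_name="User",
)
```

With `stopping_words: ['\n']` in `generation_params`, generation would stop at the first newline, which fits a single-line reply continuing the final `richard:` stub.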
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.936032772064209s
Received healthy response to inference request in 1.738581895828247s
Received healthy response to inference request in 1.6635797023773193s
Received healthy response to inference request in 2.52013897895813s
Received healthy response to inference request in 1.9177603721618652s
Received healthy response to inference request in 3.0836541652679443s
Received healthy response to inference request in 1.9518795013427734s
Received healthy response to inference request in 2.2016468048095703s
Received healthy response to inference request in 2.4973385334014893s
Received healthy response to inference request in 3.420072317123413s
10 requests
0 failed requests
5th percentile: 1.697330689430237
10th percentile: 1.7310816764831543
20th percentile: 1.8819246768951416
30th percentile: 1.9305510520935059
40th percentile: 1.9455408096313476
50th percentile: 2.076763153076172
60th percentile: 2.3199234962463375
70th percentile: 2.5041786670684814
80th percentile: 2.632842016220093
90th percentile: 3.117295980453491
95th percentile: 3.2686841487884517
99th percentile: 3.389794683456421
mean time: 2.293068504333496
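The percentile and mean figures above can be reproduced from the ten logged response times with linear interpolation between order statistics. The sketch below is only a reconstruction: the log does not say how StressChecker computes these values, and the function name `percentile` is hypothetical.

```python
import math

# Response times (seconds) copied from the ten "healthy response" log lines.
latencies = [
    1.936032772064209, 1.738581895828247, 1.6635797023773193,
    2.52013897895813, 1.9177603721618652, 3.0836541652679443,
    1.9518795013427734, 2.2016468048095703, 2.4973385334014893,
    3.420072317123413,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) by linear interpolation
    between the two nearest order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only 10 samples, the upper percentiles are dominated by the two slowest requests (about 3.08 s and 3.42 s), so they give a rough indication rather than a stable tail-latency estimate.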
Pipeline stage StressChecker completed in 24.15s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.57s
Shutdown handler de-registered
function_bibuf_2025-12-17 status is now deployed due to DeploymentManager action
function_bibuf_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_bibuf_2025-12-17 status is now torndown due to DeploymentManager action