developer_uid: chai_evaluation_service
submission_id: function_bemur_2025-12-14
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-17T13:31:15+00:00
num_battles: 8993
num_wins: 4508
celo_rating: 1256.39
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-14
win_ratio: 0.5012787723785166
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
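Note: the sketch below illustrates how the formatter templates above might be combined into a single prompt string. The assembly order (memory, prompt, chat turns, response header), the `build_prompt` helper, and the sample names and messages are assumptions made for illustration; the log does not show how the service actually builds its input.

```python
# Templates copied from the `formatter` entry above.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample values are invented for illustration only.
example = build_prompt(
    formatter,
    bot_name="Richard",
    user_name="User",
    memory="Richard is a calm librarian.",
    prompt="A quiet afternoon.",
    turns=[("user", "Hi"), ("bot", "Hello.")],
)
print(example)
```

With `stopping_words` set to `['\n']` in the generation parameters, a completion from a prompt ending in `Richard:` would be cut at the first newline, which is consistent with one chat line per response.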
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9330968856811523s
Received healthy response to inference request in 2.794191598892212s
Received healthy response to inference request in 3.0124216079711914s
Received healthy response to inference request in 4.420974969863892s
Received healthy response to inference request in 2.6820971965789795s
Received healthy response to inference request in 2.4155094623565674s
Received healthy response to inference request in 2.874596357345581s
Received healthy response to inference request in 3.0494167804718018s
Received healthy response to inference request in 2.1811530590057373s
Received healthy response to inference request in 2.5746381282806396s
10 requests
0 failed requests
5th percentile: 2.0447221636772155
10th percentile: 2.1563474416732786
20th percentile: 2.3686381816864013
30th percentile: 2.526899528503418
40th percentile: 2.6391135692596435
50th percentile: 2.7381443977355957
60th percentile: 2.8263535022735597
70th percentile: 2.9159439325332643
80th percentile: 3.0198206424713137
90th percentile: 3.18657259941101
95th percentile: 3.8037737846374498
99th percentile: 4.297534732818604
mean time: 2.7938096046447756
Pipeline stage StressChecker completed in 29.50s
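Note: the percentile and mean figures reported by StressChecker are consistent with linear interpolation between closest ranks over the ten latencies logged above. The sketch below reproduces them using only the standard library; the `percentile` helper is written for illustration and is not taken from the pipeline's code.

```python
import math

# Latencies in seconds, copied from the "Received healthy response" lines.
latencies = [
    1.9330968856811523, 2.794191598892212, 3.0124216079711914,
    4.420974969863892, 2.6820971965789795, 2.4155094623565674,
    2.874596357345581, 3.0494167804718018, 2.1811530590057373,
    2.5746381282806396,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100), interpolating linearly between ranks."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only 10 samples, the upper percentiles are dominated by the single 4.42 s response, so the 95th and 99th values say little about tail latency under sustained load.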
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.55s
Shutdown handler de-registered
function_bemur_2025-12-14 status is now deployed due to DeploymentManager action
function_bemur_2025-12-14 status is now inactive due to auto deactivation removed underperforming models
function_bemur_2025-12-14 status is now torndown due to DeploymentManager action