developer_uid: chai_evaluation_service
submission_id: function_masef_2025-12-17
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-20T09:21:16+00:00
num_battles: 8235
num_wins: 4094
celo_rating: 1291.25
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-20
win_ratio: 0.49714632665452335
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
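
The `formatter` templates above can be read as pieces of one model input. The following is a minimal sketch of how they might be joined. The ordering (memory, prompt, chat turns, response header), the `build_prompt` helper, and all sample values are assumptions for illustration, not the service's confirmed behaviour.

```python
# Templates copied verbatim from the `formatter` field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order."""
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample values below are invented for illustration only.
example = build_prompt(
    memory="Richard is a polite butler.",
    prompt="Richard greets a guest.",
    turns=[("user", "Hello!"), ("bot", "Good evening.")],
    bot_name="Richard",
    user_name="Guest",
)
print(example)
```

Because `stopping_words` is `['\n']` in `generation_params`, a completion for a prompt built this way would stop at the end of a single chat line.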
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.8713161945343018s
Received healthy response to inference request in 2.6404292583465576s
Received healthy response to inference request in 3.276937484741211s
Received healthy response to inference request in 2.5578770637512207s
Received healthy response to inference request in 3.19124436378479s
Received healthy response to inference request in 3.807145118713379s
Received healthy response to inference request in 2.2759838104248047s
Received healthy response to inference request in 2.107570171356201s
Received healthy response to inference request in 4.003811836242676s
Received healthy response to inference request in 1.9838850498199463s
10 requests
0 failed requests
5th percentile: 2.039543354511261
10th percentile: 2.0952016592025755
20th percentile: 2.242301082611084
30th percentile: 2.4733090877532957
40th percentile: 2.607408380508423
50th percentile: 2.7558727264404297
60th percentile: 2.9992874622344967
70th percentile: 3.2169523000717164
80th percentile: 3.3829790115356446
90th percentile: 3.8268117904663086
95th percentile: 3.915311813354492
99th percentile: 3.9861118316650392
mean time: 2.871620035171509
Pipeline stage StressChecker completed in 29.99s
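
The StressChecker percentiles and mean can be re-derived from the ten response times logged above. The sketch below assumes linear interpolation between order statistics, which matches the reported figures; the `percentile` helper is written for illustration and is not taken from the pipeline's code.

```python
import math

# Response times (seconds) copied from the ten "healthy response" log lines.
latencies = [
    2.8713161945343018, 2.6404292583465576, 3.276937484741211,
    2.5578770637512207, 3.19124436378479, 3.807145118713379,
    2.2759838104248047, 2.107570171356201, 4.003811836242676,
    1.9838850498199463,
]


def percentile(values, q):
    """Return the q-th percentile using linear interpolation (assumed method)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

The printed values should agree with the logged percentiles up to floating-point rounding.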
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_masef_2025-12-17 status is now deployed due to DeploymentManager action
function_masef_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_masef_2025-12-17 status is now torndown due to DeploymentManager action
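
The three status lines above describe the submission's lifecycle. A minimal sketch for extracting the transitions from such lines follows. The regular expression and the `parse_transitions` helper are assumptions based only on the line shape seen in this log.

```python
import re

# Status lines copied verbatim from the log above.
log_lines = [
    "function_masef_2025-12-17 status is now deployed due to DeploymentManager action",
    "function_masef_2025-12-17 status is now inactive due to auto deactivation removed underperforming models",
    "function_masef_2025-12-17 status is now torndown due to DeploymentManager action",
]

# Assumed line shape: "<submission_id> status is now <status> due to <reason>".
STATUS_RE = re.compile(
    r"^(?P<submission>\S+) status is now (?P<status>\S+) due to (?P<reason>.+)$"
)


def parse_transitions(lines):
    """Hypothetical helper: return (submission, status, reason) for each matching line."""
    transitions = []
    for line in lines:
        match = STATUS_RE.match(line)
        if match:
            transitions.append(
                (match["submission"], match["status"], match["reason"])
            )
    return transitions


transitions = parse_transitions(log_lines)
final_status = transitions[-1][1]
for submission, status, reason in transitions:
    print(f"{submission}: {status} ({reason})")
```

The last parsed status agrees with the `status: torndown` field at the top of the record.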