developer_uid: chai_evaluation_service
submission_id: function_hural_2025-12-18
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-21T13:31:06+00:00
num_battles: 9714
num_wins: 4827
celo_rating: 1291.12
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-21
win_ratio: 0.496911673872761
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
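The `win_ratio` field above looks like `num_wins` divided by `num_battles`. A minimal sketch that checks this against the logged values, assuming that is how the service derives it (the log does not say so; the variable names below are illustrative):

```python
# Values copied from the metadata fields above.
num_battles = 9714
num_wins = 4827
logged_win_ratio = 0.496911673872761

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles

# Difference from the logged value; near zero if the assumption holds.
ratio_error = abs(win_ratio - logged_win_ratio)
print(f"win_ratio={win_ratio:.15f} error={ratio_error:.2e}")
```

A ratio just under 0.5 means the model lost slightly more battles than it won.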
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.393216848373413s
Received healthy response to inference request in 3.1034297943115234s
Received healthy response to inference request in 1.7008459568023682s
Received healthy response to inference request in 2.394977331161499s
Received healthy response to inference request in 2.712397575378418s
Received healthy response to inference request in 3.370946168899536s
Received healthy response to inference request in 2.769449472427368s
Received healthy response to inference request in 2.730470895767212s
Received healthy response to inference request in 2.7274911403656006s
Received healthy response to inference request in 3.0078725814819336s
10 requests
0 failed requests
5th percentile: 2.0124128580093386
10th percentile: 2.3239797592163085
20th percentile: 2.3946252346038817
30th percentile: 2.6171715021133424
40th percentile: 2.7214537143707274
50th percentile: 2.7289810180664062
60th percentile: 2.7460623264312742
70th percentile: 2.8409764051437376
80th percentile: 3.0269840240478514
90th percentile: 3.1301814317703247
95th percentile: 3.25056380033493
99th percentile: 3.346869695186615
mean time: 2.691109776496887
Pipeline stage StressChecker completed in 29.40s
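The percentile and mean figures reported by StressChecker can be reproduced from the ten latencies logged above. A minimal sketch, assuming linear interpolation between order statistics: the log does not state the method, but this assumption matches the logged values. The `percentile` helper is hypothetical and is not the service's own code.

```python
import math

# Latencies in seconds, copied from the ten "healthy response" lines above.
latencies = [
    2.393216848373413, 3.1034297943115234, 1.7008459568023682,
    2.394977331161499, 2.712397575378418, 3.370946168899536,
    2.769449472427368, 2.730470895767212, 2.7274911403656006,
    3.0078725814819336,
]

def percentile(values, q):
    """Return the q-th percentile (q in [0, 100]) using linear interpolation."""
    ordered = sorted(values)
    # Fractional rank into the sorted list: 0 for q=0, len-1 for q=100.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac

mean_time = sum(latencies) / len(latencies)
p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)
p99 = percentile(latencies, 99)

print(f"mean={mean_time:.6f} p5={p5:.6f} p50={p50:.6f} p90={p90:.6f} p99={p99:.6f}")
```

With only ten samples, the upper percentiles are driven almost entirely by the two slowest requests, so the 95th and 99th percentile values are rough indications rather than stable estimates.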
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.30s
Shutdown handler de-registered
function_hural_2025-12-18 status is now deployed due to DeploymentManager action
function_hural_2025-12-18 status is now inactive due to auto deactivation: removed underperforming models
function_hural_2025-12-18 status is now torndown due to DeploymentManager action