function_lihub_2025-12-14

developer_uid: chai_evaluation_service

submission_id: function_lihub_2025-12-14

model_name: richard

model_group:

status: torndown

timestamp: 2025-12-17T17:41:31+00:00

num_battles: 8930

num_wins: 4572

celo_rating: 1256.41

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: richard

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-14

win_ratio: 0.5119820828667413
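
The win_ratio field appears to be num_wins divided by num_battles. A minimal check of that arithmetic, using only the two counts recorded above (the variable names simply mirror the field names):

```python
# Values copied from the num_battles and num_wins fields of this record.
num_battles = 8930
num_wins = 4572

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles
print(f"win_ratio: {win_ratio}")
```

This reproduces the recorded value to within floating-point precision.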

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
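
As an illustration of how the formatter templates above might be combined into a single model input, here is a minimal sketch. The assembly order (memory, then prompt, then chat turns, then the response header), the `build_prompt` helper, and the sample conversation are all assumptions for illustration. They are not taken from the evaluation service's code, and truncation to max_input_tokens is not modelled.

```python
# Templates copied from the formatter field of this record.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(fmt, memory, prompt, turns, bot_name):
    """Hypothetical helper: concatenate the rendered templates in order.

    `turns` is a list of (speaker_name, message, is_bot) tuples.
    """
    parts = [
        fmt["memory_template"].format(memory=memory),
        fmt["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message, is_bot in turns:
        if is_bot:
            parts.append(fmt["bot_template"].format(bot_name=speaker, message=message))
        else:
            parts.append(fmt["user_template"].format(user_name=speaker, message=message))
    # The response header ends with "<bot_name>:" so the model continues the bot's line.
    parts.append(fmt["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up sample conversation, for illustration only.
example = build_prompt(
    formatter,
    memory="Richard is a calm butler.",
    prompt="A chat in the hall.",
    turns=[("Sam", "Hello", False), ("Richard", "Good evening.", True)],
    bot_name="Richard",
)
print(example)
```

Under this reading, the '\n' entry in stopping_words would end generation at the first newline, so each reply stays on a single line.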

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.554112195968628s
Received healthy response to inference request in 2.962674856185913s
Received healthy response to inference request in 3.064307928085327s
Received healthy response to inference request in 2.0415611267089844s
Received healthy response to inference request in 2.6543116569519043s
Received healthy response to inference request in 2.0606305599212646s
Received healthy response to inference request in 2.3733668327331543s
Received healthy response to inference request in 2.487276077270508s
Received healthy response to inference request in 2.3689913749694824s
Received healthy response to inference request in 1.9120056629180908s
10 requests
0 failed requests
5th percentile: 1.970305621623993
10th percentile: 2.028605580329895
20th percentile: 2.0568166732788087
30th percentile: 2.276483130455017
40th percentile: 2.3716166496276854
50th percentile: 2.430321455001831
60th percentile: 2.514010524749756
70th percentile: 2.5841720342636108
80th percentile: 2.7159842967987062
90th percentile: 2.9728381633758545
95th percentile: 3.018573045730591
99th percentile: 3.05516095161438
mean time: 2.4479238271713255
Pipeline stage StressChecker completed in 25.81s
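
The StressChecker summary above is consistent with percentiles computed by linear interpolation between the sorted latencies. A minimal sketch that recomputes the figures from the ten logged response times; the `percentile` helper is hypothetical, and the interpolation method is inferred from the numbers rather than confirmed by the pipeline's source:

```python
import math
from statistics import fmean

# Response times (seconds) copied from the ten "healthy response" log lines.
latencies = [
    2.554112195968628, 2.962674856185913, 3.064307928085327,
    2.0415611267089844, 2.6543116569519043, 2.0606305599212646,
    2.3733668327331543, 2.487276077270508, 2.3689913749694824,
    1.9120056629180908,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) by linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted sample.
    pos = (len(ordered) - 1) * q / 100.0
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {fmean(latencies)}")
```

With only ten samples, the 95th and 99th percentiles are interpolated between the two slowest requests, so they say little about tail latency.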
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.56s
Shutdown handler de-registered
function_lihub_2025-12-14 status is now deployed due to DeploymentManager action
function_lihub_2025-12-14 status is now inactive due to auto deactivation removed underperforming models
function_lihub_2025-12-14 status is now torndown due to DeploymentManager action