developer_uid: chai_evaluation_service
submission_id: function_hulun_2025-12-16
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-19T20:21:21+00:00
num_battles: 11197
num_wins: 5516
celo_rating: 1288.13
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-19
win_ratio: 0.4926319549879432
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
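As a rough illustration of how the formatter templates above might combine into a single model input, here is a minimal sketch. The assembly order (memory, prompt, chat turns, response stub), the `build_prompt` helper, and the sample persona and messages are all assumptions made for illustration; they are not taken from the evaluation service. Truncation (`truncate_by_message`, `max_input_tokens`) is not modelled.

```python
# Templates copied from the formatter entry above.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, turns, bot_name):
    """Hypothetical helper: concatenate the templates in an assumed order.

    `turns` is a list of (role, speaker_name, message) tuples,
    where role is either "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for role, name, message in turns:
        if role == "bot":
            parts.append(formatter["bot_template"].format(bot_name=name, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=name, message=message))
    # The response stub is left open so the model continues as the bot.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up sample conversation, for illustration only.
example = build_prompt(
    FORMATTER,
    memory="Richard is a calm librarian.",
    prompt="A quiet afternoon.",
    turns=[("user", "Ana", "Hi"), ("bot", "Richard", "Hello")],
    bot_name="Richard",
)
print(example)
```

Because `stopping_words` contains `'\n'`, generation under these parameters would stop at the first newline, so each completion would be a single line following the open `{bot_name}:` stub.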
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.3717992305755615s
Received healthy response to inference request in 4.7461113929748535s
Received healthy response to inference request in 3.543226957321167s
Received healthy response to inference request in 5.652591705322266s
Received healthy response to inference request in 3.702831983566284s
Received healthy response to inference request in 3.5766448974609375s
Received healthy response to inference request in 2.9633030891418457s
Received healthy response to inference request in 4.38412618637085s
Received healthy response to inference request in 2.9857358932495117s
Received healthy response to inference request in 3.8016464710235596s
10 requests
0 failed requests
5th percentile: 2.9733978509902954
10th percentile: 2.983492612838745
20th percentile: 3.2945865631103515
30th percentile: 3.4917986392974854
40th percentile: 3.5632777214050293
50th percentile: 3.639738440513611
60th percentile: 3.742357778549194
70th percentile: 3.9763903856277465
80th percentile: 4.45652322769165
90th percentile: 4.836759424209594
95th percentile: 5.244675564765929
99th percentile: 5.571008477210999
mean time: 3.8728017807006836
Pipeline stage StressChecker completed in 40.63s
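The percentile and mean figures reported by StressChecker are consistent with linear interpolation between sorted samples. The sketch below recomputes them from the ten latencies logged above; the `percentile` helper is a hypothetical stand-in, not the checker's actual code, and the interpolation method is an inference from the numbers rather than something the log states.

```python
import math

# The ten response times (seconds) logged by the StressChecker stage.
latencies = [
    3.3717992305755615, 4.7461113929748535, 3.543226957321167,
    5.652591705322266, 3.702831983566284, 3.5766448974609375,
    2.9633030891418457, 4.38412618637085, 2.9857358932495117,
    3.8016464710235596,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile with linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are dominated by the single slowest request (about 5.65 s), so the 95th and 99th percentile values are best read as interpolations rather than stable tail estimates.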
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.92s
Shutdown handler de-registered
function_hulun_2025-12-16 status is now deployed due to DeploymentManager action
function_hulun_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_hulun_2025-12-16 status is now torndown due to DeploymentManager action