developer_uid: chai_evaluation_service
submission_id: function_nuhet_2025-12-14
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-17T23:41:18+00:00
num_battles: 8478
num_wins: 4308
celo_rating: 1256.42
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-14
win_ratio: 0.508138711960368
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
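The formatter templates above can be illustrated with a minimal sketch of how a prompt might be assembled. The assembly order (memory, prompt, chat turns, response header) and the `build_prompt` helper are assumptions made for illustration; the record only lists the templates and does not show how the service combines them.

```python
# Templates copied from the formatter entry in this record.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, bot_name, user_name, turns):
    """Hypothetical helper: join the templates in an assumed order.

    turns is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues
    # the bot's line; the '\n' stopping word then ends generation after one line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    example = build_prompt(
        formatter,
        memory="Richard is a friendly assistant.",
        prompt="A casual chat.",
        bot_name="richard",
        user_name="User",
        turns=[("user", "hi"), ("bot", "hello")],
    )
    print(example)
```

Since `truncate_by_message` is False and `max_input_tokens` is 1024, the real service presumably truncates the assembled text at the token level; that step is not sketched here.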
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 6.8829874992370605s
Received healthy response to inference request in 6.517997980117798s
Received healthy response to inference request in 4.77037787437439s
Received healthy response to inference request in 4.015594244003296s
Received healthy response to inference request in 4.794943332672119s
Received healthy response to inference request in 4.401789903640747s
Received healthy response to inference request in 3.0726354122161865s
Received healthy response to inference request in 4.334424734115601s
Received healthy response to inference request in 2.817626476287842s
Received healthy response to inference request in 4.128992557525635s
10 requests
0 failed requests
5th percentile: 2.932380497455597
10th percentile: 3.047134518623352
20th percentile: 3.827002477645874
30th percentile: 4.094973063468933
40th percentile: 4.252251863479614
50th percentile: 4.368107318878174
60th percentile: 4.549225091934204
70th percentile: 4.7777475118637085
80th percentile: 5.139554262161255
90th percentile: 6.554496932029724
95th percentile: 6.718742215633392
99th percentile: 6.850138442516327
mean time: 4.573737001419067
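The percentile and mean figures above are consistent with linear interpolation over the ten sorted response times. A minimal sketch that recomputes them follows; the `percentile` helper is hypothetical and written for illustration, since the log does not show how StressChecker actually computes these values.

```python
from statistics import fmean

# Response times in seconds, copied from the ten "healthy response" lines above.
latencies = [
    6.8829874992370605,
    6.517997980117798,
    4.77037787437439,
    4.015594244003296,
    4.794943332672119,
    4.401789903640747,
    3.0726354122161865,
    4.334424734115601,
    2.817626476287842,
    4.128992557525635,
]


def percentile(values, q):
    """Return the q-th percentile (0 to 100) using linear interpolation
    between the two nearest ranks of the sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {fmean(latencies)}")
```

With only ten samples, the upper percentiles are interpolated between the two slowest requests, so the 95th and 99th percentile values say little about tail latency under real load.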
Pipeline stage StressChecker completed in 47.14s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
function_nuhet_2025-12-14 status is now deployed due to DeploymentManager action
function_nuhet_2025-12-14 status is now inactive due to auto deactivation removed underperforming models
function_nuhet_2025-12-14 status is now torndown due to DeploymentManager action