function_lufof_2025-12-18

developer_uid: chai_evaluation_service

submission_id: function_lufof_2025-12-18

model_name: richard

model_group:

status: torndown

timestamp: 2025-12-21T07:51:13+00:00

num_battles: 7979

num_wins: 4063

celo_rating: 1299.74

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: richard

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-20

win_ratio: 0.509211680661737
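
The win_ratio value above is consistent with num_wins divided by num_battles. A minimal check, assuming that is how the field is derived (the helper name `win_ratio` is hypothetical and not part of the evaluation service):

```python
def win_ratio(num_wins: int, num_battles: int) -> float:
    """Return the fraction of battles won (assumed definition of win_ratio)."""
    if num_battles <= 0:
        raise ValueError("num_battles must be positive")
    return num_wins / num_battles


# Values copied from the num_wins and num_battles fields of this record.
ratio = win_ratio(4063, 7979)
print(round(ratio, 6))  # prints 0.509212
```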

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
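
The formatter templates above can be read as the pieces of a single prompt string. The sketch below shows one plausible way they fit together; the assembly order (memory, prompt, chat turns, response header), the function name `build_prompt`, and the sample persona and messages are all assumptions made for illustration, not the service's confirmed behaviour.

```python
# Templates copied from the formatter field of this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, turns, bot_name):
    """Assemble a prompt from the templates (hypothetical assembly order).

    `turns` is a list of (speaker_name, is_bot, message) tuples.
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, is_bot, message in turns:
        if is_bot:
            parts.append(formatter["bot_template"].format(bot_name=speaker, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=speaker, message=message))
    # The response header ends with "<bot_name>:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample inputs are invented for this example.
text = build_prompt(
    FORMATTER,
    memory="You are richard.",
    prompt="A casual chat.",
    turns=[("Alice", False, "Hi"), ("richard", True, "Hello")],
    bot_name="richard",
)
print(text)
```

Because generation_params sets stopping_words to ['\n'], a prompt ending in "richard:" would yield at most one line of reply under this reading.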

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.726670742034912s
Received healthy response to inference request in 3.413729429244995s
Received healthy response to inference request in 2.2627439498901367s
Received healthy response to inference request in 2.1822361946105957s
Received healthy response to inference request in 2.2317285537719727s
Received healthy response to inference request in 2.0191709995269775s
Received healthy response to inference request in 2.4186997413635254s
Received healthy response to inference request in 2.3776087760925293s
Received healthy response to inference request in 3.475102663040161s
Received healthy response to inference request in 2.980081081390381s
10 requests
0 failed requests
5th percentile: 1.8582958579063416
10th percentile: 1.989920973777771
20th percentile: 2.149623155593872
30th percentile: 2.2168808460235594
40th percentile: 2.2503377914428713
50th percentile: 2.320176362991333
60th percentile: 2.3940451622009276
70th percentile: 2.587114143371582
80th percentile: 3.0668107509613036
90th percentile: 3.419866752624512
95th percentile: 3.4474847078323365
99th percentile: 3.4695790719985964
mean time: 2.508777213096619
Pipeline stage StressChecker completed in 26.31s
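
The percentile and mean figures reported by the StressChecker stage can be reproduced from the ten logged response times using linear interpolation between sorted samples. This is a minimal sketch under that assumption; the `percentile` helper is hypothetical and is not taken from the pipeline's code.

```python
import math

# Response times in seconds, copied from the "Received healthy response" log lines.
LATENCIES = [
    1.726670742034912, 3.413729429244995, 2.2627439498901367,
    2.1822361946105957, 2.2317285537719727, 2.0191709995269775,
    2.4186997413635254, 2.3776087760925293, 3.475102663040161,
    2.980081081390381,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("values must not be empty")
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(LATENCIES) / len(LATENCIES)
p50 = percentile(LATENCIES, 50)
p95 = percentile(LATENCIES, 95)
print(f"50th percentile: {p50:.4f}")  # prints 50th percentile: 2.3202
print(f"mean time: {mean_time:.4f}")  # prints mean time: 2.5088
```

With only 10 requests, the upper percentiles are interpolated almost entirely between the two slowest samples, so they are best read as rough indicators.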
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.58s
Shutdown handler de-registered
function_lufof_2025-12-18 status is now deployed due to DeploymentManager action
function_lufof_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_lufof_2025-12-18 status is now torndown due to DeploymentManager action