function_fuhim_2025-12-16

developer_uid: chai_evaluation_service

submission_id: function_fuhim_2025-12-16

model_name: richard

model_group:

status: torndown

timestamp: 2025-12-19T12:21:20+00:00

num_battles: 9914

num_wins: 4960

celo_rating: 1293.33

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: richard

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-19

win_ratio: 0.5003026023804721
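
The win_ratio field above is consistent with num_wins divided by num_battles. A minimal sketch of that arithmetic follows. The function name `win_ratio` is hypothetical, and the evaluation service's actual computation is not shown in this record.

```python
def win_ratio(num_wins: int, num_battles: int) -> float:
    """Fraction of battles won; returns 0.0 when no battles were played."""
    if num_battles == 0:
        return 0.0
    return num_wins / num_battles


# Values copied from the num_wins and num_battles fields of this record.
ratio = win_ratio(num_wins=4960, num_battles=9914)
print(ratio)
```

The result agrees with the reported value to floating-point precision, which suggests the field is a plain ratio with no smoothing applied.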

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
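
To illustrate how the formatter templates might combine into a single prompt, here is a rough sketch. The assembly order (memory, prompt, chat turns, response header), the `build_prompt` helper, and the sample names and messages are all assumptions made for illustration. Token truncation (max_input_tokens, truncate_by_message) is not modeled.

```python
# Templates copied from the formatter field of this record.
formatter = {
    'memory_template': '### Instruction:\n{memory}\n',
    'prompt_template': '### Input:\n{prompt}\n',
    'bot_template': '{bot_name}: {message}\n',
    'user_template': '{user_name}: {message}\n',
    'response_template': '### Response:\n{bot_name}:',
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Assemble a prompt string; the ordering here is an assumption.

    `turns` is a list of (is_bot, message) pairs, oldest first.
    """
    parts = [
        formatter['memory_template'].format(memory=memory),
        formatter['prompt_template'].format(prompt=prompt),
    ]
    for is_bot, message in turns:
        if is_bot:
            parts.append(formatter['bot_template'].format(
                bot_name=bot_name, message=message))
        else:
            parts.append(formatter['user_template'].format(
                user_name=user_name, message=message))
    # The response header leaves the bot's line open for the model to complete.
    parts.append(formatter['response_template'].format(bot_name=bot_name))
    return ''.join(parts)


# Hypothetical sample conversation.
text = build_prompt(
    memory='Richard is a friendly guide.',
    prompt='A chat in a museum lobby.',
    turns=[(False, 'Hi there!'), (True, 'Welcome in.'), (False, 'Where do I start?')],
    bot_name='Richard',
    user_name='User',
)
print(text)
```

Because each turn template ends in a newline, the '\n' entry in stopping_words would plausibly end generation after a single reply line.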

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.497801303863525s
Received healthy response to inference request in 2.2094974517822266s
Received healthy response to inference request in 2.1730220317840576s
Received healthy response to inference request in 3.614382028579712s
Received healthy response to inference request in 3.7685387134552s
Received healthy response to inference request in 10.814000368118286s
Received healthy response to inference request in 4.550790548324585s
Received healthy response to inference request in 3.9771881103515625s
Received healthy response to inference request in 4.6641340255737305s
Received healthy response to inference request in 6.773998022079468s
10 requests
0 failed requests
5th percentile: 2.189435970783234
10th percentile: 2.2058499097824096
20th percentile: 3.333405113220215
30th percentile: 3.7222917079925537
40th percentile: 3.8937283515930177
50th percentile: 4.237494707107544
60th percentile: 4.518997001647949
70th percentile: 4.584793591499329
80th percentile: 5.086106824874879
90th percentile: 7.177998256683348
95th percentile: 8.995999312400814
99th percentile: 10.450400156974792
mean time: 4.704335260391235
Pipeline stage StressChecker completed in 48.84s
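
The StressChecker percentiles and mean above can be reproduced from the ten logged response times with linear interpolation between sorted samples. The sketch below assumes that interpolation method because it matches the logged numbers. The `percentile` helper is hypothetical and is not taken from the pipeline's code.

```python
import math

# Response times in seconds, copied from the StressChecker log lines.
latencies = [
    4.497801303863525, 2.2094974517822266, 2.1730220317840576,
    3.614382028579712, 3.7685387134552, 10.814000368118286,
    4.550790548324585, 3.9771881103515625, 4.6641340255737305,
    6.773998022079468,
]


def percentile(values, q):
    """q-th percentile (0 to 100) using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are dominated by the single 10.81 s response, so the 95th and 99th values are interpolations toward that one outlier rather than stable tail estimates.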
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.37s
Shutdown handler de-registered
function_fuhim_2025-12-16 status is now deployed due to DeploymentManager action
function_fuhim_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_fuhim_2025-12-16 status is now torndown due to DeploymentManager action