function_takim_2025-12-17

developer_uid: chai_evaluation_service

submission_id: function_takim_2025-12-17

model_name: richard

model_group:

status: torndown

timestamp: 2025-12-20T15:21:30+00:00

num_battles: 7193

num_wins: 3630

celo_rating: 1296.68

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: richard

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-20

win_ratio: 0.5046573057138884

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
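
A minimal sketch of how the formatter templates above could be combined into one prompt string. The concatenation order (memory, prompt, chat turns, response template) and the `build_prompt` helper are assumptions for illustration; the record only lists the templates themselves. Truncation to `max_input_tokens` is not modelled here.

```python
# Templates copied from the formatter entry in this record.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, turns, bot_name):
    """Hypothetical helper: render the templates in an assumed order.

    `turns` is a list of (role, speaker_name, message) tuples,
    where role is either "user" or "bot".
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for role, name, message in turns:
        if role == "bot":
            parts.append(formatter["bot_template"].format(bot_name=name, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=name, message=message))
    # The response template ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    FORMATTER,
    memory="M",
    prompt="P",
    turns=[("user", "Alice", "hi"), ("bot", "richard", "hello")],
    bot_name="richard",
)
print(example)
```

Because `stopping_words` is `['\n']` in the generation parameters, a prompt ending in `richard:` would presumably yield a single-line reply.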

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.579820156097412s
Received healthy response to inference request in 1.8304975032806396s
Received healthy response to inference request in 4.608245849609375s
Received healthy response to inference request in 2.49963116645813s
Received healthy response to inference request in 2.8433609008789062s
Received healthy response to inference request in 5.459237098693848s
Received healthy response to inference request in 4.199340343475342s
Received healthy response to inference request in 2.703737735748291s
Received healthy response to inference request in 3.06623911857605s
Received healthy response to inference request in 2.88962721824646s
10 requests
0 failed requests
5th percentile: 2.1316076517105103
10th percentile: 2.432717800140381
20th percentile: 2.5637823581695556
30th percentile: 2.6665624618530273
40th percentile: 2.78751163482666
50th percentile: 2.866494059562683
60th percentile: 2.960271978378296
70th percentile: 3.406169486045837
80th percentile: 4.281121444702149
90th percentile: 4.693344974517822
95th percentile: 5.076291036605834
99th percentile: 5.382647886276246
mean time: 3.267973709106445
Pipeline stage StressChecker completed in 34.01s
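
The percentile and mean figures reported by StressChecker are consistent with linear interpolation between sorted samples. The sketch below reproduces them from the ten logged latencies; the `percentile` helper is written for illustration and is not taken from the pipeline's own code.

```python
import math

# Latencies (seconds) copied from the ten "Received healthy response" log lines.
latencies = [
    2.579820156097412, 1.8304975032806396, 4.608245849609375,
    2.49963116645813, 2.8433609008789062, 5.459237098693848,
    4.199340343475342, 2.703737735748291, 3.06623911857605,
    2.88962721824646,
]


def percentile(values, p):
    """Return the p-th percentile using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


summary = {p: percentile(latencies, p) for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)}
mean_time = sum(latencies) / len(latencies)

for p, value in summary.items():
    print(f"{p}th percentile: {value}")
print(f"mean time: {mean_time}")
```

With only 10 samples, the upper percentiles are driven almost entirely by the two slowest requests, so they are best read as rough indicators.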
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
function_takim_2025-12-17 status is now deployed due to DeploymentManager action
function_takim_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_takim_2025-12-17 status is now torndown due to DeploymentManager action