function_tekum_2025-12-18

developer_uid: chai_evaluation_service

submission_id: function_tekum_2025-12-18

model_name: richard

model_group:

status: torndown

timestamp: 2025-12-21T11:21:18+00:00

num_battles: 7288

num_wins: 3662

celo_rating: 1294.9

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: richard

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-21

win_ratio: 0.502469813391877

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
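
As an illustration of how the formatter templates above might combine into a single model input, here is a minimal sketch. The assembly order (memory, prompt, chat turns, response header) and the `build_prompt` helper are assumptions made for illustration; the record only lists the templates, not how the service applies them. Truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the formatter field of the record.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: concatenate the filled-in templates.

    `turns` is a list of (speaker, message) pairs where speaker is
    either "bot" or "user". The ordering used here is an assumption.
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with "{bot_name}:" so the model continues the line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("A friendly guide.", "Greet the user.",
                       [("user", "hi")], "richard", "Sam"))
```

Because `stopping_words` is `['\n']`, generation under this layout would stop at the end of the bot's first line, which is consistent with the single-line `bot_template`.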


Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.1395046710968018s
Received healthy response to inference request in 3.6091396808624268s
Received healthy response to inference request in 2.5959255695343018s
Received healthy response to inference request in 3.6939942836761475s
Received healthy response to inference request in 3.1992740631103516s
Received healthy response to inference request in 4.153260231018066s
Received healthy response to inference request in 4.154626369476318s
Received healthy response to inference request in 2.078505516052246s
Received healthy response to inference request in 2.164846420288086s
Received healthy response to inference request in 1.828580379486084s
10 requests
0 failed requests
5th percentile: 1.941046690940857
10th percentile: 2.05351300239563
20th percentile: 2.1273048400878904
30th percentile: 2.1572438955307005
40th percentile: 2.4234939098358153
50th percentile: 2.8975998163223267
60th percentile: 3.3632203102111813
70th percentile: 3.634596061706543
80th percentile: 3.7858474731445315
90th percentile: 4.153396844863892
95th percentile: 4.154011607170105
99th percentile: 4.154503417015076
mean time: 2.961765718460083
Pipeline stage StressChecker completed in 30.84s
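
The StressChecker percentiles and mean above can be reproduced from the ten logged latencies. The sketch below assumes linear interpolation between closest ranks, which agrees with the logged values; the `percentile` helper is written here for illustration and is not taken from the pipeline's code.

```python
# Latencies (seconds) copied from the "Received healthy response" lines.
latencies = [
    2.1395046710968018, 3.6091396808624268, 2.5959255695343018,
    3.6939942836761475, 3.1992740631103516, 4.153260231018066,
    4.154626369476318, 2.078505516052246, 2.164846420288086,
    1.828580379486084,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


def mean(values):
    """Arithmetic mean of the samples."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean(latencies)}")
```

With only 10 samples, the upper percentiles (90th and above) are interpolated between the two slowest requests, so they say little about tail latency under real load.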
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.54s
Shutdown handler de-registered
function_tekum_2025-12-18 status is now deployed due to DeploymentManager action
function_tekum_2025-12-18 status is now inactive due to auto deactivation removed underperforming models
function_tekum_2025-12-18 status is now torndown due to DeploymentManager action