function_telem_2025-12-17

developer_uid: chai_evaluation_service

submission_id: function_telem_2025-12-17

model_name: richard

model_group:

status: torndown

timestamp: 2025-12-20T16:01:08+00:00

num_battles: 7420

num_wins: 3627

celo_rating: 1285.62

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: function

display_name: richard

is_internal_developer: True

ranking_group: single

us_pacific_date: 2025-12-20

win_ratio: 0.48881401617250675

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}

formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
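
The formatter templates above can be read as pieces of a single prompt string. The sketch below is a minimal illustration, not the service's actual code. The assembly order (memory, prompt, chat turns, response header) is an assumption. So are the `build_prompt` helper, the `turns` structure, and the sample user name. Truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the formatter field of this submission.
FORMATTER = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(formatter, memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates into one prompt string.

    turns is a list of (speaker, message) pairs, speaker being "bot" or "user".
    The ordering used here is assumed, not confirmed by the source.
    """
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template ends with "{bot_name}:" so the model continues the bot's line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    FORMATTER,
    memory="Richard is a friendly guide.",
    prompt="A short chat.",
    turns=[("user", "hi"), ("bot", "hello")],
    bot_name="richard",
    user_name="Alex",  # illustrative name, not from the source
)
print(example)
```

With `stopping_words` set to `['\n']` in the generation parameters, a completion would presumably end at the first newline, which matches the one-line-per-message shape of `bot_template`.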

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.42289662361145s
Received healthy response to inference request in 1.7159738540649414s
Received healthy response to inference request in 5.6541852951049805s
Received healthy response to inference request in 2.8219287395477295s
Received healthy response to inference request in 2.895420551300049s
Received healthy response to inference request in 3.2468717098236084s
Received healthy response to inference request in 3.941624641418457s
Received healthy response to inference request in 9.173458099365234s
Received healthy response to inference request in 3.0314667224884033s
Received healthy response to inference request in 2.744948625564575s
10 requests
0 failed requests
5th percentile: 2.1790125012397765
10th percentile: 2.6420511484146116
20th percentile: 2.8065327167510987
30th percentile: 2.873373007774353
40th percentile: 2.9770482540130616
50th percentile: 3.139169216156006
60th percentile: 3.317281675338745
70th percentile: 3.5785150289535523
80th percentile: 4.284136772155762
90th percentile: 6.006112575531005
95th percentile: 7.589785337448117
99th percentile: 8.856723546981812
mean time: 3.864877486228943
Pipeline stage StressChecker completed in 40.38s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
function_telem_2025-12-17 status is now deployed due to DeploymentManager action
function_telem_2025-12-17 status is now inactive due to auto deactivation removed underperforming models
function_telem_2025-12-17 status is now torndown due to DeploymentManager action
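
The StressChecker percentiles in the log above are consistent with linear interpolation between sorted samples, which is the default method of NumPy's `percentile`. Whether the service actually uses NumPy is an assumption. The sketch below recomputes the figures in plain Python from the ten logged latencies; the `percentile` helper is illustrative and not taken from the pipeline.

```python
# Latencies in seconds, copied from the "Received healthy response" log lines.
LATENCIES_S = [
    3.42289662361145,
    1.7159738540649414,
    5.6541852951049805,
    2.8219287395477295,
    2.895420551300049,
    3.2468717098236084,
    3.941624641418457,
    9.173458099365234,
    3.0314667224884033,
    2.744948625564575,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(LATENCIES_S) / len(LATENCIES_S)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(LATENCIES_S, q)}")
print(f"mean time: {mean_time}")
```

With only ten samples, the upper percentiles are dominated by the single 9.17 s response, so the 95th and 99th values are interpolations between the two slowest requests rather than stable tail estimates.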