developer_uid: chai_evaluation_service
submission_id: function_kosik_2025-12-16
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-19T08:21:23+00:00
num_battles: 8254
num_wins: 4209
celo_rating: 1300.11
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-19
win_ratio: 0.5099345771747031
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
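The `formatter` entry above is a set of string templates. The sketch below shows one plausible way such templates could be joined into a single model input. The assembly order (memory, prompt, conversation turns, response header), the `build_prompt` helper, and the sample names are assumptions for illustration and are not taken from the service's code. `truncate_by_message` and `max_input_tokens` are not modeled here.

```python
# Templates copied from the formatter entry in the metadata.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order."""
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    # Each turn is a (speaker, message) pair; speaker is "bot" or "user".
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response header ends with the bot name so the model continues as the bot.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("A friendly guide.", "Greet the user.",
                       [("user", "hi")], "richard", "Ann"))
```

With `stopping_words` set to `['\n']`, generation would stop at the first newline, which is consistent with each template turn occupying a single line.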
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.829041004180908s
Received healthy response to inference request in 3.2284607887268066s
Received healthy response to inference request in 3.3504750728607178s
Received healthy response to inference request in 2.485023260116577s
Received healthy response to inference request in 2.7529385089874268s
Received healthy response to inference request in 1.6052486896514893s
Received healthy response to inference request in 2.3989245891571045s
Received healthy response to inference request in 3.1699655055999756s
Received healthy response to inference request in 3.012359857559204s
Received healthy response to inference request in 2.0775444507598877s
10 requests
0 failed requests
5th percentile: 1.8177817821502686
10th percentile: 2.030314874649048
20th percentile: 2.3346485614776613
30th percentile: 2.4591936588287355
40th percentile: 2.645772409439087
50th percentile: 2.7909897565841675
60th percentile: 2.9023685455322266
70th percentile: 3.0596415519714357
80th percentile: 3.181664562225342
90th percentile: 3.2406622171401978
95th percentile: 3.2955686450004578
99th percentile: 3.339493787288666
mean time: 2.69099817276001
Pipeline stage StressChecker completed in 28.30s
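The StressChecker percentiles and mean above can be reproduced from the ten logged response times. The sketch below assumes linear interpolation between the two nearest sorted samples, which agrees with the logged figures to within floating-point rounding. The `percentile` helper is illustrative and is not the pipeline's own code.

```python
# Response times in seconds, copied from the ten "healthy response" log lines.
latencies = [
    2.829041004180908, 3.2284607887268066, 3.3504750728607178,
    2.485023260116577, 2.7529385089874268, 1.6052486896514893,
    2.3989245891571045, 3.1699655055999756, 3.012359857559204,
    2.0775444507598877,
]


def percentile(values, q):
    """Percentile q (0..100) using linear interpolation between sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


mean_time = sum(latencies) / len(latencies)

if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean_time}")
```

With only ten samples, the tail percentiles (95th, 99th) are interpolated between the two slowest requests, so they are best read as rough indicators and not as stable tail-latency estimates.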
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.57s
Shutdown handler de-registered
function_kosik_2025-12-16 status is now deployed due to DeploymentManager action
function_kosik_2025-12-16 status is now inactive due to auto deactivation removed underperforming models
function_kosik_2025-12-16 status is now torndown due to DeploymentManager action