developer_uid: chai_evaluation_service
submission_id: function_kutun_2025-12-15
model_name: richard
model_group:
status: torndown
timestamp: 2025-12-18T11:36:22+00:00
num_battles: 7696
num_wins: 3920
celo_rating: 1299.72
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: function
display_name: richard
is_internal_developer: True
ranking_group: single
us_pacific_date: 2025-12-18
win_ratio: 0.5093555093555093
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '### Instruction:\n{memory}\n', 'prompt_template': '### Input:\n{prompt}\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '### Response:\n{bot_name}:', 'truncate_by_message': False}
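For reference, win_ratio above is simply num_wins / num_battles (3920 / 7696 ≈ 0.5094). The generation_params dict is a standard sampling configuration; the sketch below shows how it might map onto a vLLM-style SamplingParams object. This is an assumption made purely for illustration: the log does not say which serving stack actually consumed these values.

# A minimal sketch, assuming a vLLM-style inference engine (not shown in this log),
# of how the generation_params above could be expressed as sampling settings.
from vllm import SamplingParams  # assumption: vLLM-style engine

sampling_params = SamplingParams(
    temperature=1.0,
    top_p=1.0,
    min_p=0.0,
    top_k=40,
    presence_penalty=0.0,
    frequency_penalty=0.0,
    stop=["\n"],    # 'stopping_words': generation ends at the first newline
    best_of=8,      # in vLLM terms: sample 8 candidates, keep the highest-scoring one
    max_tokens=64,  # 'max_output_tokens'
)
# 'max_input_tokens': 1024 is a prompt-length limit; it would be enforced by
# truncating the formatted prompt before generation, not via SamplingParams.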
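The formatter templates describe how memory, prompt, and chat turns are stitched into the text sent to the model (an Alpaca-style "### Instruction / ### Input / ### Response" layout, with the response primed by the bot name). Below is a minimal sketch of that assembly, assuming plain str.format substitution; the real formatter code is not part of this log.

# Hypothetical reconstruction of prompt assembly from the formatter templates above.
formatter = {
    "memory_template": "### Instruction:\n{memory}\n",
    "prompt_template": "### Input:\n{prompt}\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "### Response:\n{bot_name}:",
}

def build_prompt(memory, prompt, turns, bot_name):
    """Assemble the text sent to the model from memory, prompt, and chat turns."""
    parts = [
        formatter["memory_template"].format(memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        template = formatter["bot_template"] if speaker == bot_name else formatter["user_template"]
        # str.format ignores unused keyword arguments, so both templates can share one call
        parts.append(template.format(bot_name=speaker, user_name=speaker, message=message))
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)

# Illustrative usage only; the user name "Anon" and the messages are invented.
print(build_prompt(
    memory="You are richard, a concise chat partner.",
    prompt="A casual conversation.",
    turns=[("Anon", "hey"), ("richard", "hey, what's up?"), ("Anon", "tell me a joke")],
    bot_name="richard",
))

With this layout, each bot message occupies a single line ending in a newline, which is consistent with '\n' being the stop word in generation_params.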
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.8842337131500244s
Received healthy response to inference request in 4.195807218551636s
Received healthy response to inference request in 2.1841442584991455s
Received healthy response to inference request in 3.481109857559204s
Received healthy response to inference request in 2.4616100788116455s
Received healthy response to inference request in 2.1449122428894043s
Received healthy response to inference request in 3.3551809787750244s
Received healthy response to inference request in 2.7975454330444336s
Received healthy response to inference request in 2.6530773639678955s
10 requests
1 failed request
5th percentile: 2.1625666499137877
10th percentile: 2.1802210569381715
20th percentile: 2.4061169147491457
30th percentile: 2.5956371784210206
40th percentile: 2.7397582054138185
50th percentile: 3.076363205909729
60th percentile: 3.405552530288696
70th percentile: 3.60204701423645
80th percentile: 3.946548414230347
90th percentile: 5.786178255081171
95th percentile: 12.942847919464095
99th percentile: 18.668183650970462
mean time: 4.725713872909546
%s, retrying in %s seconds...
Received healthy response to inference request in 3.497390031814575s
Received healthy response to inference request in 3.6544785499572754s
Received healthy response to inference request in 2.366180896759033s
Received healthy response to inference request in 1.7959473133087158s
Received healthy response to inference request in 3.1218621730804443s
Received healthy response to inference request in 2.8105053901672363s
Received healthy response to inference request in 3.0257925987243652s
Received healthy response to inference request in 3.5777547359466553s
Received healthy response to inference request in 4.071712017059326s
Received healthy response to inference request in 3.138652801513672s
10 requests
0 failed requests
5th percentile: 2.0525524258613586
10th percentile: 2.3091575384140013
20th percentile: 2.7216404914855956
30th percentile: 2.9612064361572266
40th percentile: 3.0834343433380127
50th percentile: 3.130257487297058
60th percentile: 3.282147693634033
70th percentile: 3.521499443054199
80th percentile: 3.593099498748779
90th percentile: 3.69620189666748
95th percentile: 3.8839569568634027
99th percentile: 4.034161005020142
mean time: 3.1060276508331297
Pipeline stage StressChecker completed in 80.83s
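The StressChecker summary can be reproduced with standard linear-interpolation percentiles. A minimal sketch follows, assuming numpy's default interpolation and assuming the timed-out request is counted at its elapsed time (roughly 20.1 s), which is what makes the first batch's tail percentiles and 4.73 s mean line up even though every healthy response finished under 4.2 s.

# Hypothetical reconstruction of the first StressChecker statistics block.
# The nine healthy latencies are copied from the log; the tenth value is an
# assumption (the request that hit the 20 s read timeout, counted at ~20.1 s).
import numpy as np

latencies = np.array([
    3.8842337131500244, 4.195807218551636, 2.1841442584991455,
    3.481109857559204, 2.4616100788116455, 2.1449122428894043,
    3.3551809787750244, 2.7975454330444336, 2.6530773639678955,
    20.1,  # assumed elapsed time of the failed (timed-out) request
])

for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    # numpy's default linear interpolation between order statistics
    print(f"{p}th percentile: {np.percentile(latencies, p)}")
print(f"mean time: {latencies.mean()}")

The second batch, run after the retry, has 0 failed requests, so its percentiles stay within the 1.8 to 4.1 s range of the healthy responses.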
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.89s
Shutdown handler de-registered
function_kutun_2025-12-15 status is now deployed due to DeploymentManager action
function_kutun_2025-12-15 status is now inactive due to auto deactivation of underperforming models
function_kutun_2025-12-15 status is now torndown due to DeploymentManager action
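The last three lines trace the submission's lifecycle. A minimal sketch of the states seen in this log; the state names come from the log, the enum itself is hypothetical and the real DeploymentManager code is not shown here.

from enum import Enum

class SubmissionStatus(Enum):
    DEPLOYED = "deployed"    # set by DeploymentManager action after the pipeline finishes
    INACTIVE = "inactive"    # auto deactivation of underperforming models
    TORNDOWN = "torndown"    # final status recorded above (DeploymentManager action)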