developer_uid: chai_backend_admin
submission_id: function_pidun_2024-08-20
model_name: gpt4-tl
status: torndown
timestamp: 2024-08-20T18:16:07+00:00
num_battles: 37053
num_wins: 17519
celo_rating: 1212.61
family_friendly_score: 0.0
submission_type: function
display_name: gpt4-tl
is_internal_developer: True
ranking_group: single
us_pacific_date: 2024-08-20
win_ratio: 0.47280921922651337
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.1, 'top_k': 100, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', 'You:'], 'max_input_tokens': 512, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
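The formatter templates above can be read as a recipe for building the model input. The sketch below is a minimal illustration, not the platform's actual implementation: `build_prompt` and its arguments are hypothetical, the ordering of the pieces (memory, prompt, conversation turns, response prefix) is an assumption, and truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the formatter field of the record above.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Assemble a model input string (hypothetical helper).

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". The piece ordering is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template leaves the bot's line open for the model to complete.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    bot_name="Ava",
    user_name="You",
    memory="kind",
    prompt="A chat.",
    turns=[("user", "Hi"), ("bot", "Hello")],
)
```

With a user named `You`, the next user line would begin with `You:`, which would explain why `'You:'` and `'\n'` appear in `stopping_words`: generation would stop at the end of the bot's single line.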
Running pipeline stage StressChecker
Received healthy response to inference request in 2.4181132316589355s
Received healthy response to inference request in 1.7545397281646729s
Received healthy response to inference request in 2.1636271476745605s
Received healthy response to inference request in 2.1828978061676025s
Received healthy response to inference request in 1.8100059032440186s
5 requests
0 failed requests
5th percentile: 1.765632963180542
10th percentile: 1.776726198196411
20th percentile: 1.7989126682281493
30th percentile: 1.880730152130127
40th percentile: 2.0221786499023438
50th percentile: 2.1636271476745605
60th percentile: 2.1713354110717775
70th percentile: 2.179043674468994
80th percentile: 2.229940891265869
90th percentile: 2.3240270614624023
95th percentile: 2.371070146560669
99th percentile: 2.4087046146392823
mean time: 2.065836763381958
Pipeline stage StressChecker completed in 11.08s
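The percentile and mean figures reported by StressChecker can be reproduced from the five latencies logged above. The sketch below assumes linear interpolation between order statistics; the logged values are consistent with that method, but which library the pipeline actually uses is not stated in the log. The `percentile` helper is hypothetical.

```python
import math

# Latencies in seconds, copied from the five healthy responses logged above.
latencies = [
    2.4181132316589355,
    1.7545397281646729,
    2.1636271476745605,
    2.1828978061676025,
    1.8100059032440186,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, the tail percentiles are interpolations between the two slowest requests rather than independent measurements, so the 95th and 99th values say little beyond the observed maximum.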
function_pidun_2024-08-20 status is now deployed due to DeploymentManager action
function_pidun_2024-08-20 status is now inactive due to auto deactivation (removed underperforming models)
function_pidun_2024-08-20 status is now torndown due to DeploymentManager action