developer_uid: chai_backend_admin
submission_id: function_pukus_2024-08-16
model_name: gpt4-tl
status: torndown
timestamp: 2024-08-16T21:23:38+00:00
num_battles: 10293
num_wins: 5212
celo_rating: 1230.21
family_friendly_score: 0.0
submission_type: function
display_name: gpt4-tl
is_internal_developer: True
ranking_group: single
us_pacific_date: 2024-08-16
win_ratio: 0.5063635480423588
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.1, 'top_k': 100, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', 'You:'], 'max_input_tokens': 512, 'best_of': 16, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
Running pipeline stage StressChecker
Received healthy response to inference request in 4.354866027832031s
Received healthy response to inference request in 2.458975315093994s
Received healthy response to inference request in 3.558701515197754s
Received healthy response to inference request in 3.539841651916504s
Received healthy response to inference request in 2.5152511596679688s
5 requests
0 failed requests
5th percentile: 2.470230484008789
10th percentile: 2.481485652923584
20th percentile: 2.503995990753174
30th percentile: 2.720169258117676
40th percentile: 3.13000545501709
50th percentile: 3.539841651916504
60th percentile: 3.547385597229004
70th percentile: 3.5549295425415037
80th percentile: 3.7179344177246096
90th percentile: 4.036400222778321
95th percentile: 4.195633125305176
99th percentile: 4.3230194473266605
mean time: 3.2855271339416503
Pipeline stage StressChecker completed in 17.06s
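The percentile and mean figures logged by StressChecker can be reproduced from the five response times above. The sketch below is not taken from the pipeline. It assumes linear interpolation between sorted ranks, which appears consistent with the logged values. The names `percentile` and `summarize` are hypothetical.

```python
import math

# Latencies in seconds, copied from the five "healthy response" log lines.
latencies = [
    4.354866027832031,
    2.458975315093994,
    3.558701515197754,
    3.539841651916504,
    2.5152511596679688,
]


def percentile(values, q):
    """Return the q-th percentile (0-100), interpolating linearly between ranks.

    Assumption: the pipeline's actual method is not shown in the log.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def summarize(values, quantiles=(5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)):
    """Build a label -> value mapping in the same order as the log output."""
    stats = {f"{q}th percentile": percentile(values, q) for q in quantiles}
    stats["mean time"] = sum(values) / len(values)
    return stats


if __name__ == "__main__":
    print(f"{len(latencies)} requests")
    for label, value in summarize(latencies).items():
        print(f"{label}: {value}")
```

With only five samples, the upper percentiles are interpolated almost entirely from the two slowest requests, so they say little about tail latency under real load.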
function_pukus_2024-08-16 status is now deployed due to DeploymentManager action
function_pukus_2024-08-16 status is now inactive due to auto deactivation removed underperforming models
function_pukus_2024-08-16 status is now torndown due to DeploymentManager action