developer_uid: chai_backend_admin
submission_id: function_medet_2024-08-21
model_name: gpt4-tl
model_group:
status: torndown
timestamp: 2024-08-21T22:35:52+00:00
num_battles: 8616
num_wins: 3961
celo_rating: 1212.55
family_friendly_score: 0.0
submission_type: function
display_name: gpt4-tl
is_internal_developer: True
ranking_group: single
us_pacific_date: 2024-08-21
win_ratio: 0.4597260909935005
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.1, 'top_k': 100, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', 'You:'], 'max_input_tokens': 512, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
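The formatter templates above can be read as a recipe for assembling the text sent to the model. The following is a minimal sketch of that assembly, not the platform's actual implementation. The function name `build_prompt`, the `turns` structure, and the sample persona are hypothetical. Only the template strings are copied from the record. Truncation to `max_input_tokens` is not modelled.

```python
# Template strings copied from the formatter entry in the record.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in the order they are listed.

    `turns` is an assumed structure: a list of (speaker, message) pairs,
    where speaker is either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template leaves the bot's line open for the model to complete.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    # Sample values are illustrative only.
    print(
        build_prompt(
            bot_name="Ava",
            user_name="You",
            memory="A friendly guide.",
            prompt="Ava greets a traveller.",
            turns=[("user", "Hi"), ("bot", "Hello!")],
        )
    )
```

Under this reading, the stopping words `'\n'` and `'You:'` in generation_params would end generation after a single bot line, assuming the user is rendered as `You`.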
Running pipeline stage StressChecker
Received healthy response to inference request in 1.7835016250610352s
Received healthy response to inference request in 1.6406445503234863s
Received healthy response to inference request in 2.0670604705810547s
Received healthy response to inference request in 1.94875168800354s
Received healthy response to inference request in 2.0987260341644287s
5 requests
0 failed requests
5th percentile: 1.6692159652709961
10th percentile: 1.697787380218506
20th percentile: 1.7549302101135253
30th percentile: 1.816551637649536
40th percentile: 1.8826516628265382
50th percentile: 1.94875168800354
60th percentile: 1.996075201034546
70th percentile: 2.043398714065552
80th percentile: 2.0733935832977295
90th percentile: 2.086059808731079
95th percentile: 2.092392921447754
99th percentile: 2.097459411621094
mean time: 1.907736873626709
Pipeline stage StressChecker completed in 10.09s
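The percentile and mean figures reported by StressChecker can be reproduced from the five logged latencies. The sketch below assumes linear interpolation between sorted samples, which is consistent with the logged values. The actual StressChecker code is not shown in this record, and the `percentile` helper is a hypothetical name.

```python
from statistics import mean

# Latencies in seconds, copied from the five "healthy response" log lines.
latencies = [
    1.7835016250610352,
    1.6406445503234863,
    2.0670604705810547,
    1.94875168800354,
    2.0987260341644287,
]


def percentile(values, q):
    """Return the q-th percentile (0 to 100) using linear interpolation.

    Hypothetical helper: the rank is (n - 1) * q / 100, and the result is
    interpolated between the two nearest sorted samples.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean(latencies)}")
```

With only five samples, the high percentiles are interpolated between the two slowest requests, so they say little about tail latency under real load.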
function_medet_2024-08-21 status is now deployed due to DeploymentManager action
function_medet_2024-08-21 status is now inactive due to auto deactivation (removed underperforming models)
function_medet_2024-08-21 status is now torndown due to DeploymentManager action