submission_id: function_gabur_2024-08-16
developer_uid: chai_backend_admin
alignment_samples: 92
display_name: gpt4-tl
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.1, 'top_k': 100, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', 'You:'], 'max_input_tokens': 512, 'best_of': 16, 'max_output_tokens': 64}
is_internal_developer: True
model_group:
model_name: gpt4-tl
num_battles: 92
num_wins: 44
propriety_score: 0.8
propriety_total_count: 10.0
ranking_group: single
status: torndown
submission_type: function
timestamp: 2024-08-16T19:31:31+00:00
us_pacific_date: 2024-08-16
win_ratio: 0.4782608695652174
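
The `formatter` entry above is a set of string templates. As a rough illustration of how such templates could be combined into a single model input, here is a minimal sketch. The `build_prompt` helper, its arguments, and the assembly order (persona, then prompt, then chat turns, then the response prefix) are assumptions made for illustration and are not taken from the platform's code. Truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the `formatter` field of this submission.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Assemble one prompt string (hypothetical helper, assumed ordering).

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response prefix has no trailing newline, so the model continues the line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("Ava", "You", "kind", "Hello", [("user", "hi")]))
```

Because the prompt ends with `{bot_name}:` and `stopping_words` contains `'\n'`, generation under these settings would presumably stop at the end of a single bot line.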
Running pipeline stage StressChecker
Received healthy response to inference request in 2.715534210205078s
Received healthy response to inference request in 1.618894100189209s
Received healthy response to inference request in 3.5150301456451416s
Received healthy response to inference request in 3.007545232772827s
Received healthy response to inference request in 4.8097875118255615s
5 requests
0 failed requests
5th percentile: 1.838222122192383
10th percentile: 2.057550144195557
20th percentile: 2.496206188201904
30th percentile: 2.773936414718628
40th percentile: 2.8907408237457277
50th percentile: 3.007545232772827
60th percentile: 3.210539197921753
70th percentile: 3.4135331630706784
80th percentile: 3.773981618881226
90th percentile: 4.291884565353394
95th percentile: 4.550836038589478
99th percentile: 4.757997217178345
mean time: 3.1333582401275635
Pipeline stage StressChecker completed in 16.26s
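
The percentile and mean figures reported by StressChecker can be reproduced from the five logged response times. The sketch below assumes linear interpolation between sorted samples (the same scheme NumPy's `percentile` uses by default). The log does not say which method StressChecker applies, but this one matches the reported values. The `percentile` and `summarize` names are illustrative.

```python
import math

# Response times in seconds, copied from the five "healthy response" log lines.
LATENCIES = [
    2.715534210205078,
    1.618894100189209,
    3.5150301456451416,
    3.007545232772827,
    4.8097875118255615,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def summarize(values):
    """Compute the same percentile set as the log, plus the mean."""
    quantiles = (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)
    return {
        "percentiles": {q: percentile(values, q) for q in quantiles},
        "mean": sum(values) / len(values),
    }


if __name__ == "__main__":
    summary = summarize(LATENCIES)
    for q, value in summary["percentiles"].items():
        print(f"{q}th percentile: {value}")
    print(f"mean time: {summary['mean']}")
```

With only five samples, the upper percentiles are interpolated almost entirely from the two slowest requests, so they say little about tail latency.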
function_gabur_2024-08-16 status is now deployed due to DeploymentManager action
function_gabur_2024-08-16 status is now inactive due to admin request
function_gabur_2024-08-16 status is now torndown due to DeploymentManager action