submission_id: function_leraf_2024-08-17
developer_uid: chai_backend_admin
alignment_samples: 8769
alignment_score: 1.2873132065879402
celo_rating: 1216.66
display_name: gpt4-tl
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.1, 'top_k': 100, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', 'You:'], 'max_input_tokens': 512, 'best_of': 8, 'max_output_tokens': 64}
is_internal_developer: True
model_group:
model_name: gpt4-tl
num_battles: 8769
num_wins: 4197
propriety_score: 0.7881548974943052
propriety_total_count: 878.0
ranking_group: single
status: torndown
submission_type: function
timestamp: 2024-08-17T05:49:29+00:00
us_pacific_date: 2024-08-16
win_ratio: 0.4786178583646938
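The derived fields above appear to follow directly from the raw counts: win_ratio matches num_wins / num_battles, and propriety_score multiplied by propriety_total_count comes out at a whole number. A minimal Python check of that arithmetic follows. The variable names simply mirror the metadata keys, and `implied_proper` is an inferred quantity, not a field the page reports.

```python
import math

# Values copied from the submission metadata.
num_battles = 8769
num_wins = 4197
propriety_score = 0.7881548974943052
propriety_total_count = 878.0

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles

# Inferred, not logged: how many sampled responses the propriety score implies.
implied_proper = propriety_score * propriety_total_count

print(f"win_ratio: {win_ratio}")
print(f"implied proper responses: {round(implied_proper)} of {int(propriety_total_count)}")
```

Since alignment_samples equals num_battles (8769), the alignment score was presumably computed over the same set of battles, though the page does not say so.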
Running pipeline stage StressChecker
Received healthy response to inference request in 3.824162244796753s
Failed to get response for submission undi95-meta-llama-3-70b_6209_v19: ('http://undi95-meta-llama-3-70b-6209-v19-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Received healthy response to inference request in 4.214353561401367s
Received healthy response to inference request in 2.1274280548095703s
Received healthy response to inference request in 11.27430772781372s
Received healthy response to inference request in 6.105754375457764s
5 requests
0 failed requests
5th percentile: 2.466774892807007
10th percentile: 2.8061217308044433
20th percentile: 3.4848154067993167
30th percentile: 3.9022005081176756
40th percentile: 4.058277034759522
50th percentile: 4.214353561401367
60th percentile: 4.970913887023926
70th percentile: 5.727474212646484
80th percentile: 7.139465045928956
90th percentile: 9.20688638687134
95th percentile: 10.240597057342528
99th percentile: 11.067565593719483
mean time: 5.509201192855835
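The percentile figures in this block are consistent with linear interpolation between the sorted latencies (the default behaviour of `numpy.percentile`, for example). The sketch below recomputes them from the five logged response times using only the standard library. The `percentile` helper is illustrative and is not taken from the StressChecker source, which is not shown here.

```python
import math

def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between the two closest ranks of the sorted data."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac

# The five healthy response times logged in the first StressChecker batch.
latencies = [
    3.824162244796753,
    4.214353561401367,
    2.1274280548095703,
    11.27430772781372,
    6.105754375457764,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only five samples, every percentile above the 75th is interpolated between the two slowest requests, so the 90th to 99th percentile values mostly reflect the single 11.27 s outlier.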
%s, retrying in %s seconds...
{"detail":"('http://chaiml-llama-8b-pairwise-8189-v4-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:54918->127.0.0.1:8080: read: connection reset by peer\\n')"}
Received unhealthy response to inference request!
Received healthy response to inference request in 2.076833724975586s
{"detail":"('http://chaiml-llama-8b-pairwise-8189-v4-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '{\"error\":\"ValueError : [TypeError(\\\\\"\\'numpy.int64\\' object is not iterable\\\\\"), TypeError(\\'vars() argument must have __dict__ attribute\\')]\"}')"}
Received unhealthy response to inference request!
Received healthy response to inference request in 14.103046894073486s
Received healthy response to inference request in 1.4131765365600586s
5 requests
2 failed requests
5th percentile: 1.4470425605773927
10th percentile: 1.4809085845947265
20th percentile: 1.5486406326293944
30th percentile: 1.6813720703125
40th percentile: 1.879102897644043
50th percentile: 2.076833724975586
60th percentile: 2.530542182922363
70th percentile: 2.9842506408691403
80th percentile: 5.389493274688723
90th percentile: 9.746270084381106
95th percentile: 11.924658489227292
99th percentile: 13.667369213104248
mean time: 4.477333736419678
%s, retrying in %s seconds...
Received healthy response to inference request in 3.6742584705352783s
Failed to get response for submission undi95-meta-llama-3-70b_6209_v19: ('http://undi95-meta-llama-3-70b-6209-v19-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Received healthy response to inference request in 2.0120668411254883s
Received healthy response to inference request in 2.3039894104003906s
Received healthy response to inference request in 3.000267744064331s
Received healthy response to inference request in 3.6779496669769287s
5 requests
0 failed requests
5th percentile: 2.070451354980469
10th percentile: 2.1288358688354494
20th percentile: 2.24560489654541
30th percentile: 2.4432450771331786
40th percentile: 2.721756410598755
50th percentile: 3.000267744064331
60th percentile: 3.26986403465271
70th percentile: 3.539460325241089
80th percentile: 3.6749967098236085
90th percentile: 3.6764731884002684
95th percentile: 3.6772114276885985
99th percentile: 3.677802019119263
mean time: 2.9337064266204833
Pipeline stage StressChecker completed in 66.84s
function_leraf_2024-08-17 status is now deployed due to DeploymentManager action
function_leraf_2024-08-17 status is now inactive due to auto deactivation removed underperforming models
function_leraf_2024-08-17 status is now torndown due to DeploymentManager action
