developer_uid: chai_backend_admin
submission_id: chaiml-llama-8b-multihea_7878_v5
model_name: chaiml-llama-8b-multihea_7878_v5
model_group: ChaiML/llama_8b_multihea
status: deployed
timestamp: 2024-12-12T23:53:10+00:00
num_battles: 6595892
num_wins: 3302594
celo_rating: 1263.1
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/llama_8b_multihead_204m_512_v3_tokens_step_398208
model_architecture: MultiHeadLlamaClassifier
model_num_parameters: 8030261248.0
best_of: 1
max_input_tokens: 256
max_output_tokens: 1
display_name: chaiml-llama-8b-multihea_7878_v5
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/llama_8b_multihead_204m_512_v3_tokens_step_398208
model_size: 8B
ranking_group: single
us_pacific_date: 2024-12-12
win_ratio: 0.5007046810348017
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 256, 'best_of': 1, 'max_output_tokens': 1}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '', 'truncate_by_message': True}
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-llama-8b-multihea-7878-v5-mkmlizer
Waiting for job on chaiml-llama-8b-multihea-7878-v5-mkmlizer to finish
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ _____ __ __ ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ /___/ ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ Version: 0.11.37 ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ https://mk1.ai ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ belonging to: ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ Chai Research Corp. ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ║ ║
chaiml-llama-8b-multihea-7878-v5-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-llama-8b-multihea-7878-v5-mkmlizer: quantized model in 16.192s
chaiml-llama-8b-multihea-7878-v5-mkmlizer: Processed model ChaiML/llama_8b_multihead_204m_512_v3_tokens_step_398208 in 43.029s
chaiml-llama-8b-multihea-7878-v5-mkmlizer: creating bucket guanaco-mkml-models
chaiml-llama-8b-multihea-7878-v5-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-llama-8b-multihea-7878-v5-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-llama-8b-multihea-7878-v5
chaiml-llama-8b-multihea-7878-v5-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-llama-8b-multihea-7878-v5/config.json
chaiml-llama-8b-multihea-7878-v5-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-llama-8b-multihea-7878-v5/special_tokens_map.json
chaiml-llama-8b-multihea-7878-v5-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-llama-8b-multihea-7878-v5/tokenizer_config.json
chaiml-llama-8b-multihea-7878-v5-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-llama-8b-multihea-7878-v5/tokenizer.json
chaiml-llama-8b-multihea-7878-v5-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-llama-8b-multihea-7878-v5/flywheel_model.0.safetensors
chaiml-llama-8b-multihea-7878-v5-mkmlizer: Loading 0: 99%|█████████▉| 291/294 [00:06<00:00, 40.90it/s]
Job chaiml-llama-8b-multihea-7878-v5-mkmlizer completed after 76.88s with status: succeeded
Stopping job with name chaiml-llama-8b-multihea-7878-v5-mkmlizer
Pipeline stage MKMLizer completed in 77.94s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.25s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-llama-8b-multihea-7878-v5
Waiting for inference service chaiml-llama-8b-multihea-7878-v5 to be ready
Inference service chaiml-llama-8b-multihea-7878-v5 ready after 81.45916938781738s
Pipeline stage MKMLDeployer completed in 83.17s
run pipeline stage %s
Running pipeline stage StressChecker
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 8.131437301635742s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 5.684611797332764s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 4.346080303192139s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 4.98636794090271s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 4.765385627746582s
5 requests
0 failed requests
5th percentile: 4.4299413681030275
10th percentile: 4.513802433013916
20th percentile: 4.681524562835693
30th percentile: 4.809582090377807
40th percentile: 4.897975015640259
50th percentile: 4.98636794090271
60th percentile: 5.265665483474732
70th percentile: 5.544963026046752
80th percentile: 6.173976898193359
90th percentile: 7.152707099914551
95th percentile: 7.6420722007751465
99th percentile: 8.033564281463622
mean time: 5.582776594161987
%s, retrying in %s seconds...
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 5.742295980453491s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 4.944649696350098s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 6.246665000915527s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 6.13285756111145s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 5.7856738567352295s
5 requests
0 failed requests
5th percentile: 5.104178953170776
10th percentile: 5.263708209991455
20th percentile: 5.582766723632813
30th percentile: 5.750971555709839
40th percentile: 5.768322706222534
50th percentile: 5.7856738567352295
60th percentile: 5.924547338485718
70th percentile: 6.063420820236206
80th percentile: 6.155619049072266
90th percentile: 6.201142024993897
95th percentile: 6.223903512954712
99th percentile: 6.2421127033233645
mean time: 5.770428419113159
%s, retrying in %s seconds...
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 6.517698526382446s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 9.722412109375s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 5.780354022979736s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 6.5355424880981445s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 5.340087413787842s
5 requests
0 failed requests
5th percentile: 5.42814073562622
10th percentile: 5.5161940574646
20th percentile: 5.692300701141358
30th percentile: 5.927822923660278
40th percentile: 6.222760725021362
50th percentile: 6.517698526382446
60th percentile: 6.524836111068725
70th percentile: 6.5319736957550045
80th percentile: 7.1729164123535165
90th percentile: 8.447664260864258
95th percentile: 9.085038185119629
99th percentile: 9.594937324523926
mean time: 6.779218912124634
clean up pipeline due to error=DeploymentChecksError('Unacceptable 70th percentile latency 6.5319736957550045s')
Shutdown handler de-registered
chaiml-llama-8b-multihea_7878_v5 status is now failed due to DeploymentManager action
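
Note on the StressChecker figures: the percentile lines above are consistent with linear interpolation between the sorted latencies of each 5-request batch. The sketch below reproduces the 70th percentile and mean of the final batch from the five logged response times. The helper name `percentile` and the interpolation rule are assumptions inferred from the numbers, not taken from the pipeline's code, and the latency threshold that triggered DeploymentChecksError does not appear in this log, so it is not modelled here.

```python
import math


def percentile(values, q):
    """Linearly interpolated percentile of `values` for q in [0, 100].

    Hypothetical helper: the interpolation rule is inferred from the
    logged numbers, not from the StressChecker source.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lo = math.floor(rank)
    hi = min(lo + 1, len(ordered) - 1)
    frac = rank - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


# Response times (seconds) of the final batch, copied from the log.
final_run = [
    6.517698526382446,
    9.722412109375,
    5.780354022979736,
    6.5355424880981445,
    5.340087413787842,
]

p70 = percentile(final_run, 70)
mean = sum(final_run) / len(final_run)
print(f"70th percentile: {p70}")
print(f"mean time: {mean}")
```

With only five samples per batch, the 70th percentile falls between the third and fourth fastest responses, so one slow request (9.72s here) moves the mean much more than it moves the percentile being checked.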