Running pipeline stage MKMLizer
Starting job with name turboderp-llama3-turbca-4336-v22-mkmlizer
Waiting for job on turboderp-llama3-turbca-4336-v22-mkmlizer to finish
turboderp-llama3-turbca-4336-v22-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ [ASCII-art logo] ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ Version: 0.9.11 ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ https://mk1.ai ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ The license key for the current software has been verified as ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ belonging to: ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ Chai Research Corp. ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ║ ║
turboderp-llama3-turbca-4336-v22-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
turboderp-llama3-turbca-4336-v22-mkmlizer: Downloaded to shared memory in 26.242s
turboderp-llama3-turbca-4336-v22-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp84g57ofs, device:0
turboderp-llama3-turbca-4336-v22-mkmlizer: Saving flywheel model at /dev/shm/model_cache
turboderp-llama3-turbca-4336-v22-mkmlizer: quantized model in 25.906s
turboderp-llama3-turbca-4336-v22-mkmlizer: Processed model turboderp/llama3-turbcat-instruct-8b in 52.148s
turboderp-llama3-turbca-4336-v22-mkmlizer: creating bucket guanaco-mkml-models
turboderp-llama3-turbca-4336-v22-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
turboderp-llama3-turbca-4336-v22-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/turboderp-llama3-turbca-4336-v22
turboderp-llama3-turbca-4336-v22-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/turboderp-llama3-turbca-4336-v22/config.json
turboderp-llama3-turbca-4336-v22-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/turboderp-llama3-turbca-4336-v22/special_tokens_map.json
turboderp-llama3-turbca-4336-v22-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/turboderp-llama3-turbca-4336-v22/tokenizer_config.json
turboderp-llama3-turbca-4336-v22-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/turboderp-llama3-turbca-4336-v22/tokenizer.json
turboderp-llama3-turbca-4336-v22-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/turboderp-llama3-turbca-4336-v22/flywheel_model.0.safetensors
turboderp-llama3-turbca-4336-v22-mkmlizer:
Loading 0: 100%|██████████| 291/291 [00:11<00:00, 3.52it/s]
Job turboderp-llama3-turbca-4336-v22-mkmlizer completed after 73.83s with status: succeeded
Stopping job with name turboderp-llama3-turbca-4336-v22-mkmlizer
Pipeline stage MKMLizer completed in 74.87s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.10s
Running pipeline stage ISVCDeployer
Creating inference service turboderp-llama3-turbca-4336-v22
Waiting for inference service turboderp-llama3-turbca-4336-v22 to be ready
Failed to get response for submission blend_filor_2024-08-16: ('http://neversleep-noromaid-v0-8068-v149-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:33266->127.0.0.1:8080: read: connection reset by peer\n')
Inference service turboderp-llama3-turbca-4336-v22 ready after 241.47041296958923s
Pipeline stage ISVCDeployer completed in 243.06s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.0801312923431396s
Received healthy response to inference request in 1.7143185138702393s
Received healthy response to inference request in 1.561471939086914s
Received healthy response to inference request in 1.3700635433197021s
Received healthy response to inference request in 1.5206191539764404s
5 requests
0 failed requests
5th percentile: 1.4001746654510498
10th percentile: 1.4302857875823975
20th percentile: 1.4905080318450927
30th percentile: 1.5287897109985351
40th percentile: 1.5451308250427247
50th percentile: 1.561471939086914
60th percentile: 1.6226105690002441
70th percentile: 1.6837491989135742
80th percentile: 1.7874810695648193
90th percentile: 1.9338061809539795
95th percentile: 2.0069687366485596
99th percentile: 2.065498781204224
mean time: 1.6493208885192872
Pipeline stage StressChecker completed in 8.96s
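The percentile and mean figures reported by the StressChecker stage can be reproduced from the five logged response times. The sketch below is a reconstruction, not the StressChecker implementation: it assumes percentiles are computed by linear interpolation between sorted samples (the default behaviour of `numpy.percentile`), which happens to agree with the logged values. The names `percentile` and `summarize` are hypothetical helpers introduced for illustration.

```python
# Latencies (seconds) copied from the five "Received healthy response" lines.
latencies = [
    2.0801312923431396,
    1.7143185138702393,
    1.561471939086914,
    1.3700635433197021,
    1.5206191539764404,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) via linear interpolation between ranks.

    Assumption: this mirrors how the logged figures were derived; the log
    itself does not state the method.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


def summarize(values):
    """Hypothetical helper: percentiles and mean in the same order as the log."""
    points = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99]
    summary = {f"{q}th percentile": percentile(values, q) for q in points}
    summary["mean time"] = sum(values) / len(values)
    return summary


if __name__ == "__main__":
    for label, value in summarize(latencies).items():
        print(f"{label}: {value}")
```

With only five samples, every percentile above the 75th is interpolated between the two slowest requests, so the 95th and 99th percentile figures mostly reflect the single 2.08 s response rather than a genuine tail measurement.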
turboderp-llama3-turbca_4336_v22 status is now deployed due to DeploymentManager action
turboderp-llama3-turbca_4336_v22 status is now inactive due to auto deactivation removed underperforming models
turboderp-llama3-turbca_4336_v22 status is now torndown due to DeploymentManager action