Running pipeline stage MKMLizer
Starting job with name trace2333-fd-llama3-v2-nall-v3-mkmlizer
Waiting for job on trace2333-fd-llama3-v2-nall-v3-mkmlizer to finish
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ _____ __ __ ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ /___/ ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ Version: 0.10.1 ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ https://mk1.ai ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ The license key for the current software has been verified as ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ belonging to: ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ Chai Research Corp. ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ║ ║
trace2333-fd-llama3-v2-nall-v3-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
trace2333-fd-llama3-v2-nall-v3-mkmlizer: Downloaded to shared memory in 66.896s
trace2333-fd-llama3-v2-nall-v3-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp5ocvt39r, device:0
trace2333-fd-llama3-v2-nall-v3-mkmlizer: Saving flywheel model at /dev/shm/model_cache
trace2333-fd-llama3-v2-nall-v3-mkmlizer: quantized model in 35.156s
trace2333-fd-llama3-v2-nall-v3-mkmlizer: Processed model Trace2333/fd_llama3_v2_Nall in 102.052s
trace2333-fd-llama3-v2-nall-v3-mkmlizer: creating bucket guanaco-mkml-models
trace2333-fd-llama3-v2-nall-v3-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
trace2333-fd-llama3-v2-nall-v3-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/trace2333-fd-llama3-v2-nall-v3
trace2333-fd-llama3-v2-nall-v3-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/trace2333-fd-llama3-v2-nall-v3/config.json
trace2333-fd-llama3-v2-nall-v3-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/trace2333-fd-llama3-v2-nall-v3/special_tokens_map.json
trace2333-fd-llama3-v2-nall-v3-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/trace2333-fd-llama3-v2-nall-v3/tokenizer_config.json
trace2333-fd-llama3-v2-nall-v3-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/trace2333-fd-llama3-v2-nall-v3/tokenizer.json
trace2333-fd-llama3-v2-nall-v3-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/trace2333-fd-llama3-v2-nall-v3/flywheel_model.0.safetensors
trace2333-fd-llama3-v2-nall-v3-mkmlizer:
Loading 0: 0%| | 0/291 [00:00<?, ?it/s]
Loading 0: 99%|█████████▉| 289/291 [00:20<00:01, 1.41it/s]
Job trace2333-fd-llama3-v2-nall-v3-mkmlizer completed after 127.42s with status: succeeded
Stopping job with name trace2333-fd-llama3-v2-nall-v3-mkmlizer
Pipeline stage MKMLizer completed in 128.78s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.92s
Running pipeline stage ISVCDeployer
Creating inference service trace2333-fd-llama3-v2-nall-v3
Waiting for inference service trace2333-fd-llama3-v2-nall-v3 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service trace2333-fd-llama3-v2-nall-v3 ready after 181.4611279964447s
Pipeline stage ISVCDeployer completed in 182.97s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.1178250312805176s
Received healthy response to inference request in 1.4516513347625732s
Received healthy response to inference request in 1.8783223628997803s
Received healthy response to inference request in 1.7109436988830566s
Received healthy response to inference request in 1.7201447486877441s
5 requests
0 failed requests
5th percentile: 1.50350980758667
10th percentile: 1.5553682804107667
20th percentile: 1.65908522605896
30th percentile: 1.7127839088439942
40th percentile: 1.7164643287658692
50th percentile: 1.7201447486877441
60th percentile: 1.7834157943725586
70th percentile: 1.846686840057373
80th percentile: 1.9262228965759278
90th percentile: 2.0220239639282225
95th percentile: 2.06992449760437
99th percentile: 2.108244924545288
mean time: 1.7757774353027345
Pipeline stage StressChecker completed in 10.56s
trace2333-fd-llama3-v2-nall_v3 status is now deployed due to DeploymentManager action
trace2333-fd-llama3-v2-nall_v3 status is now inactive due to auto deactivation removed underperforming models
trace2333-fd-llama3-v2-nall_v3 status is now torndown due to DeploymentManager action