Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-llama-8b-multihe-9103-v1-mkmlizer
Waiting for job on rirv938-llama-8b-multihe-9103-v1-mkmlizer to finish
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ _____ __ __ ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ /___/ ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ Version: 0.11.33 ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ https://mk1.ai ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ The license key for the current software has been verified as ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ belonging to: ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ Chai Research Corp. ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ║ ║
rirv938-llama-8b-multihe-9103-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Connection pool is full, discarding connection: %s. Connection pool size: %s
Retrying (%r) after connection broken by '%r': %s
rirv938-llama-8b-multihe-9103-v1-mkmlizer: quantized model in 90.512s
rirv938-llama-8b-multihe-9103-v1-mkmlizer: Processed model rirv938/llama_8b_multihead_181m_preference in 286.986s
rirv938-llama-8b-multihe-9103-v1-mkmlizer: creating bucket guanaco-mkml-models
rirv938-llama-8b-multihe-9103-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rirv938-llama-8b-multihe-9103-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-llama-8b-multihe-9103-v1
rirv938-llama-8b-multihe-9103-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rirv938-llama-8b-multihe-9103-v1/tokenizer.json
rirv938-llama-8b-multihe-9103-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rirv938-llama-8b-multihe-9103-v1/flywheel_model.0.safetensors
rirv938-llama-8b-multihe-9103-v1-mkmlizer:
Loading 0: 0%| | 0/291 [00:00<?, ?it/s]
Loading 0: 1%| | 3/291 [00:00<00:59, 4.86it/s]
Loading 0: 1%|▏ | 4/291 [00:01<01:36, 2.97it/s]
Loading 0: 2%|▏ | 5/291 [00:01<02:11, 2.17it/s]
Loading 0: 3%|▎ | 8/291 [00:02<01:08, 4.16it/s]
Loading 0: 3%|▎ | 9/291 [00:02<01:06, 4.24it/s]
Loading 0: 3%|▎ | 10/291 [00:02<00:57, 4.85it/s]
Loading 0: 4%|▍ | 12/291 [00:03<01:09, 4.04it/s]
Loading 0: 4%|▍ | 13/291 [00:03<01:31, 3.04it/s]
Loading 0: 5%|▍ | 14/291 [00:04<01:55, 2.40it/s]
Loading 0: 6%|▌ | 17/291 [00:04<01:06, 4.13it/s]
Loading 0: 6%|▌ | 18/291 [00:04<01:01, 4.42it/s]
Loading 0: 7%|▋ | 19/291 [00:04<00:54, 4.96it/s]
Loading 0: 7%|▋ | 21/291 [00:05<01:05, 4.15it/s]
Loading 0: 8%|▊ | 22/291 [00:06<01:26, 3.12it/s]
Loading 0: 8%|▊ | 23/291 [00:06<01:48, 2.46it/s]
Loading 0: 9%|▉ | 26/291 [00:07<01:03, 4.19it/s]
Loading 0: 9%|▉ | 27/291 [00:07<00:59, 4.41it/s]
Loading 0: 10%|▉ | 28/291 [00:07<00:53, 4.95it/s]
Loading 0: 10%|█ | 30/291 [00:07<00:42, 6.13it/s]
Loading 0: 11%|█ | 31/291 [00:07<00:43, 6.00it/s]
Loading 0: 11%|█ | 32/291 [00:07<00:40, 6.39it/s]
Loading 0: 11%|█▏ | 33/291 [00:08<00:43, 5.95it/s]
Loading 0: 12%|█▏ | 34/291 [00:08<01:12, 3.52it/s]
Loading 0: 12%|█▏ | 35/291 [00:09<01:35, 2.69it/s]
Loading 0: 12%|█▏ | 36/291 [00:09<01:57, 2.16it/s]
Loading 0: 13%|█▎ | 39/291 [00:10<01:20, 3.11it/s]
Loading 0: 14%|█▎ | 40/291 [00:11<01:35, 2.63it/s]
Loading 0: 14%|█▍ | 41/291 [00:11<01:51, 2.23it/s]
Loading 0: 15%|█▌ | 44/291 [00:12<01:04, 3.84it/s]
Loading 0: 15%|█▌ | 45/291 [00:12<01:00, 4.09it/s]
Loading 0: 16%|█▌ | 46/291 [00:12<00:53, 4.62it/s]
Loading 0: 16%|█▋ | 48/291 [00:12<01:00, 4.00it/s]
Loading 0: 17%|█▋ | 49/291 [00:13<01:18, 3.07it/s]
Loading 0: 17%|█▋ | 50/291 [00:14<01:39, 2.43it/s]
Loading 0: 18%|█▊ | 53/291 [00:14<00:56, 4.20it/s]
Loading 0: 19%|█▊ | 54/291 [00:14<00:53, 4.43it/s]
Loading 0: 19%|█▉ | 55/291 [00:14<00:47, 4.94it/s]
Loading 0: 20%|█▉ | 57/291 [00:15<00:56, 4.16it/s]
Loading 0: 20%|█▉ | 58/291 [00:15<01:14, 3.14it/s]
Loading 0: 20%|██ | 59/291 [00:16<01:34, 2.47it/s]
Loading 0: 21%|██▏ | 62/291 [00:16<00:53, 4.24it/s]
Loading 0: 22%|██▏ | 63/291 [00:17<00:51, 4.46it/s]
Loading 0: 22%|██▏ | 64/291 [00:17<00:45, 5.00it/s]
Loading 0: 23%|██▎ | 66/291 [00:17<00:53, 4.19it/s]
Loading 0: 23%|██▎ | 67/291 [00:18<01:10, 3.16it/s]
Loading 0: 23%|██▎ | 68/291 [00:19<01:29, 2.49it/s]
Loading 0: 24%|██▍ | 71/291 [00:19<00:51, 4.29it/s]
Loading 0: 25%|██▍ | 72/291 [00:19<00:48, 4.50it/s]
Loading 0: 25%|██▌ | 73/291 [00:19<00:42, 5.08it/s]
Loading 0: 25%|██▌ | 74/291 [00:20<01:03, 3.40it/s]
Loading 0: 26%|██▌ | 75/291 [00:20<01:23, 2.57it/s]
Loading 0: 26%|██▋ | 77/291 [00:20<00:57, 3.74it/s]
Loading 0: 27%|██▋ | 78/291 [00:21<00:52, 4.07it/s]
Loading 0: 27%|██▋ | 79/291 [00:21<00:45, 4.69it/s]
Loading 0: 27%|██▋ | 80/291 [00:21<00:44, 4.72it/s]
Loading 0: 28%|██▊ | 81/291 [00:22<01:07, 3.12it/s]
Loading 0: 28%|██▊ | 82/291 [00:22<01:23, 2.49it/s]
Loading 0: 29%|██▊ | 83/291 [00:23<01:39, 2.09it/s]
Loading 0: 30%|██▉ | 86/291 [00:23<00:52, 3.92it/s]
Loading 0: 30%|██▉ | 87/291 [00:23<00:48, 4.19it/s]
Loading 0: 30%|███ | 88/291 [00:23<00:42, 4.81it/s]
Loading 0: 31%|███ | 90/291 [00:24<00:49, 4.06it/s]
Loading 0: 31%|███▏ | 91/291 [00:25<01:04, 3.08it/s]
Loading 0: 32%|███▏ | 92/291 [00:25<01:21, 2.45it/s]
Loading 0: 33%|███▎ | 95/291 [00:25<00:46, 4.20it/s]
Loading 0: 33%|███▎ | 96/291 [00:26<00:44, 4.43it/s]
Loading 0: 33%|███▎ | 97/291 [00:26<00:38, 5.01it/s]
Loading 0: 34%|███▍ | 99/291 [00:26<00:46, 4.16it/s]
Loading 0: 34%|███▍ | 100/291 [00:27<01:00, 3.15it/s]
Loading 0: 35%|███▍ | 101/291 [00:28<01:16, 2.50it/s]
Loading 0: 36%|███▌ | 104/291 [00:28<00:43, 4.30it/s]
Loading 0: 36%|███▌ | 105/291 [00:28<00:40, 4.59it/s]
Loading 0: 36%|███▋ | 106/291 [00:28<00:36, 5.10it/s]
Loading 0: 37%|███▋ | 108/291 [00:29<00:43, 4.21it/s]
Loading 0: 37%|███▋ | 109/291 [00:29<00:57, 3.17it/s]
Loading 0: 38%|███▊ | 110/291 [00:30<01:12, 2.48it/s]
Loading 0: 39%|███▉ | 113/291 [00:30<00:42, 4.20it/s]
Loading 0: 39%|███▉ | 114/291 [00:30<00:39, 4.43it/s]
Loading 0: 40%|███▉ | 115/291 [00:31<00:35, 4.98it/s]
Loading 0: 40%|███▉ | 116/291 [00:31<00:52, 3.36it/s]
Loading 0: 41%|████ | 118/291 [00:31<00:38, 4.51it/s]
Loading 0: 41%|████ | 119/291 [00:32<00:36, 4.73it/s]
Loading 0: 41%|████ | 120/291 [00:32<00:31, 5.38it/s]
Loading 0: 42%|████▏ | 122/291 [00:32<00:39, 4.28it/s]
Loading 0: 43%|████▎ | 125/291 [00:33<00:36, 4.59it/s]
Loading 0: 43%|████▎ | 126/291 [00:33<00:47, 3.47it/s]
Loading 0: 44%|████▎ | 127/291 [00:34<01:00, 2.72it/s]
Loading 0: 45%|████▍ | 130/291 [00:34<00:36, 4.36it/s]
Loading 0: 45%|████▌ | 131/291 [00:34<00:35, 4.55it/s]
Loading 0: 45%|████▌ | 132/291 [00:35<00:31, 5.05it/s]
Loading 0: 46%|████▌ | 133/291 [00:35<00:32, 4.89it/s]
Loading 0: 46%|████▌ | 134/291 [00:35<00:47, 3.31it/s]
Loading 0: 46%|████▋ | 135/291 [00:36<01:02, 2.50it/s]
Loading 0: 47%|████▋ | 138/291 [00:37<00:45, 3.39it/s]
Loading 0: 48%|████▊ | 139/291 [00:37<00:54, 2.79it/s]
Loading 0: 48%|████▊ | 140/291 [00:38<01:05, 2.32it/s]
Loading 0: 49%|████▉ | 143/291 [00:38<00:37, 3.90it/s]
Loading 0: 49%|████▉ | 144/291 [00:38<00:35, 4.14it/s]
Loading 0: 50%|████▉ | 145/291 [00:39<00:31, 4.65it/s]
Loading 0: 51%|█████ | 147/291 [00:39<00:35, 4.01it/s]
Loading 0: 51%|█████ | 148/291 [00:40<00:46, 3.08it/s]
Loading 0: 51%|█████ | 149/291 [00:40<00:57, 2.48it/s]
Loading 0: 52%|█████▏ | 152/291 [00:41<00:33, 4.19it/s]
Loading 0: 53%|█████▎ | 153/291 [00:41<00:31, 4.42it/s]
Loading 0: 53%|█████▎ | 154/291 [00:41<00:27, 4.97it/s]
Loading 0: 54%|█████▎ | 156/291 [00:42<00:32, 4.16it/s]
Loading 0: 54%|█████▍ | 157/291 [00:42<00:42, 3.15it/s]
Loading 0: 54%|█████▍ | 158/291 [00:43<00:53, 2.47it/s]
Loading 0: 55%|█████▌ | 161/291 [00:43<00:30, 4.19it/s]
Loading 0: 56%|█████▌ | 162/291 [00:43<00:29, 4.41it/s]
Loading 0: 56%|█████▌ | 163/291 [00:43<00:25, 4.95it/s]
Loading 0: 57%|█████▋ | 165/291 [00:44<00:30, 4.14it/s]
Loading 0: 57%|█████▋ | 166/291 [00:45<00:39, 3.14it/s]
Loading 0: 57%|█████▋ | 167/291 [00:45<00:49, 2.50it/s]
Loading 0: 58%|█████▊ | 170/291 [00:45<00:28, 4.25it/s]
Loading 0: 59%|█████▉ | 171/291 [00:46<00:26, 4.47it/s]
Loading 0: 59%|█████▉ | 172/291 [00:46<00:23, 5.00it/s]
Loading 0: 60%|█████▉ | 174/291 [00:46<00:28, 4.14it/s]
Loading 0: 60%|██████ | 175/291 [00:47<00:37, 3.13it/s]
Loading 0: 60%|██████ | 176/291 [00:48<00:46, 2.49it/s]
Loading 0: 62%|██████▏ | 179/291 [00:48<00:26, 4.22it/s]
Loading 0: 62%|██████▏ | 180/291 [00:48<00:25, 4.44it/s]
Loading 0: 62%|██████▏ | 181/291 [00:48<00:22, 4.99it/s]
Loading 0: 63%|██████▎ | 183/291 [00:48<00:17, 6.03it/s]
Loading 0: 63%|██████▎ | 184/291 [00:48<00:17, 5.96it/s]
Loading 0: 64%|██████▎ | 185/291 [00:49<00:16, 6.48it/s]
Loading 0: 64%|██████▍ | 186/291 [00:49<00:17, 6.14it/s]
Loading 0: 64%|██████▍ | 187/291 [00:49<00:28, 3.60it/s]
Loading 0: 65%|██████▍ | 188/291 [00:50<00:37, 2.73it/s]
Loading 0: 65%|██████▍ | 189/291 [00:51<00:46, 2.22it/s]
Loading 0: 66%|██████▌ | 192/291 [00:51<00:31, 3.18it/s]
Loading 0: 66%|██████▋ | 193/291 [00:52<00:36, 2.67it/s]
Loading 0: 67%|██████▋ | 194/291 [00:53<00:43, 2.25it/s]
Loading 0: 68%|██████▊ | 197/291 [00:53<00:24, 3.83it/s]
Loading 0: 68%|██████▊ | 198/291 [00:53<00:22, 4.09it/s]
Loading 0: 68%|██████▊ | 199/291 [00:53<00:19, 4.61it/s]
Loading 0: 69%|██████▉ | 201/291 [00:54<00:22, 4.00it/s]
Loading 0: 69%|██████▉ | 202/291 [00:54<00:28, 3.08it/s]
Loading 0: 70%|██████▉ | 203/291 [00:55<00:35, 2.48it/s]
Loading 0: 71%|███████ | 206/291 [00:55<00:20, 4.19it/s]
Loading 0: 71%|███████ | 207/291 [00:55<00:18, 4.43it/s]
Loading 0: 71%|███████▏ | 208/291 [00:55<00:16, 4.92it/s]
Loading 0: 72%|███████▏ | 210/291 [00:56<00:19, 4.15it/s]
Loading 0: 73%|███████▎ | 211/291 [00:57<00:25, 3.14it/s]
Loading 0: 73%|███████▎ | 212/291 [00:57<00:31, 2.49it/s]
Loading 0: 74%|███████▍ | 215/291 [00:58<00:17, 4.23it/s]
Loading 0: 74%|███████▍ | 216/291 [00:58<00:16, 4.46it/s]
Loading 0: 75%|███████▍ | 217/291 [00:58<00:14, 5.04it/s]
Loading 0: 75%|███████▌ | 219/291 [00:58<00:17, 4.21it/s]
Loading 0: 76%|███████▌ | 220/291 [00:59<00:22, 3.17it/s]
Loading 0: 76%|███████▌ | 221/291 [01:00<00:27, 2.51it/s]
Loading 0: 77%|███████▋ | 224/291 [01:00<00:15, 4.32it/s]
Loading 0: 77%|███████▋ | 225/291 [01:00<00:14, 4.53it/s]
Loading 0: 78%|███████▊ | 226/291 [01:00<00:12, 5.07it/s]
Loading 0: 78%|███████▊ | 227/291 [01:01<00:18, 3.42it/s]
Loading 0: 78%|███████▊ | 228/291 [01:01<00:24, 2.59it/s]
Loading 0: 79%|███████▉ | 230/291 [01:02<00:16, 3.70it/s]
Loading 0: 79%|███████▉ | 231/291 [01:02<00:14, 4.03it/s]
Loading 0: 80%|███████▉ | 232/291 [01:02<00:12, 4.68it/s]
Loading 0: 80%|████████ | 233/291 [01:02<00:11, 4.84it/s]
Loading 0: 80%|████████ | 234/291 [01:03<00:17, 3.18it/s]
Loading 0: 81%|████████▏ | 237/291 [01:03<00:13, 3.96it/s]
Loading 0: 82%|████████▏ | 238/291 [01:04<00:17, 3.10it/s]
Loading 0: 82%|████████▏ | 239/291 [01:05<00:20, 2.51it/s]
Loading 0: 83%|████████▎ | 242/291 [01:05<00:11, 4.20it/s]
Loading 0: 84%|████████▎ | 243/291 [01:05<00:10, 4.42it/s]
Loading 0: 84%|████████▍ | 244/291 [01:05<00:09, 4.93it/s]
Loading 0: 85%|████████▍ | 246/291 [01:06<00:10, 4.15it/s]
Loading 0: 85%|████████▍ | 247/291 [01:06<00:14, 3.13it/s]
Loading 0: 85%|████████▌ | 248/291 [01:07<00:17, 2.50it/s]
Loading 0: 86%|████████▋ | 251/291 [01:07<00:09, 4.23it/s]
Loading 0: 87%|████████▋ | 252/291 [01:07<00:08, 4.46it/s]
Loading 0: 87%|████████▋ | 253/291 [01:07<00:07, 4.88it/s]
Loading 0: 88%|████████▊ | 255/291 [01:08<00:08, 4.13it/s]
Loading 0: 88%|████████▊ | 256/291 [01:09<00:11, 3.14it/s]
Loading 0: 88%|████████▊ | 257/291 [01:09<00:13, 2.50it/s]
Loading 0: 89%|████████▉ | 260/291 [01:10<00:07, 4.24it/s]
Loading 0: 90%|████████▉ | 261/291 [01:10<00:06, 4.47it/s]
Loading 0: 90%|█████████ | 262/291 [01:10<00:05, 4.99it/s]
Loading 0: 91%|█████████ | 264/291 [01:10<00:06, 4.16it/s]
Loading 0: 91%|█████████ | 265/291 [01:11<00:08, 3.14it/s]
Loading 0: 91%|█████████▏| 266/291 [01:12<00:10, 2.48it/s]
Loading 0: 92%|█████████▏| 269/291 [01:12<00:05, 4.21it/s]
Loading 0: 93%|█████████▎| 270/291 [01:12<00:04, 4.45it/s]
Loading 0: 93%|█████████▎| 271/291 [01:12<00:04, 4.98it/s]
Loading 0: 94%|█████████▍| 273/291 [01:13<00:04, 4.19it/s]
Loading 0: 94%|█████████▍| 274/291 [01:13<00:05, 3.18it/s]
Loading 0: 95%|█████████▍| 275/291 [01:14<00:06, 2.51it/s]
Loading 0: 96%|█████████▌| 278/291 [01:14<00:03, 4.27it/s]
Loading 0: 96%|█████████▌| 279/291 [01:15<00:02, 4.51it/s]
Loading 0: 96%|█████████▌| 280/291 [01:15<00:02, 5.03it/s]
Loading 0: 97%|█████████▋| 281/291 [01:15<00:02, 3.41it/s]
Loading 0: 97%|█████████▋| 283/291 [01:15<00:01, 4.60it/s]
Loading 0: 98%|█████████▊| 284/291 [01:16<00:01, 4.84it/s]
Loading 0: 98%|█████████▊| 285/291 [01:16<00:01, 5.42it/s]
Loading 0: 98%|█████████▊| 286/291 [01:16<00:00, 5.37it/s]
Loading 0: 99%|█████████▊| 287/291 [01:17<00:01, 3.36it/s]
Loading 0: 99%|█████████▉| 288/291 [01:17<00:01, 2.50it/s]
Job rirv938-llama-8b-multihe-9103-v1-mkmlizer completed after 308.9s with status: succeeded
Stopping job with name rirv938-llama-8b-multihe-9103-v1-mkmlizer
Pipeline stage MKMLizer completed in 309.53s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.20s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-llama-8b-multihe-9103-v1
Waiting for inference service rirv938-llama-8b-multihe-9103-v1 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service rirv938-llama-8b-multihe-9103-v1 ready after 180.74329900741577s
Pipeline stage MKMLDeployer completed in 181.46s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.47619366645813s
Received healthy response to inference request in 3.2274458408355713s
Received healthy response to inference request in 5.573505640029907s
Received healthy response to inference request in 2.0101237297058105s
Received healthy response to inference request in 2.026397943496704s
5 requests
0 failed requests
5th percentile: 2.0133785724639894
10th percentile: 2.016633415222168
20th percentile: 2.023143100738525
30th percentile: 2.2666075229644775
40th percentile: 2.7470266819000244
50th percentile: 3.2274458408355713
60th percentile: 3.7269449710845945
70th percentile: 4.226444101333618
80th percentile: 4.695656061172485
90th percentile: 5.134580850601196
95th percentile: 5.354043245315552
99th percentile: 5.5296131610870365
mean time: 3.4627333641052247
%s, retrying in %s seconds...
Received healthy response to inference request in 6.021600723266602s
Received healthy response to inference request in 3.812478542327881s
Received healthy response to inference request in 4.034631729125977s
Received healthy response to inference request in 1.2217693328857422s
Received healthy response to inference request in 2.776766061782837s
5 requests
0 failed requests
5th percentile: 1.532768678665161
10th percentile: 1.84376802444458
20th percentile: 2.465766716003418
30th percentile: 2.9839085578918456
40th percentile: 3.3981935501098635
50th percentile: 3.812478542327881
60th percentile: 3.901339817047119
70th percentile: 3.9902010917663575
80th percentile: 4.432025527954102
90th percentile: 5.226813125610351
95th percentile: 5.624206924438476
99th percentile: 5.942121963500976
mean time: 3.5734492778778075
%s, retrying in %s seconds...
Received healthy response to inference request in 1.5742275714874268s
Received healthy response to inference request in 4.1113409996032715s
Received healthy response to inference request in 3.8084490299224854s
Received healthy response to inference request in 3.987786054611206s
Received healthy response to inference request in 2.665966272354126s
5 requests
0 failed requests
5th percentile: 1.7925753116607666
10th percentile: 2.0109230518341064
20th percentile: 2.447618532180786
30th percentile: 2.894462823867798
40th percentile: 3.3514559268951416
50th percentile: 3.8084490299224854
60th percentile: 3.8801838397979735
70th percentile: 3.9519186496734617
80th percentile: 4.012497043609619
90th percentile: 4.061919021606445
95th percentile: 4.086630010604859
99th percentile: 4.106398801803589
mean time: 3.229553985595703
clean up pipeline due to error=DeploymentChecksError('Unacceptable 70th percentile latency 3.9519186496734617s')
Shutdown handler de-registered
rirv938-llama-8b-multihe_9103_v1 status is now failed due to DeploymentManager action
admin requested tearing down of rirv938-llama-8b-multihe_9103_v1
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLDeleter
Checking if service rirv938-llama-8b-multihe-9103-v1 is running
Tearing down inference service rirv938-llama-8b-multihe-9103-v1
Service rirv938-llama-8b-multihe-9103-v1 has been torndown
Pipeline stage MKMLDeleter completed in 2.33s
run pipeline stage %s
Running pipeline stage MKMLModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key rirv938-llama-8b-multihe-9103-v1/config.json from bucket guanaco-mkml-models
Deleting key rirv938-llama-8b-multihe-9103-v1/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key rirv938-llama-8b-multihe-9103-v1/special_tokens_map.json from bucket guanaco-mkml-models
rirv938-llama-8b-multihe_9103_v1 status is now torndown due to DeploymentManager action
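
Note on the StressChecker figures above: the reported percentiles are consistent with linear interpolation between closest ranks over the five latencies in each batch (the same convention as NumPy's default percentile method). The sketch below recomputes the third batch, whose 70th percentile triggered the DeploymentChecksError. The function names `percentile` and `summarize` are illustrative and are not taken from the pipeline's code. The actual latency threshold does not appear in the log, so no threshold check is shown.

```python
import math


def percentile(samples, q):
    """Return the q-th percentile (0-100) of samples.

    Uses linear interpolation between the two closest ranks, which is an
    assumption about how the pipeline computes its figures; it happens to
    reproduce the values printed in the log.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (q / 100) * (len(ordered) - 1)  # fractional index into ordered
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def summarize(samples, quantiles=(5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)):
    """Build a dict of percentile values plus the mean, mirroring the log layout."""
    summary = {q: percentile(samples, q) for q in quantiles}
    summary["mean"] = sum(samples) / len(samples)
    return summary


# Latencies (seconds) copied from the third StressChecker batch in the log.
third_batch = [
    1.5742275714874268,
    4.1113409996032715,
    3.8084490299224854,
    3.987786054611206,
    2.665966272354126,
]

if __name__ == "__main__":
    for key, value in summarize(third_batch).items():
        label = f"{key}th percentile" if key != "mean" else "mean time"
        print(f"{label}: {value}")
```

With only five samples per batch, every percentile above the 50th is interpolated from the two or three slowest requests, so a single slow response can move the 70th percentile noticeably from one batch to the next.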