Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-llama-8b-multihe-3555-v2-mkmlizer
Waiting for job on rirv938-llama-8b-multihe-3555-v2-mkmlizer to finish
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ _____ __ __ ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ /___/ ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ Version: 0.11.33 ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ https://mk1.ai ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ The license key for the current software has been verified as ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ belonging to: ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ Chai Research Corp. ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ║ ║
rirv938-llama-8b-multihe-3555-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
rirv938-llama-8b-multihe-3555-v2-mkmlizer: Downloaded to shared memory in 107.074s
rirv938-llama-8b-multihe-3555-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:t0, folder:/tmp/tmpfu349l92, device:0
rirv938-llama-8b-multihe-3555-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
rirv938-llama-8b-multihe-3555-v2-mkmlizer: quantized model in 90.346s
rirv938-llama-8b-multihe-3555-v2-mkmlizer: Processed model rirv938/llama_8b_multihead_57m_nsfw in 197.420s
rirv938-llama-8b-multihe-3555-v2-mkmlizer: creating bucket guanaco-mkml-models
rirv938-llama-8b-multihe-3555-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rirv938-llama-8b-multihe-3555-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-llama-8b-multihe-3555-v2
rirv938-llama-8b-multihe-3555-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rirv938-llama-8b-multihe-3555-v2/config.json
rirv938-llama-8b-multihe-3555-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rirv938-llama-8b-multihe-3555-v2/special_tokens_map.json
rirv938-llama-8b-multihe-3555-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rirv938-llama-8b-multihe-3555-v2/tokenizer_config.json
rirv938-llama-8b-multihe-3555-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rirv938-llama-8b-multihe-3555-v2/tokenizer.json
rirv938-llama-8b-multihe-3555-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rirv938-llama-8b-multihe-3555-v2/flywheel_model.0.safetensors
rirv938-llama-8b-multihe-3555-v2-mkmlizer:
Loading 0: 0%| | 0/291 [00:00<?, ?it/s]
Loading 0: 99%|█████████▉| 288/291 [01:17<00:01, 2.51it/s]
Job rirv938-llama-8b-multihe-3555-v2-mkmlizer completed after 226.72s with status: succeeded
Stopping job with name rirv938-llama-8b-multihe-3555-v2-mkmlizer
Pipeline stage MKMLizer completed in 227.35s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.17s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-llama-8b-multihe-3555-v2
Waiting for inference service rirv938-llama-8b-multihe-3555-v2 to be ready
Inference service rirv938-llama-8b-multihe-3555-v2 ready after 181.05133438110352s
Pipeline stage MKMLDeployer completed in 181.61s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 5.255169153213501s
Received healthy response to inference request in 2.709099054336548s
Received healthy response to inference request in 2.391728401184082s
Received healthy response to inference request in 3.449296474456787s
Received healthy response to inference request in 3.215284824371338s
5 requests
0 failed requests
5th percentile: 2.4552025318145754
10th percentile: 2.5186766624450683
20th percentile: 2.6456249237060545
30th percentile: 2.810336208343506
40th percentile: 3.0128105163574217
50th percentile: 3.215284824371338
60th percentile: 3.3088894844055177
70th percentile: 3.4024941444396974
80th percentile: 3.81047101020813
90th percentile: 4.532820081710816
95th percentile: 4.893994617462158
99th percentile: 5.182934246063232
mean time: 3.404115581512451
Pipeline stage StressChecker completed in 18.90s
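Note: the StressChecker percentiles above are consistent with linear interpolation between the sorted latencies of the five healthy responses. The sketch below reproduces them under that assumption. The `percentile` helper and the `latencies` list are illustrative names, not part of the pipeline's code, and the pipeline may well compute these figures with a library routine instead.

```python
from statistics import fmean


def percentile(values, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile of an empty sequence")
    # Fractional position of the requested percentile in the sorted data.
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    # Interpolate between the two neighbouring samples.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Response times (seconds) copied from the five healthy responses in the log.
latencies = [
    5.255169153213501,
    2.709099054336548,
    2.391728401184082,
    3.449296474456787,
    3.215284824371338,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {fmean(latencies)}")
```

With only five samples, the upper percentiles are dominated by the single 5.26 s response, so they say little about steady-state tail latency.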
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 5.71s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.40s
Shutdown handler de-registered
rirv938-llama-8b-multihe_3555_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.18s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-multihe-3555-v2-profiler
Waiting for inference service rirv938-llama-8b-multihe-3555-v2-profiler to be ready
Inference service rirv938-llama-8b-multihe-3555-v2-profiler ready after 180.54169273376465s
Pipeline stage MKMLProfilerDeployer completed in 180.93s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deploq29bq:/code/chaiverse_profiler_1731384804 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deploq29bq --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1731384804 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1731384804/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deploq29bq:/code/chaiverse_profiler_1731386889 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deploq29bq --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1731386889 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1731386889/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-3555-v2-profiler is running
Tearing down inference service rirv938-llama-8b-multihe-3555-v2-profiler
Service rirv938-llama-8b-multihe-3555-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 1.72s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-3555-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 1.68s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-multihe-3555-v2-profiler
Waiting for inference service rirv938-llama-8b-multihe-3555-v2-profiler to be ready
Inference service rirv938-llama-8b-multihe-3555-v2-profiler ready after 180.4821901321411s
Pipeline stage MKMLProfilerDeployer completed in 180.93s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplohrf7f:/code/chaiverse_profiler_1731388436 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplohrf7f --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1731388436 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1731388436/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplohrf7f:/code/chaiverse_profiler_1731390539 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplohrf7f:/code/chaiverse_profiler_1731390540 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplohrf7f --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1731390540 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1731390540/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-3555-v2-profiler is running
Tearing down inference service rirv938-llama-8b-multihe-3555-v2-profiler
Service rirv938-llama-8b-multihe-3555-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 1.95s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-3555-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 1.91s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-multihe-3555-v2-profiler
Waiting for inference service rirv938-llama-8b-multihe-3555-v2-profiler to be ready
Inference service rirv938-llama-8b-multihe-3555-v2-profiler ready after 80.2040445804596s
Pipeline stage MKMLProfilerDeployer completed in 80.57s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplo94tv7:/code/chaiverse_profiler_1731391961 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplo94tv7 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1731391961 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1731391961/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplo94tv7:/code/chaiverse_profiler_1731391965 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplo94tv7:/code/chaiverse_profiler_1731391966 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mul1b8ebc0dde3329ccecc61c720a3230cc-deplo94tv7 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1731391966 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1731391966/summary.json'
clean up pipeline due to error=ISVCScriptError('Command failed with error: Defaulted container "kserve-container" out of: kserve-container, queue-proxy\nUnable to use a TTY - input is not a terminal or the right kind of file\n\n 0%| | 0/200 [00:00<?, ?it/s]\n 0%| | 1/200 [00:08<28:45, 8.67s/it]\n 1%| | 2/200 [00:38<1:10:05, 21.24s/it]\n 2%|▏ | 3/200 [01:08<1:22:54, 25.25s/it]\n 2%|▏ | 4/200 [01:38<1:28:39, 27.14s/it]\n 2%|▎ | 5/200 [02:08<1:31:34, 28.18s/it]\n 3%|▎ | 6/200 [02:38<1:33:08, 28.81s/it]\n 4%|▎ | 7/200 [03:08<1:33:56, 29.21s/it]\n 4%|▍ | 8/200 [03:38<1:34:18, 29.47s/it]\n 4%|▍ | 9/200 [04:08<1:34:22, 29.65s/it]\n 5%|▌ | 10/200 [04:38<1:34:15, 29.76s/it]\n 6%|▌ | 11/200 [05:08<1:34:00, 29.84s/it]\n 6%|▌ | 12/200 [05:38<1:33:41, 29.90s/it]\n 6%|▋ | 13/200 [06:09<1:33:18, 29.94s/it]\n 7%|▋ | 14/200 [06:39<1:32:52, 29.96s/it]\n 8%|▊ | 15/200 [07:09<1:32:26, 29.98s/it]\n 8%|▊ | 16/200 [07:39<1:31:58, 29.99s/it]\n 8%|▊ | 17/200 [08:09<1:31:28, 29.99s/it]\n 9%|▉ | 18/200 [08:39<1:31:00, 30.01s/it]\n 10%|▉ | 19/200 [09:09<1:30:32, 30.01s/it]\n 10%|█ | 20/200 [09:39<1:30:01, 30.01s/it]\n 10%|█ | 21/200 [10:09<1:29:31, 30.01s/it]\n 11%|█ | 22/200 [10:39<1:29:03, 30.02s/it]\n 12%|█▏ | 23/200 [11:09<1:28:33, 30.02s/it]\n 12%|█▏ | 24/200 [11:39<1:28:04, 30.02s/it]\n 12%|█▎ | 25/200 [12:09<1:27:34, 30.03s/it]\n 13%|█▎ | 26/200 [12:39<1:27:04, 30.03s/it]\n 14%|█▎ | 27/200 [13:09<1:26:35, 30.03s/it]\n 14%|█▍ | 28/200 [13:39<1:26:04, 30.03s/it]\n 14%|█▍ | 29/200 [14:09<1:25:34, 30.03s/it]\n 15%|█▌ | 30/200 [14:39<1:25:03, 30.02s/it]\n 16%|█▌ | 31/200 [15:09<1:24:34, 30.02s/it]\n 16%|█▌ | 32/200 [15:39<1:24:04, 30.03s/it]\n 16%|█▋ | 33/200 [16:09<1:23:34, 30.03s/it]\n 17%|█▋ | 34/200 [16:39<1:23:04, 30.03s/it]\n 18%|█▊ | 35/200 [17:09<1:22:34, 30.03s/it]\n 18%|█▊ | 36/200 [17:39<1:22:04, 30.03s/it]\n 18%|█▊ | 37/200 [18:09<1:21:34, 30.03s/it]\n 19%|█▉ | 38/200 [18:39<1:21:04, 30.03s/it]\n 20%|█▉ | 39/200 [19:09<1:20:34, 30.03s/it]\n 20%|██ | 40/200 [19:39<1:20:04, 
30.03s/it]\n 20%|██ | 41/200 [20:09<1:19:34, 30.03s/it]\n 21%|██ | 42/200 [20:39<1:19:05, 30.03s/it]\n 22%|██▏ | 43/200 [21:09<1:18:34, 30.03s/it]\n 22%|██▏ | 44/200 [21:39<1:18:03, 30.02s/it]\n 22%|██▎ | 45/200 [22:09<1:17:34, 30.03s/it]\n 23%|██▎ | 46/200 [22:39<1:17:04, 30.03s/it]\n 24%|██▎ | 47/200 [23:09<1:16:34, 30.03s/it]\n 24%|██▍ | 48/200 [23:39<1:16:03, 30.03s/it]\n 24%|██▍ | 49/200 [24:09<1:15:34, 30.03s/it]\n 25%|██▌ | 50/200 [24:39<1:15:04, 30.03s/it]\n 26%|██▌ | 51/200 [25:10<1:14:34, 30.03s/it]\n 26%|██▌ | 52/200 [25:40<1:14:04, 30.03s/it]\n 26%|██▋ | 53/200 [26:10<1:13:34, 30.03s/it]\n 27%|██▋ | 54/200 [26:40<1:13:04, 30.03s/it]\n 28%|██▊ | 55/200 [27:10<1:12:33, 30.02s/it]\n 28%|██▊ | 56/200 [27:40<1:12:03, 30.03s/it]\n 28%|██▊ | 57/200 [28:10<1:11:34, 30.03s/it]\n 29%|██▉ | 58/200 [28:40<1:11:03, 30.03s/it]\n 30%|██▉ | 59/200 [29:10<1:10:33, 30.03s/it]\n 30%|███ | 60/200 [29:40<1:10:02, 30.02s/it]\n 30%|███ | 61/200 [30:10<1:09:33, 30.02s/it]\n 31%|███ | 62/200 [30:40<1:09:03, 30.03s/it]\n 32%|███▏ | 63/200 [31:10<1:08:33, 30.03s/it]\n 32%|███▏ | 64/200 [31:40<1:08:02, 30.02s/it]\n 32%|███▎ | 65/200 [32:10<1:07:33, 30.02s/it]\n 33%|███▎ | 66/200 [32:40<1:07:03, 30.03s/it]\n 34%|███▎ | 67/200 [33:10<1:06:32, 30.02s/it]\n 34%|███▍ | 68/200 [33:40<1:06:03, 30.02s/it]\n 34%|███▍ | 69/200 [34:10<1:05:32, 30.02s/it]\n 35%|███▌ | 70/200 [34:40<1:05:03, 30.02s/it]\n 36%|███▌ | 71/200 [34:41<45:36, 21.22s/it] command terminated with exit code 137\n, output: waiting for startup of TargetModel(endpoint=\'localhost\', route=\'GPT-J-6B-lit-v2\', namespace=\'tenant-chaiml-guanaco\', max_characters=9999, reward=False, url_format=\'{endpoint}-predictor-default.{namespace}.knative.ord1.coreweave.cloud\')\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. 
(read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. (read timeout=30)\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Read timed out. 
(read timeout=30)\nRequest failed with: (\'Connection aborted.\', RemoteDisconnected(\'Remote end closed connection without response\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db7730>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db4520>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db4a30>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db5f90>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00640>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00880>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: 
HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db4a60>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db7610>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db49a0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db7730>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d013c0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d012a0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by 
NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db71c0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db49a0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db4d30>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db7c10>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00dc0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d01180>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db73d0>: Failed to establish a new connection: [Errno 111] Connection 
refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db7d90>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db53c0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d01750>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d011e0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00550>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db6470>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by 
NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db4940>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d01210>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d01e10>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00d00>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db5360>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db6380>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02a40>: Failed to establish a new connection: [Errno 111] Connection 
refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d020b0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02710>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00eb0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db49a0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db73d0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00eb0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by 
NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00e50>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02bf0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02b30>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db6c50>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02170>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00f10>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d03190>: Failed to establish a new connection: [Errno 111] Connection 
refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d030d0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db71f0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d03940>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00af0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d03700>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d01120>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by 
NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00fa0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db49a0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d03e20>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00cd0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d031c0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02680>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02260>: Failed to establish a new connection: [Errno 111] Connection 
refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d201c0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02da0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d004f0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d02890>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d00a90>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6db4970>: Failed to establish a new connection: [Errno 111] Connection refused\'))\nRequest failed with: HTTPConnectionPool(host=\'localhost\', port=8080): Max retries exceeded with url: /v1/models/GPT-J-6B-lit-v2:predict (Caused by 
NewConnectionError(\'<urllib3.connection.HTTPConnection object at 0x7f50c6d01ed0>: Failed to establish a new connection: [Errno 111] Connection refused\'))\n')
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-3555-v2-profiler is running
Tearing down inference service rirv938-llama-8b-multihe-3555-v2-profiler
Service rirv938-llama-8b-multihe-3555-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.02s
Shutdown handler de-registered
rirv938-llama-8b-multihe_3555_v2 status is now inactive due to auto deactivation removed underperforming models
rirv938-llama-8b-multihe_3555_v2 status is now torndown due to DeploymentManager action