Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-llama-8b-pairwis-8189-v44-mkmlizer
Waiting for job on chaiml-llama-8b-pairwis-8189-v44-mkmlizer to finish
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ _____ __ __ ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ /___/ ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ Version: 0.11.12 ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ https://mk1.ai ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ belonging to: ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ Chai Research Corp. ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ║ ║
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: Downloaded to shared memory in 25.990s
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: quantizing model to /dev/shm/model_cache, profile:t0, folder:/tmp/tmp8jyyq8ih, device:0
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: quantized model in 86.478s
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: Processed model ChaiML/llama_8b_pairwise_64m_256_tokens_step_249984 in 112.469s
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: creating bucket guanaco-mkml-models
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-llama-8b-pairwis-8189-v44/config.json
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-llama-8b-pairwis-8189-v44/special_tokens_map.json
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-llama-8b-pairwis-8189-v44/tokenizer_config.json
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-llama-8b-pairwis-8189-v44/tokenizer.json
chaiml-llama-8b-pairwis-8189-v44-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-llama-8b-pairwis-8189-v44/flywheel_model.0.safetensors
chaiml-llama-8b-pairwis-8189-v44-mkmlizer:
Loading 0: 0%| | 0/291 [00:00<?, ?it/s]
Loading 0: 99%|█████████▉| 288/291 [01:14<00:00, 3.29it/s]
Job chaiml-llama-8b-pairwis-8189-v44-mkmlizer completed after 135.07s with status: succeeded
Stopping job with name chaiml-llama-8b-pairwis-8189-v44-mkmlizer
Pipeline stage MKMLizer completed in 135.55s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-llama-8b-pairwis-8189-v44
Waiting for inference service chaiml-llama-8b-pairwis-8189-v44 to be ready
Inference service chaiml-llama-8b-pairwis-8189-v44 ready after 201.00526976585388s
Pipeline stage MKMLDeployer completed in 201.56s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 5.1218461990356445s
Received healthy response to inference request in 4.131861925125122s
Received healthy response to inference request in 2.505263566970825s
Received healthy response to inference request in 3.7385621070861816s
Received healthy response to inference request in 5.281928539276123s
5 requests
0 failed requests
5th percentile: 2.7519232749938967
10th percentile: 2.9985829830169677
20th percentile: 3.49190239906311
30th percentile: 3.81722207069397
40th percentile: 3.974541997909546
50th percentile: 4.131861925125122
60th percentile: 4.527855634689331
70th percentile: 4.92384934425354
80th percentile: 5.15386266708374
90th percentile: 5.217895603179931
95th percentile: 5.249912071228027
99th percentile: 5.275525245666504
mean time: 4.155892467498779
%s, retrying in %s seconds...
Received healthy response to inference request in 3.2896664142608643s
Received healthy response to inference request in 2.775350332260132s
Received healthy response to inference request in 2.9346237182617188s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 3.007922649383545s
Received healthy response to inference request in 4.646139144897461s
5 requests
0 failed requests
5th percentile: 2.807205009460449
10th percentile: 2.8390596866607667
20th percentile: 2.9027690410614015
30th percentile: 2.949283504486084
40th percentile: 2.9786030769348146
50th percentile: 3.007922649383545
60th percentile: 3.1206201553344726
70th percentile: 3.2333176612854
80th percentile: 3.5609609603881838
90th percentile: 4.103550052642822
95th percentile: 4.374844598770141
99th percentile: 4.591880235671997
mean time: 3.330740451812744
Pipeline stage StressChecker completed in 40.12s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.22s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.12s
Shutdown handler de-registered
chaiml-llama-8b-pairwis_8189_v44 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-llama-8b-pairwis-8189-v44-profiler
Waiting for inference service chaiml-llama-8b-pairwis-8189-v44-profiler to be ready
Inference service chaiml-llama-8b-pairwis-8189-v44-profiler ready after 200.45093297958374s
Pipeline stage MKMLProfilerDeployer completed in 200.82s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplon69wt:/code/chaiverse_profiler_1734033063 --namespace tenant-chaiml-guanaco
kubectl exec -it chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplon69wt --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734033063 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734033063/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplon69wt:/code/chaiverse_profiler_1734035861 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplon69wt:/code/chaiverse_profiler_1734035862 --namespace tenant-chaiml-guanaco
kubectl exec -it chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplon69wt --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734035862 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734035862/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-llama-8b-pairwis-8189-v44-profiler is running
Tearing down inference service chaiml-llama-8b-pairwis-8189-v44-profiler
Service chaiml-llama-8b-pairwis-8189-v44-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.16s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-llama-8b-pairwis-8189-v44-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.27s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-llama-8b-pairwis-8189-v44-profiler
Waiting for inference service chaiml-llama-8b-pairwis-8189-v44-profiler to be ready
Inference service chaiml-llama-8b-pairwis-8189-v44-profiler ready after 40.11849403381348s
Pipeline stage MKMLProfilerDeployer completed in 40.45s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplo6fvzd:/code/chaiverse_profiler_1734036529 --namespace tenant-chaiml-guanaco
kubectl exec -it chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplo6fvzd --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734036529 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734036529/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplo6fvzd:/code/chaiverse_profiler_1734038923 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplo6fvzd:/code/chaiverse_profiler_1734038924 --namespace tenant-chaiml-guanaco
clean up pipeline due to error=ISVCScriptError('Command failed with error: Defaulted container "kserve-container" out of: kserve-container, queue-proxy\nerror: unable to upgrade connection: container not found ("kserve-container")\n, output: ')
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-llama-8b-pairwis-8189-v44-profiler is running
Tearing down inference service chaiml-llama-8b-pairwis-8189-v44-profiler
Service chaiml-llama-8b-pairwis-8189-v44-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.19s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-llama-8b-pairwis-8189-v44-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.38s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-llama-8b-pairwis-8189-v44-profiler
Waiting for inference service chaiml-llama-8b-pairwis-8189-v44-profiler to be ready
Inference service chaiml-llama-8b-pairwis-8189-v44-profiler ready after 200.46000266075134s
Pipeline stage MKMLProfilerDeployer completed in 200.80s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplom8m82:/code/chaiverse_profiler_1734039171 --namespace tenant-chaiml-guanaco
kubectl exec -it chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplom8m82 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734039171 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734039171/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplom8m82:/code/chaiverse_profiler_1734041939 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplom8m82:/code/chaiverse_profiler_1734041940 --namespace tenant-chaiml-guanaco
kubectl exec -it chaiml-llama-8b-pair5c0e260050fe1507093fd30dd71ba5d8-deplom8m82 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734041940 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734041940/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-llama-8b-pairwis-8189-v44-profiler is running
Tearing down inference service chaiml-llama-8b-pairwis-8189-v44-profiler
Service chaiml-llama-8b-pairwis-8189-v44-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.25s
Shutdown handler de-registered
chaiml-llama-8b-pairwis_8189_v44 status is now inactive due to auto deactivation removed underperforming models