Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-llama-8b-multihe-6884-v2-mkmlizer
Waiting for job on rirv938-llama-8b-multihe-6884-v2-mkmlizer to finish
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ _____ __ __ ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ /___/ ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ Version: 0.11.12 ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ https://mk1.ai ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ The license key for the current software has been verified as ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ belonging to: ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ Chai Research Corp. ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ║ ║
rirv938-llama-8b-multihe-6884-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
rirv938-llama-8b-multihe-6884-v2-mkmlizer: Downloaded to shared memory in 63.812s
rirv938-llama-8b-multihe-6884-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:t0, folder:/tmp/tmpf_fahgdd, device:0
rirv938-llama-8b-multihe-6884-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Connection pool is full, discarding connection: %s. Connection pool size: %s
Failed to get response for submission blend_fomak_2024-11-12: ('http://chaiml-nemo-20241017-tie-8098-v9-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:43150->127.0.0.1:8080: read: connection reset by peer\n')
Connection pool is full, discarding connection: %s. Connection pool size: %s
rirv938-llama-8b-multihe-6884-v2-mkmlizer: quantized model in 88.207s
rirv938-llama-8b-multihe-6884-v2-mkmlizer: Processed model rirv938/llama_8b_multihead_204m_retry in 152.019s
rirv938-llama-8b-multihe-6884-v2-mkmlizer: creating bucket guanaco-mkml-models
rirv938-llama-8b-multihe-6884-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rirv938-llama-8b-multihe-6884-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-llama-8b-multihe-6884-v2
rirv938-llama-8b-multihe-6884-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rirv938-llama-8b-multihe-6884-v2/tokenizer.json
rirv938-llama-8b-multihe-6884-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rirv938-llama-8b-multihe-6884-v2/flywheel_model.0.safetensors
rirv938-llama-8b-multihe-6884-v2-mkmlizer:
Loading 0: 99%|█████████▉| 288/291 [01:15<00:01, 2.56it/s]
Job rirv938-llama-8b-multihe-6884-v2-mkmlizer completed after 175.18s with status: succeeded
Stopping job with name rirv938-llama-8b-multihe-6884-v2-mkmlizer
Pipeline stage MKMLizer completed in 175.70s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-llama-8b-multihe-6884-v2
Waiting for inference service rirv938-llama-8b-multihe-6884-v2 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service rirv938-llama-8b-multihe-6884-v2 ready after 140.7455267906189s
Pipeline stage MKMLDeployer completed in 141.31s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.762026786804199s
Received healthy response to inference request in 3.5326805114746094s
Received healthy response to inference request in 3.3266072273254395s
Received healthy response to inference request in 4.798598051071167s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 15.6293306350708s
5 requests
0 failed requests
5th percentile: 3.3678218841552736
10th percentile: 3.4090365409851073
20th percentile: 3.491465854644775
30th percentile: 3.5785497665405273
40th percentile: 3.6702882766723635
50th percentile: 3.762026786804199
60th percentile: 4.176655292510986
70th percentile: 4.591283798217773
80th percentile: 6.9647445678710955
90th percentile: 11.297037601470947
95th percentile: 13.463184118270872
99th percentile: 15.196101331710816
mean time: 6.209848642349243
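Note on the summary above: the log does not say how the percentiles are computed, but the printed values are consistent with linear interpolation between sorted samples (the default behaviour of numpy.percentile). The following is a minimal sketch under that assumption, using only the five latencies logged above; the function name `percentile` and the variable `latencies` are illustrative and are not taken from the pipeline's code.

```python
# Latencies in seconds, copied from the five "Received healthy response" lines above.
latencies = [
    3.762026786804199,
    3.5326805114746094,
    3.3266072273254395,
    4.798598051071167,
    15.6293306350708,
]


def percentile(samples, q):
    """Linearly interpolated percentile (q in [0, 100]) of a non-empty list.

    Assumption: this mirrors the interpolation the StressChecker appears to use;
    the log itself does not confirm the method.
    """
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only five samples, the single 15.6 s response dominates every percentile above the 75th and pulls the mean well above the median, which is why the second run (no outlier) reports a much lower mean.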
%s, retrying in %s seconds...
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 3.517648935317993s
Received healthy response to inference request in 2.9926395416259766s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 3.524838447570801s
Received healthy response to inference request in 3.6662485599517822s
Received healthy response to inference request in 2.5049896240234375s
5 requests
0 failed requests
5th percentile: 2.6025196075439454
10th percentile: 2.7000495910644533
20th percentile: 2.8951095581054687
30th percentile: 3.09764142036438
40th percentile: 3.3076451778411866
50th percentile: 3.517648935317993
60th percentile: 3.520524740219116
70th percentile: 3.5234005451202393
80th percentile: 3.553120470046997
90th percentile: 3.6096845149993895
95th percentile: 3.637966537475586
99th percentile: 3.660592155456543
mean time: 3.241273021697998
Pipeline stage StressChecker completed in 50.20s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.46s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.28s
Shutdown handler de-registered
rirv938-llama-8b-multihe_6884_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-multihe-6884-v2-profiler
Waiting for inference service rirv938-llama-8b-multihe-6884-v2-profiler to be ready
Inference service rirv938-llama-8b-multihe-6884-v2-profiler ready after 140.45786356925964s
Pipeline stage MKMLProfilerDeployer completed in 140.89s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplot7qvn:/code/chaiverse_profiler_1732826129 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplot7qvn --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1732826129 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1732826129/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplot7qvn:/code/chaiverse_profiler_1732828907 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplot7qvn:/code/chaiverse_profiler_1732828907 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplot7qvn --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1732828907 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1732828907/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-6884-v2-profiler is running
Tearing down inference service rirv938-llama-8b-multihe-6884-v2-profiler
Service rirv938-llama-8b-multihe-6884-v2-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 1.51s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-6884-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 1.46s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-multihe-6884-v2-profiler
Waiting for inference service rirv938-llama-8b-multihe-6884-v2-profiler to be ready
Inference service rirv938-llama-8b-multihe-6884-v2-profiler ready after 60.14953899383545s
Pipeline stage MKMLProfilerDeployer completed in 60.48s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplo6sk85:/code/chaiverse_profiler_1732829681 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplo6sk85 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1732829681 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1732829681/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplo6sk85:/code/chaiverse_profiler_1732832458 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplo6sk85:/code/chaiverse_profiler_1732832459 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplo6sk85 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1732832459 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1732832459/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-6884-v2-profiler is running
Tearing down inference service rirv938-llama-8b-multihe-6884-v2-profiler
Service rirv938-llama-8b-multihe-6884-v2-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 1.64s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-6884-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 1.45s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-multihe-6884-v2-profiler
Waiting for inference service rirv938-llama-8b-multihe-6884-v2-profiler to be ready
Inference service rirv938-llama-8b-multihe-6884-v2-profiler ready after 30.088762283325195s
Pipeline stage MKMLProfilerDeployer completed in 30.46s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplotkhkn:/code/chaiverse_profiler_1732833276 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplotkhkn --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1732833276 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1732833276/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplotkhkn:/code/chaiverse_profiler_1732836077 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-mulf10c6170823883b98f59daad7d90865c-deplotkhkn --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1732836077 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1732836077/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-multihe-6884-v2-profiler is running
Tearing down inference service rirv938-llama-8b-multihe-6884-v2-profiler
Service rirv938-llama-8b-multihe-6884-v2-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 1.55s
Shutdown handler de-registered
rirv938-llama-8b-multihe_6884_v2 status is now inactive due to auto deactivation (removed underperforming models)