Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-llama-8b-pairwis-9349-v2-mkmlizer
Waiting for job on rirv938-llama-8b-pairwis-9349-v2-mkmlizer to finish
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ _____ __ __ ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ /___/ ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ Version: 0.11.12 ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ https://mk1.ai ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ The license key for the current software has been verified as ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ belonging to: ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ Chai Research Corp. ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ║ ║
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: Downloaded to shared memory in 22.965s
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:t0, folder:/tmp/tmpckn532ed, device:0
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: quantized model in 86.651s
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: Processed model rirv938/llama_8b_pairwise_64m_256_tokens_step_3906 in 109.617s
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: creating bucket guanaco-mkml-models
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-llama-8b-pairwis-9349-v2
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rirv938-llama-8b-pairwis-9349-v2/config.json
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rirv938-llama-8b-pairwis-9349-v2/special_tokens_map.json
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rirv938-llama-8b-pairwis-9349-v2/tokenizer_config.json
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rirv938-llama-8b-pairwis-9349-v2/tokenizer.json
rirv938-llama-8b-pairwis-9349-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rirv938-llama-8b-pairwis-9349-v2/flywheel_model.0.safetensors
rirv938-llama-8b-pairwis-9349-v2-mkmlizer:
Loading 0: 0%| | 0/291 [00:00<?, ?it/s]
Loading 0: 99%|█████████▉| 288/291 [01:14<00:00, 3.30it/s]
Job rirv938-llama-8b-pairwis-9349-v2-mkmlizer completed after 135.65s with status: succeeded
Stopping job with name rirv938-llama-8b-pairwis-9349-v2-mkmlizer
Pipeline stage MKMLizer completed in 136.21s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.20s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-llama-8b-pairwis-9349-v2
Waiting for inference service rirv938-llama-8b-pairwis-9349-v2 to be ready
Inference service rirv938-llama-8b-pairwis-9349-v2 ready after 190.65594172477722s
Pipeline stage MKMLDeployer completed in 191.68s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.2874226570129395s
Received healthy response to inference request in 2.6294636726379395s
Received healthy response to inference request in 3.0656402111053467s
Received healthy response to inference request in 5.165889501571655s
Received healthy response to inference request in 3.522270917892456s
5 requests
0 failed requests
5th percentile: 2.716698980331421
10th percentile: 2.8039342880249025
20th percentile: 2.978404903411865
30th percentile: 3.1569663524627685
40th percentile: 3.3396186351776125
50th percentile: 3.522270917892456
60th percentile: 3.8283316135406493
70th percentile: 4.134392309188843
80th percentile: 4.463116025924683
90th percentile: 4.814502763748169
95th percentile: 4.990196132659912
99th percentile: 5.130750827789306
mean time: 3.7341373920440675
%s, retrying in %s seconds...
Received healthy response to inference request in 5.598313331604004s
Received healthy response to inference request in 3.58318829536438s
Received healthy response to inference request in 5.137587308883667s
Received healthy response to inference request in 3.5208654403686523s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 3.480902671813965s
5 requests
0 failed requests
5th percentile: 3.488895225524902
10th percentile: 3.49688777923584
20th percentile: 3.512872886657715
30th percentile: 3.5333300113677977
40th percentile: 3.558259153366089
50th percentile: 3.58318829536438
60th percentile: 4.204947900772095
70th percentile: 4.826707506179809
80th percentile: 5.229732513427734
90th percentile: 5.414022922515869
95th percentile: 5.5061681270599365
99th percentile: 5.57988429069519
mean time: 4.264171409606933
%s, retrying in %s seconds...
Received healthy response to inference request in 2.7479796409606934s
Received healthy response to inference request in 3.3847320079803467s
Received healthy response to inference request in 3.5923688411712646s
Received healthy response to inference request in 2.536978244781494s
Received healthy response to inference request in 3.3974485397338867s
5 requests
0 failed requests
5th percentile: 2.579178524017334
10th percentile: 2.6213788032531737
20th percentile: 2.7057793617248533
30th percentile: 2.875330114364624
40th percentile: 3.1300310611724855
50th percentile: 3.3847320079803467
60th percentile: 3.389818620681763
70th percentile: 3.3949052333831786
80th percentile: 3.4364326000213623
90th percentile: 3.5144007205963135
95th percentile: 3.553384780883789
99th percentile: 3.5845720291137697
mean time: 3.1319014549255373
Pipeline stage StressChecker completed in 59.82s
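The percentile and mean figures in each StressChecker batch are consistent with linear interpolation between sorted latencies, which is also the default rule of `numpy.percentile`. Whether StressChecker actually computes them that way is an assumption; the sketch below only reproduces the first batch's numbers from the five logged latencies. The helper names `percentile` and `summarize` are hypothetical.

```python
import math

def percentile(values, q):
    """Percentile q (0-100) using linear interpolation between order statistics."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lo, hi = math.floor(rank), math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

def summarize(values, points=(5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)):
    """Return {percentile: value} plus the mean, mirroring the layout of the log."""
    summary = {q: percentile(values, q) for q in points}
    summary["mean"] = sum(values) / len(values)
    return summary

# Latencies (seconds) copied from the first batch of healthy responses.
latencies = [
    4.2874226570129395,
    2.6294636726379395,
    3.0656402111053467,
    5.165889501571655,
    3.522270917892456,
]

for key, value in summarize(latencies).items():
    label = "mean time" if key == "mean" else f"{key}th percentile"
    print(f"{label}: {value}")
```

With only five samples per batch, the upper percentiles are interpolated almost entirely between the two slowest requests, so they say little about tail latency.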
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.17s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.03s
Shutdown handler de-registered
rirv938-llama-8b-pairwis_9349_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-pairwis-9349-v2-profiler
Waiting for inference service rirv938-llama-8b-pairwis-9349-v2-profiler to be ready
Inference service rirv938-llama-8b-pairwis-9349-v2-profiler ready after 200.4850311279297s
Pipeline stage MKMLProfilerDeployer completed in 200.96s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplos8cw4:/code/chaiverse_profiler_1734045213 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplos8cw4 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734045213 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734045213/summary.json'
kubectl exec -it rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplos8cw4 --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1734045213/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplos8cw4:/code/chaiverse_profiler_1734045777 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplos8cw4 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734045777 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734045777/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplos8cw4:/code/chaiverse_profiler_1734046088 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplos8cw4 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734046088 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734046088/summary.json'
clean up pipeline due to error=ISVCScriptError('Command failed with error: Defaulted container "kserve-container" out of: kserve-container, queue-proxy\nUnable to use a TTY - input is not a terminal or the right kind of file\nTraceback (most recent call last):\n File "/code/chaiverse_profiler_1734046088/profiles.py", line 574, in <module>\n cli()\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__\n return self.main(*args, **kwargs)\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main\n rv = self.invoke(ctx)\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke\n return _process_result(sub_ctx.command.invoke(sub_ctx))\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke\n return ctx.invoke(self.callback, **ctx.params)\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke\n return __callback(*args, **kwargs)\n File "/code/chaiverse_profiler_1734046088/profiles.py", line 90, in profile_batches\n client.wait_for_server_startup(target, max_wait=300)\n File "/code/inference_analysis/client.py", line 136, in wait_for_server_startup\n raise RuntimeError(msg)\nRuntimeError: Timed out after 300s waiting for startup\ncommand terminated with exit code 1\n, output: waiting for startup of TargetModel(endpoint=\'localhost\', route=\'GPT-J-6B-lit-v2\', namespace=\'tenant-chaiml-guanaco\', max_characters=9999, reward=False, url_format=\'{endpoint}-predictor-default.{namespace}.knative.ord1.coreweave.cloud\')\n')
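Two lines in the error above are warnings rather than the cause: "Defaulted container" appears because no container was named, and "Unable to use a TTY" appears because `-it` was passed from a non-interactive process. The actual failure is the `RuntimeError` raised after 300s waiting for server startup. As a minimal sketch, assuming the caller builds the command itself, the two warnings could be avoided by dropping `-it` and passing `--container`. The helper `build_exec_argv` is hypothetical and is not part of the pipeline.

```python
import shlex

def build_exec_argv(pod, namespace, script, container="kserve-container"):
    """Build a non-interactive `kubectl exec` argv that names the container explicitly."""
    return [
        "kubectl", "exec", pod,
        "--namespace", namespace,
        "--container", container,   # avoids the "Defaulted container" notice
        "--", "sh", "-c", script,   # no -i/-t, so no TTY is requested
    ]

# Pod name, namespace, and path are copied from the log lines above.
argv = build_exec_argv(
    pod="rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplos8cw4",
    namespace="tenant-chaiml-guanaco",
    script="cat /code/chaiverse_profiler_1734045213/summary.json",
)
print(shlex.join(argv))
```

This only changes how the command is issued; it does not address the startup timeout itself.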
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-pairwis-9349-v2-profiler is running
Tearing down inference service rirv938-llama-8b-pairwis-9349-v2-profiler
Service rirv938-llama-8b-pairwis-9349-v2-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 2.21s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-pairwis-9349-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.25s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-pairwis-9349-v2-profiler
Waiting for inference service rirv938-llama-8b-pairwis-9349-v2-profiler to be ready
Inference service rirv938-llama-8b-pairwis-9349-v2-profiler ready after 200.4460949897766s
Pipeline stage MKMLProfilerDeployer completed in 200.79s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplonm2nr:/code/chaiverse_profiler_1734046633 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplonm2nr --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734046633 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734046633/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplonm2nr:/code/chaiverse_profiler_1734049378 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplonm2nr:/code/chaiverse_profiler_1734049379 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplonm2nr --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734049379 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734049379/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-pairwis-9349-v2-profiler is running
Tearing down inference service rirv938-llama-8b-pairwis-9349-v2-profiler
Service rirv938-llama-8b-pairwis-9349-v2-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 2.38s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-pairwis-9349-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.21s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-pairwis-9349-v2-profiler
Waiting for inference service rirv938-llama-8b-pairwis-9349-v2-profiler to be ready
Inference service rirv938-llama-8b-pairwis-9349-v2-profiler ready after 70.17449617385864s
Pipeline stage MKMLProfilerDeployer completed in 70.52s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplor2khm:/code/chaiverse_profiler_1734050123 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplor2khm --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734050123 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734050123/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplor2khm:/code/chaiverse_profiler_1734052921 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-pai172da686ad7590fd1e4e6500710d0a3d-deplor2khm --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1734052921 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1734052921/summary.json'
Received signal 2, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-pairwis-9349-v2-profiler is running
Tearing down inference service rirv938-llama-8b-pairwis-9349-v2-profiler
Service rirv938-llama-8b-pairwis-9349-v2-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 2.19s
Shutdown handler de-registered
rirv938-llama-8b-pairwis_9349_v2 status is now inactive because auto deactivation removed underperforming models