developer_uid: ChaiHarshitSheoran
submission_id: chaiml-mn12b-syn1-tune14_v1
model_name: chaiml-mn12b-syn1-tune14_v1
model_group: ChaiML/mn12b_syn1_tune14
status: torndown
timestamp: 2025-09-05T08:42:21+00:00
num_battles: 5937
num_wins: 3102
celo_rating: 1265.38
family_friendly_score: 0.5127999999999999
family_friendly_standard_error: 0.007068750384615374
submission_type: basic
model_repo: ChaiML/mn12b_syn1_tune14
model_architecture: MistralForCausalLM
model_num_parameters: 12772070400.0
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
reward_model: default
display_name: chaiml-mn12b-syn1-tune14_v1
is_internal_developer: False
language_model: ChaiML/mn12b_syn1_tune14
model_size: 13B
ranking_group: single
us_pacific_date: 2025-09-05
win_ratio: 0.5224861040929762
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': True}
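The `formatter` entry above is easier to read when its templates are applied to a small conversation. The following is a minimal sketch, not the platform's implementation: the assembly order (memory, prompt, turns, response prefix) is an assumption inferred from the template names, the function `build_prompt` and the example names are hypothetical, and truncation (`truncate_by_message`, `max_input_tokens`) is not modelled.

```python
# Templates copied verbatim from the `formatter` entry in the metadata above.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Assemble one model input string (hypothetical helper).

    `turns` is a list of (speaker, message) pairs where speaker is
    "bot" or "user". The ordering of the pieces is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template leaves the bot's line open for the model to complete.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt(
        "Ava", "Sam", "A friendly guide.", "Ava greets Sam.",
        [("user", "Hi"), ("bot", "Hello!"), ("user", "How are you?")],
    ))
```

Because `stopping_words` is `['\n']`, generation would end at the first newline, which fits a template in which every turn occupies a single line.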
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-mn12b-syn1-tune14-v1-mkmlizer
Waiting for job on chaiml-mn12b-syn1-tune14-v1-mkmlizer to finish
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ██████ ██████ █████ ████ ████ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ░░██████ ██████ ░░███ ███░ ░░███ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ░███░█████░███ ░███ ███ ░███ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ░███░░███ ░███ ░███████ ░███ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ░███ ░░░ ░███ ░███░░███ ░███ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ░███ ░███ ░███ ░░███ ░███ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ █████ █████ █████ ░░████ █████ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ░░░░░ ░░░░░ ░░░░░ ░░░░ ░░░░░ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ Version: 0.30.2 ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ Features: FLYWHEEL, CUDA ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ Copyright 2023-2025 MK ONE TECHNOLOGIES Inc. ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ https://mk1.ai ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ belonging to: ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ Chai Research Corp. ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ Expiration: 2028-03-31 23:59:59 ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ║ ║
chaiml-mn12b-syn1-tune14-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-mn12b-syn1-tune14-v1-mkmlizer: Downloaded to shared memory in 44.235s
chaiml-mn12b-syn1-tune14-v1-mkmlizer: Checking if ChaiML/mn12b_syn1_tune14 already exists in ChaiML
chaiml-mn12b-syn1-tune14-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:q4, folder:/tmp/tmprxw44ufp, device:0
chaiml-mn12b-syn1-tune14-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-mn12b-syn1-tune14-v1-mkmlizer: quantized model in 147.644s
chaiml-mn12b-syn1-tune14-v1-mkmlizer: Processed model ChaiML/mn12b_syn1_tune14 in 191.879s
chaiml-mn12b-syn1-tune14-v1-mkmlizer: creating bucket guanaco-mkml-models
chaiml-mn12b-syn1-tune14-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-mn12b-syn1-tune14-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-mn12b-syn1-tune14-v1/nvidia
chaiml-mn12b-syn1-tune14-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-mn12b-syn1-tune14-v1/nvidia/tokenizer.json
HTTP Request: %s %s "%s %d %s"
chaiml-mn12b-syn1-tune14-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-mn12b-syn1-tune14-v1/nvidia/flywheel_model.0.safetensors
chaiml-mn12b-syn1-tune14-v1-mkmlizer: Loading 0: 100%|█████████▉| 362/363 [02:16<00:00, 3.63it/s]
Job chaiml-mn12b-syn1-tune14-v1-mkmlizer completed after 216.87s with status: succeeded
Stopping job with name chaiml-mn12b-syn1-tune14-v1-mkmlizer
Pipeline stage MKMLizer completed in 217.40s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.28s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-mn12b-syn1-tune14-v1
Waiting for inference service chaiml-mn12b-syn1-tune14-v1 to be ready
Inference service chaiml-mn12b-syn1-tune14-v1 ready after 70.13229560852051s
Pipeline stage MKMLDeployer completed in 70.59s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.5392510890960693s
Received healthy response to inference request in 1.1230099201202393s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 1.3509180545806885s
Received healthy response to inference request in 1.4745726585388184s
Received healthy response to inference request in 1.042144536972046s
5 requests
0 failed requests
5th percentile: 1.0583176136016845
10th percentile: 1.0744906902313232
20th percentile: 1.1068368434906006
30th percentile: 1.1685915470123291
40th percentile: 1.2597548007965087
50th percentile: 1.3509180545806885
60th percentile: 1.4003798961639404
70th percentile: 1.4498417377471924
80th percentile: 1.4875083446502686
90th percentile: 1.5133797168731689
95th percentile: 1.526315402984619
99th percentile: 1.5366639518737792
mean time: 1.3059792518615723
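The percentile and mean figures above can be reproduced from the five logged latencies. The sketch below assumes linear interpolation between order statistics (the default behaviour of `numpy.percentile`); the log does not say which method the StressChecker uses, but this one matches the reported values. The helper name `percentile` is illustrative.

```python
# The five "healthy response" latencies logged by the StressChecker, in seconds.
latencies = [
    1.5392510890960693,
    1.1230099201202393,
    1.3509180545806885,
    1.4745726585388184,
    1.042144536972046,
]


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between neighbours."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only five samples, every percentile is an interpolation between two adjacent measurements, so the upper percentiles say little about tail latency under real load.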
Pipeline stage StressChecker completed in 8.21s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.70s
Shutdown handler de-registered
chaiml-mn12b-syn1-tune14_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.10s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
clean up pipeline due to error=DeploymentError('Timeout to start the InferenceService chaiml-mn12b-syn1-tune14-v1-profiler. The InferenceService is as following: {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'kind\': \'InferenceService\', \'metadata\': {\'annotations\': {\'autoscaling.knative.dev/class\': \'hpa.autoscaling.knative.dev\', \'autoscaling.knative.dev/container-concurrency-target-percentage\': \'70\', \'autoscaling.knative.dev/initial-scale\': \'1\', \'autoscaling.knative.dev/max-scale-down-rate\': \'1.1\', \'autoscaling.knative.dev/max-scale-up-rate\': \'2\', \'autoscaling.knative.dev/metric\': \'mean_pod_latency_ms_v2\', \'autoscaling.knative.dev/panic-threshold-percentage\': \'650\', \'autoscaling.knative.dev/panic-window-percentage\': \'35\', \'autoscaling.knative.dev/scale-down-delay\': \'30s\', \'autoscaling.knative.dev/scale-to-zero-grace-period\': \'10m\', \'autoscaling.knative.dev/stable-window\': \'180s\', \'autoscaling.knative.dev/target\': \'4000\', \'autoscaling.knative.dev/target-burst-capacity\': \'-1\', \'autoscaling.knative.dev/tick-interval\': \'15s\', \'features.knative.dev/http-full-duplex\': \'Enabled\', \'networking.knative.dev/ingress-class\': \'istio.ingress.networking.knative.dev\'}, \'creationTimestamp\': \'2025-09-05T09:09:05Z\', \'finalizers\': [\'inferenceservice.finalizers\'], \'generation\': 1, \'labels\': {\'istio.io/rev\': \'prod-canary\', \'knative.coreweave.cloud/ingress\': \'istio.ingress.networking.knative.dev\', \'prometheus.k.chaiverse.com\': \'true\', \'qos.coreweave.cloud/latency\': \'low\'}, \'managedFields\': [{\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:annotations\': {\'.\': {}, \'f:autoscaling.knative.dev/class\': {}, \'f:autoscaling.knative.dev/container-concurrency-target-percentage\': {}, \'f:autoscaling.knative.dev/initial-scale\': {}, \'f:autoscaling.knative.dev/max-scale-down-rate\': {}, 
\'f:autoscaling.knative.dev/max-scale-up-rate\': {}, \'f:autoscaling.knative.dev/metric\': {}, \'f:autoscaling.knative.dev/panic-threshold-percentage\': {}, \'f:autoscaling.knative.dev/panic-window-percentage\': {}, \'f:autoscaling.knative.dev/scale-down-delay\': {}, \'f:autoscaling.knative.dev/scale-to-zero-grace-period\': {}, \'f:autoscaling.knative.dev/stable-window\': {}, \'f:autoscaling.knative.dev/target\': {}, \'f:autoscaling.knative.dev/target-burst-capacity\': {}, \'f:autoscaling.knative.dev/tick-interval\': {}, \'f:features.knative.dev/http-full-duplex\': {}, \'f:networking.knative.dev/ingress-class\': {}}, \'f:labels\': {\'.\': {}, \'f:istio.io/rev\': {}, \'f:knative.coreweave.cloud/ingress\': {}, \'f:prometheus.k.chaiverse.com\': {}, \'f:qos.coreweave.cloud/latency\': {}}}, \'f:spec\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:affinity\': {\'.\': {}, \'f:nodeAffinity\': {\'.\': {}, \'f:tion\': {}, \'f:requiredDuringSchedulingIgnoredDuringExecution\': {}}}, \'f:containerConcurrency\': {}, \'f:containers\': {}, \'f:imagePullSecrets\': {}, \'f:maxReplicas\': {}, \'f:minReplicas\': {}, \'f:priorityClassName\': {}, \'f:timeout\': {}, \'f:volumes\': {}}}}, \'manager\': \'OpenAPI-Generator\', \'operation\': \'Update\', \'time\': \'2025-09-05T09:09:05Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:finalizers\': {\'.\': {}, \'v:"inferenceservice.finalizers"\': {}}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'time\': \'2025-09-05T09:09:05Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:status\': {\'.\': {}, \'f:components\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:latestCreatedRevision\': {}}}, \'f:conditions\': {}, \'f:modelStatus\': {\'.\': {}, \'f:states\': {\'.\': {}, \'f:activeModelState\': {}, \'f:targetModelState\': {}}, \'f:transitionStatus\': {}}, \'f:observedGeneration\': {}}}, \'manager\': \'manager\', 
\'operation\': \'Update\', \'subresource\': \'status\', \'time\': \'2025-09-05T09:09:05Z\'}], \'name\': \'chaiml-mn12b-syn1-tune14-v1-profiler\', \'namespace\': \'tenant-chaiml-guanaco\', \'resourceVersion\': \'718761123\', \'uid\': \'87c5b075-a54b-4557-a0b3-b3211fe49d10\'}, \'spec\': {\'predictor\': {\'affinity\': {\'nodeAffinity\': {\'tion\': [{\'preference\': {\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'RTX_A5000\']}]}, \'weight\': 5}], \'requiredDuringSchedulingIgnoredDuringExecution\': {\'nodeSelectorTerms\': [{\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'RTX_A5000\', \'L40S\']}]}]}}}, \'containerConcurrency\': 0, \'containers\': [{\'env\': [{\'name\': \'MAX_TOKEN_INPUT\', \'value\': \'1024\'}, {\'name\': \'BEST_OF\', \'value\': \'8\'}, {\'name\': \'TEMPERATURE\', \'value\': \'1.0\'}, {\'name\': \'PRESENCE_PENALTY\', \'value\': \'0.0\'}, {\'name\': \'FREQUENCY_PENALTY\', \'value\': \'0.0\'}, {\'name\': \'TOP_P\', \'value\': \'1.0\'}, {\'name\': \'MIN_P\', \'value\': \'0.0\'}, {\'name\': \'TOP_K\', \'value\': \'40\'}, {\'name\': \'STOPPING_WORDS\', \'value\': \'["\\\\\\\\n"]\'}, {\'name\': \'MAX_TOKENS\', \'value\': \'64\'}, {\'name\': \'MAX_BATCH_SIZE\', \'value\': \'128\'}, {\'name\': \'MAX_CACHED_RESPONSES\', \'value\': \'-1\'}, {\'name\': \'URL_ROUTE\', \'value\': \'GPT-J-6B-lit-v2\'}, {\'name\': \'OBJ_ACCESS_KEY_ID\', \'value\': \'LETMTTRMLFFAMTBK\'}, {\'name\': \'OBJ_SECRET_ACCESS_KEY\', \'value\': \'VwwZaqefOOoaouNxUk03oUmK9pVEfruJhjBHPGdgycK\'}, {\'name\': \'OBJ_ENDPOINT\', \'value\': \'https://accel-object.ord1.coreweave.com\'}, {\'name\': \'TENSORIZER_URI\', \'value\': \'s3://guanaco-mkml-models/chaiml-mn12b-syn1-tune14-v1/nvidia\'}, {\'name\': \'RESERVE_MEMORY\', \'value\': \'2048\'}, {\'name\': \'DOWNLOAD_TO_LOCAL\', \'value\': \'/dev/shm/model_cache\'}, {\'name\': \'NUM_GPUS\', \'value\': \'1\'}, {\'name\': \'MK1_QUANTIZATION_PROFILE\', \'value\': 
\'q4\'}, {\'name\': \'MK1_MKML_LICENSE_KEY\', \'valueFrom\': {\'secretKeyRef\': {\'key\': \'key\', \'name\': \'mkml-license-key\'}}}], \'image\': \'gcr.io/chai-959f8/chai-guanaco/mkml:mkml_v0.30.2\', \'imagePullPolicy\': \'IfNotPresent\', \'name\': \'kserve-container\', \'readinessProbe\': {\'exec\': {\'command\': [\'cat\', \'/tmp/ready\']}, \'failureThreshold\': 1, \'initialDelaySeconds\': 10, \'periodSeconds\': 10, \'successThreshold\': 1, \'timeoutSeconds\': 5}, \'resources\': {\'limits\': {\'cpu\': \'2\', \'memory\': \'14Gi\', \'nvidia.com/gpu\': \'1\'}, \'requests\': {\'cpu\': \'2\', \'memory\': \'14Gi\', \'nvidia.com/gpu\': \'1\'}}, \'volumeMounts\': [{\'mountPath\': \'/dev/shm\', \'name\': \'shared-memory-cache\'}]}], \'imagePullSecrets\': [{\'name\': \'docker-creds\'}], \'maxReplicas\': 1, \'minReplicas\': 1, \'priorityClassName\': \'chaiverse\', \'timeout\': 60, \'volumes\': [{\'emptyDir\': {\'medium\': \'Memory\'}, \'name\': \'shared-memory-cache\'}]}}, \'status\': {\'components\': {\'predictor\': {\'latestCreatedRevision\': \'chaiml-mn12b-syn1-tune14-v1-profiler-predictor-00001\'}}, \'conditions\': [{\'lastTransitionTime\': \'2025-09-05T09:09:05Z\', \'reason\': \'PredictorConfigurationReady not ready\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'LatestDeploymentReady\'}, {\'lastTransitionTime\': \'2025-09-05T09:09:05Z\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'PredictorConfigurationReady\'}, {\'lastTransitionTime\': \'2025-09-05T09:09:05Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'status\': \'Unknown\', \'type\': \'PredictorReady\'}, {\'lastTransitionTime\': \'2025-09-05T09:09:05Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': 
\'PredictorRouteReady\'}, {\'lastTransitionTime\': \'2025-09-05T09:09:05Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'status\': \'Unknown\', \'type\': \'Ready\'}, {\'lastTransitionTime\': \'2025-09-05T09:09:05Z\', \'reason\': \'PredictorRouteReady not ready\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'RoutesReady\'}], \'modelStatus\': {\'states\': {\'activeModelState\': \'\', \'targetModelState\': \'Pending\'}, \'transitionStatus\': \'InProgress\'}, \'observedGeneration\': 1}}')
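The create / wait / tear down / retry sequence logged by the MKMLProfilerDeployer stage can be sketched as a bounded retry loop. This is an illustration only: the function `deploy_with_retries`, its callables, and the local `DeploymentError` class are hypothetical stand-ins, and the attempt count of three is read off the log above rather than from any documented setting. The retry delay is not visible in the log (the message is an unfilled template), so it is left as a parameter.

```python
import time


class DeploymentError(Exception):
    """Stand-in for the error type named in the log (hypothetical definition)."""


def deploy_with_retries(create, wait_until_ready, teardown,
                        attempts=3, delay_seconds=0.0, sleep=time.sleep):
    """Try to bring a service up, tearing it down after each failed attempt.

    Returns the 1-based attempt number that succeeded, or re-raises the
    last DeploymentError once all attempts are exhausted.
    """
    last_error = None
    for attempt in range(1, attempts + 1):
        create()
        try:
            wait_until_ready()
            return attempt
        except DeploymentError as error:
            last_error = error
            teardown()  # always clean up the failed service
            if attempt < attempts:
                sleep(delay_seconds)  # wait before the next attempt
    raise last_error
```

In the run above, all three attempts timed out while the predictor was still waiting for a Revision to become ready, so a loop of this shape would surface the final error to the pipeline's clean-up step.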
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.14s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
clean up pipeline due to error=DeploymentError('Timeout to start the InferenceService chaiml-mn12b-syn1-tune14-v1-profiler. The InferenceService is as following: {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'kind\': \'InferenceService\', \'metadata\': {\'annotations\': {\'autoscaling.knative.dev/class\': \'hpa.autoscaling.knative.dev\', \'autoscaling.knative.dev/container-concurrency-target-percentage\': \'70\', \'autoscaling.knative.dev/initial-scale\': \'1\', \'autoscaling.knative.dev/max-scale-down-rate\': \'1.1\', \'autoscaling.knative.dev/max-scale-up-rate\': \'2\', \'autoscaling.knative.dev/metric\': \'mean_pod_latency_ms_v2\', \'autoscaling.knative.dev/panic-threshold-percentage\': \'650\', \'autoscaling.knative.dev/panic-window-percentage\': \'35\', \'autoscaling.knative.dev/scale-down-delay\': \'30s\', \'autoscaling.knative.dev/scale-to-zero-grace-period\': \'10m\', \'autoscaling.knative.dev/stable-window\': \'180s\', \'autoscaling.knative.dev/target\': \'4000\', \'autoscaling.knative.dev/target-burst-capacity\': \'-1\', \'autoscaling.knative.dev/tick-interval\': \'15s\', \'features.knative.dev/http-full-duplex\': \'Enabled\', \'networking.knative.dev/ingress-class\': \'istio.ingress.networking.knative.dev\'}, \'creationTimestamp\': \'2025-09-05T09:40:48Z\', \'finalizers\': [\'inferenceservice.finalizers\'], \'generation\': 1, \'labels\': {\'istio.io/rev\': \'prod-canary\', \'knative.coreweave.cloud/ingress\': \'istio.ingress.networking.knative.dev\', \'prometheus.k.chaiverse.com\': \'true\', \'qos.coreweave.cloud/latency\': \'low\'}, \'managedFields\': [{\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:annotations\': {\'.\': {}, \'f:autoscaling.knative.dev/class\': {}, \'f:autoscaling.knative.dev/container-concurrency-target-percentage\': {}, \'f:autoscaling.knative.dev/initial-scale\': {}, \'f:autoscaling.knative.dev/max-scale-down-rate\': {}, 
\'f:autoscaling.knative.dev/max-scale-up-rate\': {}, \'f:autoscaling.knative.dev/metric\': {}, \'f:autoscaling.knative.dev/panic-threshold-percentage\': {}, \'f:autoscaling.knative.dev/panic-window-percentage\': {}, \'f:autoscaling.knative.dev/scale-down-delay\': {}, \'f:autoscaling.knative.dev/scale-to-zero-grace-period\': {}, \'f:autoscaling.knative.dev/stable-window\': {}, \'f:autoscaling.knative.dev/target\': {}, \'f:autoscaling.knative.dev/target-burst-capacity\': {}, \'f:autoscaling.knative.dev/tick-interval\': {}, \'f:features.knative.dev/http-full-duplex\': {}, \'f:networking.knative.dev/ingress-class\': {}}, \'f:labels\': {\'.\': {}, \'f:istio.io/rev\': {}, \'f:knative.coreweave.cloud/ingress\': {}, \'f:prometheus.k.chaiverse.com\': {}, \'f:qos.coreweave.cloud/latency\': {}}}, \'f:spec\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:affinity\': {\'.\': {}, \'f:nodeAffinity\': {\'.\': {}, \'f:preferredDuringSchedulingIgnoredDuringExecution\': {}, \'f:requiredDuringSchedulingIgnoredDuringExecution\': {}}}, \'f:containerConcurrency\': {}, \'f:containers\': {}, \'f:imagePullSecrets\': {}, \'f:maxReplicas\': {}, \'f:minReplicas\': {}, \'f:priorityClassName\': {}, \'f:timeout\': {}, \'f:volumes\': {}}}}, \'manager\': \'OpenAPI-Generator\', \'operation\': \'Update\', \'time\': \'2025-09-05T09:40:48Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:finalizers\': {\'.\': {}, \'v:"inferenceservice.finalizers"\': {}}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'time\': \'2025-09-05T09:40:48Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:status\': {\'.\': {}, \'f:components\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:latestCreatedRevision\': {}}}, \'f:conditions\': {}, \'f:modelStatus\': {\'.\': {}, \'f:states\': {\'.\': {}, \'f:activeModelState\': {}, \'f:targetModelState\': {}}, \'f:transitionStatus\': {}}, \'f:observedGeneration\': {}}}, \'manager\': \'manager\', 
\'operation\': \'Update\', \'subresource\': \'status\', \'time\': \'2025-09-05T09:40:48Z\'}], \'name\': \'chaiml-mn12b-syn1-tune14-v1-profiler\', \'namespace\': \'tenant-chaiml-guanaco\', \'resourceVersion\': \'718815152\', \'uid\': \'851690ce-0bdd-4e9e-8c6b-d2a396a92a84\'}, \'spec\': {\'predictor\': {\'affinity\': {\'nodeAffinity\': {\'preferredDuringSchedulingIgnoredDuringExecution\': [{\'preference\': {\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'RTX_A5000\']}]}, \'weight\': 5}], \'requiredDuringSchedulingIgnoredDuringExecution\': {\'nodeSelectorTerms\': [{\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'RTX_A5000\', \'L40S\']}]}]}}}, \'containerConcurrency\': 0, \'containers\': [{\'env\': [{\'name\': \'MAX_TOKEN_INPUT\', \'value\': \'1024\'}, {\'name\': \'BEST_OF\', \'value\': \'8\'}, {\'name\': \'TEMPERATURE\', \'value\': \'1.0\'}, {\'name\': \'PRESENCE_PENALTY\', \'value\': \'0.0\'}, {\'name\': \'FREQUENCY_PENALTY\', \'value\': \'0.0\'}, {\'name\': \'TOP_P\', \'value\': \'1.0\'}, {\'name\': \'MIN_P\', \'value\': \'0.0\'}, {\'name\': \'TOP_K\', \'value\': \'40\'}, {\'name\': \'STOPPING_WORDS\', \'value\': \'["\\\\\\\\n"]\'}, {\'name\': \'MAX_TOKENS\', \'value\': \'64\'}, {\'name\': \'MAX_BATCH_SIZE\', \'value\': \'128\'}, {\'name\': \'MAX_CACHED_RESPONSES\', \'value\': \'-1\'}, {\'name\': \'URL_ROUTE\', \'value\': \'GPT-J-6B-lit-v2\'}, {\'name\': \'OBJ_ACCESS_KEY_ID\', \'value\': \'LETMTTRMLFFAMTBK\'}, {\'name\': \'OBJ_SECRET_ACCESS_KEY\', \'value\': \'VwwZaqefOOoaouNxUk03oUmK9pVEfruJhjBHPGdgycK\'}, {\'name\': \'OBJ_ENDPOINT\', \'value\': \'https://accel-object.ord1.coreweave.com\'}, {\'name\': \'TENSORIZER_URI\', \'value\': \'s3://guanaco-mkml-models/chaiml-mn12b-syn1-tune14-v1/nvidia\'}, {\'name\': \'RESERVE_MEMORY\', \'value\': \'2048\'}, {\'name\': \'DOWNLOAD_TO_LOCAL\', \'value\': \'/dev/shm/model_cache\'}, {\'name\': \'NUM_GPUS\', \'value\': \'1\'}, {\'name\': \'MK1_QUANTIZATION_PROFILE\', \'value\': 
\'q4\'}, {\'name\': \'MK1_MKML_LICENSE_KEY\', \'valueFrom\': {\'secretKeyRef\': {\'key\': \'key\', \'name\': \'mkml-license-key\'}}}], \'image\': \'gcr.io/chai-959f8/chai-guanaco/mkml:mkml_v0.30.2\', \'imagePullPolicy\': \'IfNotPresent\', \'name\': \'kserve-container\', \'readinessProbe\': {\'exec\': {\'command\': [\'cat\', \'/tmp/ready\']}, \'failureThreshold\': 1, \'initialDelaySeconds\': 10, \'periodSeconds\': 10, \'successThreshold\': 1, \'timeoutSeconds\': 5}, \'resources\': {\'limits\': {\'cpu\': \'2\', \'memory\': \'14Gi\', \'nvidia.com/gpu\': \'1\'}, \'requests\': {\'cpu\': \'2\', \'memory\': \'14Gi\', \'nvidia.com/gpu\': \'1\'}}, \'volumeMounts\': [{\'mountPath\': \'/dev/shm\', \'name\': \'shared-memory-cache\'}]}], \'imagePullSecrets\': [{\'name\': \'docker-creds\'}], \'maxReplicas\': 1, \'minReplicas\': 1, \'priorityClassName\': \'chaiverse\', \'timeout\': 60, \'volumes\': [{\'emptyDir\': {\'medium\': \'Memory\'}, \'name\': \'shared-memory-cache\'}]}}, \'status\': {\'components\': {\'predictor\': {\'latestCreatedRevision\': \'chaiml-mn12b-syn1-tune14-v1-profiler-predictor-00001\'}}, \'conditions\': [{\'lastTransitionTime\': \'2025-09-05T09:40:48Z\', \'reason\': \'PredictorConfigurationReady not ready\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'LatestDeploymentReady\'}, {\'lastTransitionTime\': \'2025-09-05T09:40:48Z\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'PredictorConfigurationReady\'}, {\'lastTransitionTime\': \'2025-09-05T09:40:48Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'status\': \'Unknown\', \'type\': \'PredictorReady\'}, {\'lastTransitionTime\': \'2025-09-05T09:40:48Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': 
\'PredictorRouteReady\'}, {\'lastTransitionTime\': \'2025-09-05T09:40:48Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'status\': \'Unknown\', \'type\': \'Ready\'}, {\'lastTransitionTime\': \'2025-09-05T09:40:48Z\', \'reason\': \'PredictorRouteReady not ready\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'RoutesReady\'}], \'modelStatus\': {\'states\': {\'activeModelState\': \'\', \'targetModelState\': \'Pending\'}, \'transitionStatus\': \'InProgress\'}, \'observedGeneration\': 1}}')
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.14s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-mn12b-syn1-tune14-v1-profiler
Waiting for inference service chaiml-mn12b-syn1-tune14-v1-profiler to be ready
Tearing down inference service chaiml-mn12b-syn1-tune14-v1-profiler
clean up pipeline due to error=DeploymentError('Timeout to start the InferenceService chaiml-mn12b-syn1-tune14-v1-profiler. The InferenceService is as following: {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'kind\': \'InferenceService\', \'metadata\': {\'annotations\': {\'autoscaling.knative.dev/class\': \'hpa.autoscaling.knative.dev\', \'autoscaling.knative.dev/container-concurrency-target-percentage\': \'70\', \'autoscaling.knative.dev/initial-scale\': \'1\', \'autoscaling.knative.dev/max-scale-down-rate\': \'1.1\', \'autoscaling.knative.dev/max-scale-up-rate\': \'2\', \'autoscaling.knative.dev/metric\': \'mean_pod_latency_ms_v2\', \'autoscaling.knative.dev/panic-threshold-percentage\': \'650\', \'autoscaling.knative.dev/panic-window-percentage\': \'35\', \'autoscaling.knative.dev/scale-down-delay\': \'30s\', \'autoscaling.knative.dev/scale-to-zero-grace-period\': \'10m\', \'autoscaling.knative.dev/stable-window\': \'180s\', \'autoscaling.knative.dev/target\': \'4000\', \'autoscaling.knative.dev/target-burst-capacity\': \'-1\', \'autoscaling.knative.dev/tick-interval\': \'15s\', \'features.knative.dev/http-full-duplex\': \'Enabled\', \'networking.knative.dev/ingress-class\': \'istio.ingress.networking.knative.dev\'}, \'creationTimestamp\': \'2025-09-05T10:11:55Z\', \'finalizers\': [\'inferenceservice.finalizers\'], \'generation\': 1, \'labels\': {\'istio.io/rev\': \'prod-canary\', \'knative.coreweave.cloud/ingress\': \'istio.ingress.networking.knative.dev\', \'prometheus.k.chaiverse.com\': \'true\', \'qos.coreweave.cloud/latency\': \'low\'}, \'managedFields\': [{\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:annotations\': {\'.\': {}, \'f:autoscaling.knative.dev/class\': {}, \'f:autoscaling.knative.dev/container-concurrency-target-percentage\': {}, \'f:autoscaling.knative.dev/initial-scale\': {}, \'f:autoscaling.knative.dev/max-scale-down-rate\': {}, 
\'f:autoscaling.knative.dev/max-scale-up-rate\': {}, \'f:autoscaling.knative.dev/metric\': {}, \'f:autoscaling.knative.dev/panic-threshold-percentage\': {}, \'f:autoscaling.knative.dev/panic-window-percentage\': {}, \'f:autoscaling.knative.dev/scale-down-delay\': {}, \'f:autoscaling.knative.dev/scale-to-zero-grace-period\': {}, \'f:autoscaling.knative.dev/stable-window\': {}, \'f:autoscaling.knative.dev/target\': {}, \'f:autoscaling.knative.dev/target-burst-capacity\': {}, \'f:autoscaling.knative.dev/tick-interval\': {}, \'f:features.knative.dev/http-full-duplex\': {}, \'f:networking.knative.dev/ingress-class\': {}}, \'f:labels\': {\'.\': {}, \'f:istio.io/rev\': {}, \'f:knative.coreweave.cloud/ingress\': {}, \'f:prometheus.k.chaiverse.com\': {}, \'f:qos.coreweave.cloud/latency\': {}}}, \'f:spec\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:affinity\': {\'.\': {}, \'f:nodeAffinity\': {\'.\': {}, \'f:preferredDuringSchedulingIgnoredDuringExecution\': {}, \'f:requiredDuringSchedulingIgnoredDuringExecution\': {}}}, \'f:containerConcurrency\': {}, \'f:containers\': {}, \'f:imagePullSecrets\': {}, \'f:maxReplicas\': {}, \'f:minReplicas\': {}, \'f:priorityClassName\': {}, \'f:timeout\': {}, \'f:volumes\': {}}}}, \'manager\': \'OpenAPI-Generator\', \'operation\': \'Update\', \'time\': \'2025-09-05T10:11:55Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:finalizers\': {\'.\': {}, \'v:"inferenceservice.finalizers"\': {}}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'time\': \'2025-09-05T10:11:56Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:status\': {\'.\': {}, \'f:components\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:latestCreatedRevision\': {}}}, \'f:conditions\': {}, \'f:modelStatus\': {\'.\': {}, \'f:states\': {\'.\': {}, \'f:activeModelState\': {}, \'f:targetModelState\': {}}, \'f:transitionStatus\': {}}, \'f:observedGeneration\': {}}}, \'manager\': \'manager\', 
\'operation\': \'Update\', \'subresource\': \'status\', \'time\': \'2025-09-05T10:11:56Z\'}], \'name\': \'chaiml-mn12b-syn1-tune14-v1-profiler\', \'namespace\': \'tenant-chaiml-guanaco\', \'resourceVersion\': \'718863405\', \'uid\': \'0af61f9b-16c5-45cd-8d5b-7425738d5219\'}, \'spec\': {\'predictor\': {\'affinity\': {\'nodeAffinity\': {\'preferredDuringSchedulingIgnoredDuringExecution\': [{\'preference\': {\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'RTX_A5000\']}]}, \'weight\': 5}], \'requiredDuringSchedulingIgnoredDuringExecution\': {\'nodeSelectorTerms\': [{\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'RTX_A5000\', \'L40S\']}]}]}}}, \'containerConcurrency\': 0, \'containers\': [{\'env\': [{\'name\': \'MAX_TOKEN_INPUT\', \'value\': \'1024\'}, {\'name\': \'BEST_OF\', \'value\': \'8\'}, {\'name\': \'TEMPERATURE\', \'value\': \'1.0\'}, {\'name\': \'PRESENCE_PENALTY\', \'value\': \'0.0\'}, {\'name\': \'FREQUENCY_PENALTY\', \'value\': \'0.0\'}, {\'name\': \'TOP_P\', \'value\': \'1.0\'}, {\'name\': \'MIN_P\', \'value\': \'0.0\'}, {\'name\': \'TOP_K\', \'value\': \'40\'}, {\'name\': \'STOPPING_WORDS\', \'value\': \'["\\\\\\\\n"]\'}, {\'name\': \'MAX_TOKENS\', \'value\': \'64\'}, {\'name\': \'MAX_BATCH_SIZE\', \'value\': \'128\'}, {\'name\': \'MAX_CACHED_RESPONSES\', \'value\': \'-1\'}, {\'name\': \'URL_ROUTE\', \'value\': \'GPT-J-6B-lit-v2\'}, {\'name\': \'OBJ_ACCESS_KEY_ID\', \'value\': \'LETMTTRMLFFAMTBK\'}, {\'name\': \'OBJ_SECRET_ACCESS_KEY\', \'value\': \'VwwZaqefOOoaouNxUk03oUmK9pVEfruJhjBHPGdgycK\'}, {\'name\': \'OBJ_ENDPOINT\', \'value\': \'https://accel-object.ord1.coreweave.com\'}, {\'name\': \'TENSORIZER_URI\', \'value\': \'s3://guanaco-mkml-models/chaiml-mn12b-syn1-tune14-v1/nvidia\'}, {\'name\': \'RESERVE_MEMORY\', \'value\': \'2048\'}, {\'name\': \'DOWNLOAD_TO_LOCAL\', \'value\': \'/dev/shm/model_cache\'}, {\'name\': \'NUM_GPUS\', \'value\': \'1\'}, {\'name\': \'MK1_QUANTIZATION_PROFILE\', \'value\': 
\'q4\'}, {\'name\': \'MK1_MKML_LICENSE_KEY\', \'valueFrom\': {\'secretKeyRef\': {\'key\': \'key\', \'name\': \'mkml-license-key\'}}}], \'image\': \'gcr.io/chai-959f8/chai-guanaco/mkml:mkml_v0.30.2\', \'imagePullPolicy\': \'IfNotPresent\', \'name\': \'kserve-container\', \'readinessProbe\': {\'exec\': {\'command\': [\'cat\', \'/tmp/ready\']}, \'failureThreshold\': 1, \'initialDelaySeconds\': 10, \'periodSeconds\': 10, \'successThreshold\': 1, \'timeoutSeconds\': 5}, \'resources\': {\'limits\': {\'cpu\': \'2\', \'memory\': \'14Gi\', \'nvidia.com/gpu\': \'1\'}, \'requests\': {\'cpu\': \'2\', \'memory\': \'14Gi\', \'nvidia.com/gpu\': \'1\'}}, \'volumeMounts\': [{\'mountPath\': \'/dev/shm\', \'name\': \'shared-memory-cache\'}]}], \'imagePullSecrets\': [{\'name\': \'docker-creds\'}], \'maxReplicas\': 1, \'minReplicas\': 1, \'priorityClassName\': \'chaiverse\', \'timeout\': 60, \'volumes\': [{\'emptyDir\': {\'medium\': \'Memory\'}, \'name\': \'shared-memory-cache\'}]}}, \'status\': {\'components\': {\'predictor\': {\'latestCreatedRevision\': \'chaiml-mn12b-syn1-tune14-v1-profiler-predictor-00001\'}}, \'conditions\': [{\'lastTransitionTime\': \'2025-09-05T10:11:56Z\', \'reason\': \'PredictorConfigurationReady not ready\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'LatestDeploymentReady\'}, {\'lastTransitionTime\': \'2025-09-05T10:11:56Z\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'PredictorConfigurationReady\'}, {\'lastTransitionTime\': \'2025-09-05T10:11:56Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'status\': \'Unknown\', \'type\': \'PredictorReady\'}, {\'lastTransitionTime\': \'2025-09-05T10:11:56Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': 
\'PredictorRouteReady\'}, {\'lastTransitionTime\': \'2025-09-05T10:11:56Z\', \'message\': \'Configuration "chaiml-mn12b-syn1-tune14-v1-profiler-predictor" is waiting for a Revision to become ready.\', \'reason\': \'RevisionMissing\', \'status\': \'Unknown\', \'type\': \'Ready\'}, {\'lastTransitionTime\': \'2025-09-05T10:11:56Z\', \'reason\': \'PredictorRouteReady not ready\', \'severity\': \'Info\', \'status\': \'Unknown\', \'type\': \'RoutesReady\'}], \'modelStatus\': {\'states\': {\'activeModelState\': \'\', \'targetModelState\': \'Pending\'}, \'transitionStatus\': \'InProgress\'}, \'observedGeneration\': 1}}')
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.15s
Shutdown handler de-registered
chaiml-mn12b-syn1-tune14_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-mn12b-syn1-tune14_v1 status is now torndown due to DeploymentManager action