Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-espresso-llama-24-9292-v2-mkmlizer
Waiting for job on chaiml-espresso-llama-24-9292-v2-mkmlizer to finish
chaiml-espresso-llama-24-9292-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ _____ __ __ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ /___/ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Version: 0.11.12 ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ https://mk1.ai ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ belonging to: ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Chai Research Corp. ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-espresso-llama-24-9292-v2-mkmlizer: Downloaded to shared memory in 46.995s
chaiml-espresso-llama-24-9292-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpif7p5txa, device:0
chaiml-espresso-llama-24-9292-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-espresso-llama-24-9292-v2-mkmlizer: quantized model in 45.302s
chaiml-espresso-llama-24-9292-v2-mkmlizer: Processed model ChaiML/espresso_llama_241204_albert_v2_sft_2epoch_128alpha in 92.297s
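The "Processed model ... in 92.297s" total appears to be simply the sum of the two stage timings logged above (download to shared memory plus quantization). A quick sanity check, using the values taken from this log:

```python
# Timings reported by the mkmlizer log above (seconds).
download_s = 46.995   # "Downloaded to shared memory in 46.995s"
quantize_s = 45.302   # "quantized model in 45.302s"

# Their sum matches the reported stage total of 92.297s.
total_s = round(download_s + quantize_s, 3)
print(total_s)  # 92.297
```

So the "Processed model" line is the end-to-end wall time for the stage, not an additional step.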
chaiml-espresso-llama-24-9292-v2-mkmlizer: creating bucket guanaco-mkml-models
chaiml-espresso-llama-24-9292-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-espresso-llama-24-9292-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/special_tokens_map.json
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/config.json
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/tokenizer_config.json
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/tokenizer.json
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/flywheel_model.1.safetensors
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/flywheel_model.0.safetensors
chaiml-espresso-llama-24-9292-v2-mkmlizer:
Loading 0: 0%| | 0/507 [00:00<?, ?it/s]
Loading 0: 100%|██████████| 507/507 [00:32<00:00, 32.07it/s]
Job chaiml-espresso-llama-24-9292-v2-mkmlizer completed after 122.89s with status: succeeded
Stopping job with name chaiml-espresso-llama-24-9292-v2-mkmlizer
Pipeline stage MKMLizer completed in 124.23s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-espresso-llama-24-9292-v2
Waiting for inference service chaiml-espresso-llama-24-9292-v2 to be ready
Inference service chaiml-espresso-llama-24-9292-v2 ready after 184.75290846824646s
Pipeline stage MKMLDeployer completed in 186.49s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.983168363571167s
Received healthy response to inference request in 2.445164442062378s
Received healthy response to inference request in 2.7090847492218018s
Retrying (%r) after connection broken by '%r': %s
Received healthy response to inference request in 2.8613054752349854s
Received healthy response to inference request in 2.352644443511963s
5 requests
0 failed requests
5th percentile: 2.371148443222046
10th percentile: 2.389652442932129
20th percentile: 2.426660442352295
30th percentile: 2.497948503494263
40th percentile: 2.603516626358032
50th percentile: 2.7090847492218018
60th percentile: 2.769973039627075
70th percentile: 2.8308613300323486
80th percentile: 2.8856780529022217
90th percentile: 2.9344232082366943
95th percentile: 2.9587957859039307
99th percentile: 2.97829384803772
mean time: 2.670273494720459
Pipeline stage StressChecker completed in 14.69s
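The StressChecker statistics above can be reproduced from the five "healthy response" latencies. This is a sketch that assumes the percentiles are computed by linear interpolation between closest ranks (the values in the log match that method exactly); the function name `percentile` is illustrative, not taken from the pipeline code:

```python
# The five response times from the "Received healthy response" lines (seconds).
times = sorted([
    2.983168363571167,
    2.445164442062378,
    2.7090847492218018,
    2.8613054752349854,
    2.352644443511963,
])

def percentile(sorted_vals, p):
    """Percentile via linear interpolation between the two closest ranks."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (k - lo) * (sorted_vals[hi] - sorted_vals[lo])

mean = sum(times) / len(times)   # 2.670273494720459, the logged "mean time"
p05 = percentile(times, 5)       # 2.371148443222046, the logged 5th percentile
p50 = percentile(times, 50)      # 2.7090847492218018, the logged median
p90 = percentile(times, 90)      # 2.9344232082366943, the logged 90th percentile
```

With only five samples the tail percentiles (95th, 99th) are interpolated between the two slowest requests, so they say little beyond the maximum observed latency.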
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.43s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.26s
Shutdown handler de-registered
chaiml-espresso-llama-24_9292_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3197.92s
Shutdown handler de-registered
chaiml-espresso-llama-24_9292_v2 status is now inactive due to auto deactivation removed underperforming models