Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name qwen-qwq-32b-v2-mkmlizer
Waiting for job on qwen-qwq-32b-v2-mkmlizer to finish
qwen-qwq-32b-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
qwen-qwq-32b-v2-mkmlizer: ║ _____ __ __ ║
qwen-qwq-32b-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
qwen-qwq-32b-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
qwen-qwq-32b-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
qwen-qwq-32b-v2-mkmlizer: ║ /___/ ║
qwen-qwq-32b-v2-mkmlizer: ║ ║
qwen-qwq-32b-v2-mkmlizer: ║ Version: 0.12.8 ║
qwen-qwq-32b-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
qwen-qwq-32b-v2-mkmlizer: ║ https://mk1.ai ║
qwen-qwq-32b-v2-mkmlizer: ║ ║
qwen-qwq-32b-v2-mkmlizer: ║ The license key for the current software has been verified as ║
qwen-qwq-32b-v2-mkmlizer: ║ belonging to: ║
qwen-qwq-32b-v2-mkmlizer: ║ ║
qwen-qwq-32b-v2-mkmlizer: ║ Chai Research Corp. ║
qwen-qwq-32b-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
qwen-qwq-32b-v2-mkmlizer: ║ Expiration: 2025-04-15 23:59:59 ║
qwen-qwq-32b-v2-mkmlizer: ║ ║
qwen-qwq-32b-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Failed to get response for submission qwen-qwq-32b_v1: HTTPConnectionPool(host='qwen-qwq-32b-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission qwen-qwq-32b_v1: HTTPConnectionPool(host='qwen-qwq-32b-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission chaiml-sft-gemma2-28b-v_83370_v5: HTTPConnectionPool(host='chaiml-sft-gemma2-28b-v-83370-v5-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
qwen-qwq-32b-v2-mkmlizer: Downloaded to shared memory in 71.327s
qwen-qwq-32b-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpthntwdb0, device:0
qwen-qwq-32b-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
qwen-qwq-32b-v2-mkmlizer: quantized model in 64.368s
qwen-qwq-32b-v2-mkmlizer: Processed model Qwen/QwQ-32B in 135.696s
qwen-qwq-32b-v2-mkmlizer: creating bucket guanaco-mkml-models
qwen-qwq-32b-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
qwen-qwq-32b-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/qwen-qwq-32b-v2
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/qwen-qwq-32b-v2/config.json
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/added_tokens.json s3://guanaco-mkml-models/qwen-qwq-32b-v2/added_tokens.json
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/qwen-qwq-32b-v2/special_tokens_map.json
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/qwen-qwq-32b-v2/tokenizer_config.json
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/merges.txt s3://guanaco-mkml-models/qwen-qwq-32b-v2/merges.txt
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/qwen-qwq-32b-v2/tokenizer.json
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/vocab.json s3://guanaco-mkml-models/qwen-qwq-32b-v2/vocab.json
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.2.safetensors s3://guanaco-mkml-models/qwen-qwq-32b-v2/flywheel_model.2.safetensors
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/qwen-qwq-32b-v2/flywheel_model.1.safetensors
qwen-qwq-32b-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/qwen-qwq-32b-v2/flywheel_model.0.safetensors
qwen-qwq-32b-v2-mkmlizer:
Loading 0: 0%| | 0/771 [00:00<?, ?it/s]
Loading 0: 38%|███▊ | 293/771 [00:04<00:09, 49.87it/s]
Loading 0: 39%|███▉ | 303/771 [00:19<03:39, 2.13it/s]
Loading 0: 81%|████████ | 624/771 [00:24<00:01, 75.07it/s]
Loading 0: 82%|████████▏ | 633/771 [00:39<01:04, 2.13it/s]
Loading 0: 98%|█████████▊| 753/771 [00:41<00:00, 49.37it/s]
Loading 0: 99%|█████████▉| 766/771 [00:49<00:01, 4.26it/s]
Job qwen-qwq-32b-v2-mkmlizer completed after 165.4s with status: succeeded
Stopping job with name qwen-qwq-32b-v2-mkmlizer
Pipeline stage MKMLizer completed in 165.87s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service qwen-qwq-32b-v2
Waiting for inference service qwen-qwq-32b-v2 to be ready
Inference service qwen-qwq-32b-v2 ready after 120.66636371612549s
Pipeline stage MKMLDeployer completed in 121.23s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.9910166263580322s
Received healthy response to inference request in 1.7418365478515625s
Failed to get response for submission rirv938-96p-4ff-rirv938_38486_v1: HTTPConnectionPool(host='rirv938-96p-4ff-rirv938-38486-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Received healthy response to inference request in 3.0881636142730713s
Received healthy response to inference request in 2.8581900596618652s
Received healthy response to inference request in 3.0620648860931396s
5 requests
0 failed requests
5th percentile: 1.965107250213623
10th percentile: 2.1883779525756837
20th percentile: 2.634919357299805
30th percentile: 2.8847553730010986
40th percentile: 2.9378859996795654
50th percentile: 2.9910166263580322
60th percentile: 3.019435930252075
70th percentile: 3.0478552341461183
80th percentile: 3.067284631729126
90th percentile: 3.0777241230010985
95th percentile: 3.082943868637085
99th percentile: 3.087119665145874
mean time: 2.748254346847534
Pipeline stage StressChecker completed in 15.38s
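Note on the StressChecker summary: the percentile and mean figures logged for this stage appear consistent with linear interpolation between closest ranks over the five healthy response times (the default behaviour of numpy.percentile). The sketch below is a stdlib-only illustration of that arithmetic, not the StressChecker's actual code; the function name `percentile` and the `latencies` list are assumptions introduced here, with the values copied from the five "Received healthy response" lines of this stage.

```python
import math

# Response times in seconds, copied from the five healthy responses
# logged by the StressChecker stage (order as logged).
latencies = [
    2.9910166263580322,
    1.7418365478515625,
    3.0881636142730713,
    2.8581900596618652,
    3.0620648860931396,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) of values.

    Uses linear interpolation between the two closest ranks.
    This is an assumed method, chosen because it matches the logged figures.
    """
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, every percentile above the 80th is interpolated between the two slowest requests, so the tail figures say little about real tail latency.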
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.79s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.74s
Shutdown handler de-registered
qwen-qwq-32b_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 4009.08s
Shutdown handler de-registered
qwen-qwq-32b_v2 status is now inactive due to auto deactivation removed underperforming models
qwen-qwq-32b_v2 status is now torndown due to DeploymentManager action