Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-nemo-20241010-t-5991-v150-mkmlizer
Waiting for job on chaiml-nemo-20241010-t-5991-v150-mkmlizer to finish
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ _____ __ __ ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ /___/ ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ Version: 0.11.12 ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ https://mk1.ai ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ belonging to: ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ Chai Research Corp. ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ║ ║
chaiml-nemo-20241010-t-5991-v150-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-nemo-20241010-t-5991-v150-mkmlizer: Downloaded to shared memory in 28.999s
chaiml-nemo-20241010-t-5991-v150-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpv7mzu241, device:0
chaiml-nemo-20241010-t-5991-v150-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-nemo-20241010-t-5991-v150-mkmlizer: quantized model in 35.792s
chaiml-nemo-20241010-t-5991-v150-mkmlizer: Processed model ChaiML/nemo-20241010_tier_merge_v4-albert in 64.791s
chaiml-nemo-20241010-t-5991-v150-mkmlizer: creating bucket guanaco-mkml-models
chaiml-nemo-20241010-t-5991-v150-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-nemo-20241010-t-5991-v150-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-nemo-20241010-t-5991-v150
chaiml-nemo-20241010-t-5991-v150-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-nemo-20241010-t-5991-v150/config.json
chaiml-nemo-20241010-t-5991-v150-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-nemo-20241010-t-5991-v150/tokenizer.json
chaiml-nemo-20241010-t-5991-v150-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-nemo-20241010-t-5991-v150/flywheel_model.0.safetensors
chaiml-nemo-20241010-t-5991-v150-mkmlizer:
Loading 0: 100%|█████████▉| 362/363 [00:14<00:00, 44.28it/s]
Job chaiml-nemo-20241010-t-5991-v150-mkmlizer completed after 83.6s with status: succeeded
Stopping job with name chaiml-nemo-20241010-t-5991-v150-mkmlizer
Pipeline stage MKMLizer completed in 84.15s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.18s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-nemo-20241010-t-5991-v150
Waiting for inference service chaiml-nemo-20241010-t-5991-v150 to be ready
Inference service chaiml-nemo-20241010-t-5991-v150 ready after 160.56771731376648s
Pipeline stage MKMLDeployer completed in 161.08s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.8984622955322266s
Received healthy response to inference request in 1.4717097282409668s
Received healthy response to inference request in 1.3779511451721191s
Received healthy response to inference request in 1.700254201889038s
Received healthy response to inference request in 1.4427096843719482s
5 requests
0 failed requests
5th percentile: 1.390902853012085
10th percentile: 1.4038545608520507
20th percentile: 1.4297579765319823
30th percentile: 1.448509693145752
40th percentile: 1.4601097106933594
50th percentile: 1.4717097282409668
60th percentile: 1.5631275177001953
70th percentile: 1.6545453071594238
80th percentile: 1.7398958206176758
90th percentile: 1.8191790580749512
95th percentile: 1.8588206768035889
99th percentile: 1.890533971786499
mean time: 1.5782174110412597
Pipeline stage StressChecker completed in 9.45s
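
Note on the StressChecker percentiles: the figures logged above are consistent with linear interpolation between the sorted response times (the same result NumPy's default percentile method would give). The log does not say how StressChecker actually computes them, so the sketch below is only an assumption that happens to reproduce the logged values; the `percentile` helper and the `latencies` name are hypothetical and not part of the pipeline.

```python
import math

def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between the two closest ranks of the sorted values."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("need at least one value")
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(ordered) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    # Interpolate between the neighbouring samples.
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)

# The five healthy response times copied from the log, in seconds.
latencies = [
    1.8984622955322266,
    1.4717097282409668,
    1.3779511451721191,
    1.700254201889038,
    1.4427096843719482,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only five samples, every percentile above the 75th is interpolated between the two slowest requests, so the tail figures say little about real tail latency.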
Shutdown handler de-registered
chaiml-nemo-20241010-t_5991_v150 status is now deployed due to DeploymentManager action
chaiml-nemo-20241010-t_5991_v150 status is now inactive due to auto deactivation removed underperforming models
chaiml-nemo-20241010-t_5991_v150 status is now torndown due to DeploymentManager action