Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name cgato-nemo-12b-thespice-8050-v1-mkmlizer
Waiting for job on cgato-nemo-12b-thespice-8050-v1-mkmlizer to finish
cgato-nemo-12b-thespice-8050-v1-mkmlizer: Downloaded to shared memory in 47.501s
cgato-nemo-12b-thespice-8050-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp4ynt6ikm, device:0
cgato-nemo-12b-thespice-8050-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
cgato-nemo-12b-thespice-8050-v1-mkmlizer: quantized model in 36.158s
cgato-nemo-12b-thespice-8050-v1-mkmlizer: Processed model cgato/Nemo-12b-TheSpice-V0.9-All-v1 in 83.659s
cgato-nemo-12b-thespice-8050-v1-mkmlizer: creating bucket guanaco-mkml-models
cgato-nemo-12b-thespice-8050-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
cgato-nemo-12b-thespice-8050-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8050-v1
cgato-nemo-12b-thespice-8050-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8050-v1/config.json
cgato-nemo-12b-thespice-8050-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8050-v1/special_tokens_map.json
cgato-nemo-12b-thespice-8050-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8050-v1/tokenizer_config.json
cgato-nemo-12b-thespice-8050-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8050-v1/tokenizer.json
cgato-nemo-12b-thespice-8050-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8050-v1/flywheel_model.0.safetensors
cgato-nemo-12b-thespice-8050-v1-mkmlizer:
Loading 0: 99%|█████████▉| 361/363 [00:15<00:00, 35.46it/s]
Job cgato-nemo-12b-thespice-8050-v1-mkmlizer completed after 113.38s with status: succeeded
Stopping job with name cgato-nemo-12b-thespice-8050-v1-mkmlizer
Pipeline stage MKMLizer completed in 113.95s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service cgato-nemo-12b-thespice-8050-v1
Waiting for inference service cgato-nemo-12b-thespice-8050-v1 to be ready
Failed to get response for submission jic062-dpo-v3-0-nemo_v1: ('http://jic062-dpo-v3-0-nemo-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Inference service cgato-nemo-12b-thespice-8050-v1 ready after 130.5773491859436s
Pipeline stage MKMLDeployer completed in 131.02s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.130971670150757s
Received healthy response to inference request in 1.7366607189178467s
Received healthy response to inference request in 1.938791275024414s
Received healthy response to inference request in 1.725184679031372s
5 requests
1 failed requests
5th percentile: 1.727479887008667
10th percentile: 1.7297750949859618
20th percentile: 1.7343655109405518
30th percentile: 1.7770868301391602
40th percentile: 1.8579390525817872
50th percentile: 1.938791275024414
60th percentile: 2.015663433074951
70th percentile: 2.0925355911254884
80th percentile: 5.730849647521976
90th percentile: 12.930605602264407
95th percentile: 16.530483579635618
99th percentile: 19.410385961532594
mean time: 5.532393980026245
%s, retrying in %s seconds...
Received healthy response to inference request in 1.6527836322784424s
Received healthy response to inference request in 1.7096805572509766s
Received healthy response to inference request in 1.7386248111724854s
Received healthy response to inference request in 2.2183632850646973s
Received healthy response to inference request in 1.5982756614685059s
5 requests
0 failed requests
5th percentile: 1.609177255630493
10th percentile: 1.6200788497924805
20th percentile: 1.6418820381164552
30th percentile: 1.6641630172729491
40th percentile: 1.6869217872619628
50th percentile: 1.7096805572509766
60th percentile: 1.7212582588195802
70th percentile: 1.7328359603881835
80th percentile: 1.834572505950928
90th percentile: 2.0264678955078126
95th percentile: 2.1224155902862547
99th percentile: 2.199173746109009
mean time: 1.7835455894470216
Pipeline stage StressChecker completed in 39.13s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 5.23s
Shutdown handler de-registered
cgato-nemo-12b-thespice-_8050_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
clean up pipeline due to error=DeploymentChecksError("('http://meta-llama-llama-guard-3-8b-v4-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')")
Shutdown handler de-registered
cgato-nemo-12b-thespice-_8050_v1 status is now inactive due to auto deactivation removed underperforming models