Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rica40325-spin-first-v2-mkmlizer
Waiting for job on rica40325-spin-first-v2-mkmlizer to finish
rica40325-spin-first-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rica40325-spin-first-v2-mkmlizer: ║ [ASCII-art logo: flywheel] ║
rica40325-spin-first-v2-mkmlizer: ║ ║
rica40325-spin-first-v2-mkmlizer: ║ Version: 0.10.1 ║
rica40325-spin-first-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rica40325-spin-first-v2-mkmlizer: ║ https://mk1.ai ║
rica40325-spin-first-v2-mkmlizer: ║ ║
rica40325-spin-first-v2-mkmlizer: ║ The license key for the current software has been verified as ║
rica40325-spin-first-v2-mkmlizer: ║ belonging to: ║
rica40325-spin-first-v2-mkmlizer: ║ ║
rica40325-spin-first-v2-mkmlizer: ║ Chai Research Corp. ║
rica40325-spin-first-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rica40325-spin-first-v2-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
rica40325-spin-first-v2-mkmlizer: ║ ║
rica40325-spin-first-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
rica40325-spin-first-v2-mkmlizer: Downloaded to shared memory in 62.413s
rica40325-spin-first-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp04qiqqsv, device:0
rica40325-spin-first-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Connection pool is full, discarding connection: %s. Connection pool size: %s
rica40325-spin-first-v2-mkmlizer: quantized model in 28.690s
rica40325-spin-first-v2-mkmlizer: Processed model rica40325/spin_first in 91.103s
rica40325-spin-first-v2-mkmlizer: creating bucket guanaco-mkml-models
rica40325-spin-first-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rica40325-spin-first-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rica40325-spin-first-v2
rica40325-spin-first-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rica40325-spin-first-v2/config.json
rica40325-spin-first-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rica40325-spin-first-v2/special_tokens_map.json
rica40325-spin-first-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rica40325-spin-first-v2/tokenizer_config.json
rica40325-spin-first-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rica40325-spin-first-v2/tokenizer.json
rica40325-spin-first-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rica40325-spin-first-v2/flywheel_model.0.safetensors
rica40325-spin-first-v2-mkmlizer:
Loading 0: 0%| | 0/291 [00:00<?, ?it/s]
Loading 0: 98%|█████████▊| 286/291 [00:14<00:01, 2.61it/s]
Loading 0: 99%|█████████▉| 289/291 [00:14<00:00, 3.25it/s]
Job rica40325-spin-first-v2-mkmlizer completed after 113.84s with status: succeeded
Stopping job with name rica40325-spin-first-v2-mkmlizer
Pipeline stage MKMLizer completed in 115.44s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rica40325-spin-first-v2
Waiting for inference service rica40325-spin-first-v2 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service rica40325-spin-first-v2 ready after 192.06193470954895s
Pipeline stage MKMLDeployer completed in 192.68s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.0041515827178955s
Received healthy response to inference request in 1.9577128887176514s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 1.3457534313201904s
Received healthy response to inference request in 1.4597582817077637s
Received healthy response to inference request in 1.35898756980896s
5 requests
0 failed requests
5th percentile: 1.3484002590179442
10th percentile: 1.3510470867156983
20th percentile: 1.3563407421112061
30th percentile: 1.3791417121887206
40th percentile: 1.4194499969482421
50th percentile: 1.4597582817077637
60th percentile: 1.6589401245117188
70th percentile: 1.8581219673156737
80th percentile: 1.9670006275177
90th percentile: 1.9855761051177978
95th percentile: 1.9948638439178468
99th percentile: 2.002294034957886
mean time: 1.6252727508544922
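The percentile and mean figures above can be reproduced from the five logged response times. The sketch below assumes linear interpolation between ranks, which matches the logged values; the StressChecker's actual implementation is not shown in this log, and the function names `percentile` and `summarize` are hypothetical.

```python
import math

# Latencies in seconds, copied from the five
# "Received healthy response to inference request" lines above.
latencies = [
    2.0041515827178955,
    1.9577128887176514,
    1.3457534313201904,
    1.4597582817077637,
    1.35898756980896,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


def summarize(values, quantiles=(5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)):
    """Build a report with the same labels the log uses (hypothetical helper)."""
    report = {f"{q}th percentile": percentile(values, q) for q in quantiles}
    report["mean time"] = sum(values) / len(values)
    return report


if __name__ == "__main__":
    for label, value in summarize(latencies).items():
        print(f"{label}: {value}")
```

With only five samples, every percentile above the 75th is interpolated between the two slowest requests, so the 95th and 99th percentiles say little about tail latency.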
Pipeline stage StressChecker completed in 8.78s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 5.78s
Shutdown handler de-registered
rica40325-spin-first_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rica40325-spin-first-v2-profiler
Waiting for inference service rica40325-spin-first-v2-profiler to be ready
Inference service rica40325-spin-first-v2-profiler ready after 190.44263243675232s
Pipeline stage MKMLProfilerDeployer completed in 190.82s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rica40325-spin-first-v2-profiler-predictor-00001-deploymen62v6t:/code/chaiverse_profiler_1726715101 --namespace tenant-chaiml-guanaco
kubectl exec -it rica40325-spin-first-v2-profiler-predictor-00001-deploymen62v6t --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1726715101 && python profiles.py profile --best_of_n 8 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 1024 --output_tokens 64 --summary /code/chaiverse_profiler_1726715101/summary.json'
kubectl exec -it rica40325-spin-first-v2-profiler-predictor-00001-deploymen62v6t --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1726715101/summary.json'
Pipeline stage MKMLProfilerRunner completed in 782.74s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rica40325-spin-first-v2-profiler is running
Tearing down inference service rica40325-spin-first-v2-profiler
Service rica40325-spin-first-v2-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 1.99s
Shutdown handler de-registered
rica40325-spin-first_v2 status is now inactive due to auto deactivation (removed underperforming models)
rica40325-spin-first_v2 status is now torndown due to DeploymentManager action
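To see where the time went across both pipeline runs, the "Pipeline stage ... completed in ...s" lines can be tallied per stage. This is a minimal sketch: the regular expression assumes the line format seen in this log, and `stage_durations` is a hypothetical helper, not part of the pipeline itself.

```python
import re
from collections import defaultdict

# Matches lines such as "Pipeline stage MKMLizer completed in 115.44s".
STAGE_DONE = re.compile(r"^Pipeline stage (\w+) completed in ([0-9.]+)s$")

# Stage-completion lines copied verbatim from the log above.
LOG_LINES = """\
Pipeline stage MKMLizer completed in 115.44s
Pipeline stage MKMLTemplater completed in 0.11s
Pipeline stage MKMLDeployer completed in 192.68s
Pipeline stage StressChecker completed in 8.78s
Pipeline stage TriggerMKMLProfilingPipeline completed in 5.78s
Pipeline stage MKMLProfilerDeleter completed in 0.13s
Pipeline stage MKMLProfilerTemplater completed in 0.12s
Pipeline stage MKMLProfilerDeployer completed in 190.82s
Pipeline stage MKMLProfilerRunner completed in 782.74s
Pipeline stage MKMLProfilerDeleter completed in 1.99s
""".splitlines()


def stage_durations(lines):
    """Sum the reported seconds per stage name; repeated stages are accumulated."""
    totals = defaultdict(float)
    for line in lines:
        match = STAGE_DONE.match(line.strip())
        if match:
            totals[match.group(1)] += float(match.group(2))
    return dict(totals)


if __name__ == "__main__":
    durations = stage_durations(LOG_LINES)
    # Print stages from slowest to fastest, then the overall total.
    for name, seconds in sorted(durations.items(), key=lambda kv: kv[1], reverse=True):
        print(f"{name}: {seconds:.2f}s")
    print(f"total: {sum(durations.values()):.2f}s")
```

MKMLProfilerDeleter runs twice in the profiling pipeline (once before deployment, once for teardown), so its two durations are added together rather than overwritten.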