Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name sao10k-mn-12b-lyra-v4a1-v9-mkmlizer
Waiting for job on sao10k-mn-12b-lyra-v4a1-v9-mkmlizer to finish
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ _____ __ __ ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ /___/ ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ Version: 0.11.12 ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ https://mk1.ai ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ The license key for the current software has been verified as ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ belonging to: ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ Chai Research Corp. ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ║ ║
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: Downloaded to shared memory in 20.194s
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpzzdj1zw_, device:0
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: Saving flywheel model at /dev/shm/model_cache
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: quantized model in 30.517s
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: Processed model Sao10K/MN-12B-Lyra-v4a1 in 50.711s
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: creating bucket guanaco-mkml-models
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/sao10k-mn-12b-lyra-v4a1-v9
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/sao10k-mn-12b-lyra-v4a1-v9/config.json
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/sao10k-mn-12b-lyra-v4a1-v9/special_tokens_map.json
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/sao10k-mn-12b-lyra-v4a1-v9/tokenizer_config.json
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/sao10k-mn-12b-lyra-v4a1-v9/tokenizer.json
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/sao10k-mn-12b-lyra-v4a1-v9/flywheel_model.0.safetensors
sao10k-mn-12b-lyra-v4a1-v9-mkmlizer:
Loading 0: 0%| | 0/363 [00:00<?, ?it/s]
Loading 0: 99%|█████████▉| 359/363 [00:12<00:00, 38.94it/s]
Job sao10k-mn-12b-lyra-v4a1-v9-mkmlizer completed after 69.94s with status: succeeded
Stopping job with name sao10k-mn-12b-lyra-v4a1-v9-mkmlizer
Pipeline stage MKMLizer completed in 71.78s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.65s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service sao10k-mn-12b-lyra-v4a1-v9
Waiting for inference service sao10k-mn-12b-lyra-v4a1-v9 to be ready
Inference service sao10k-mn-12b-lyra-v4a1-v9 ready after 30.820784091949463s
Pipeline stage MKMLDeployer completed in 32.57s
run pipeline stage %s
Running pipeline stage StressChecker
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 3.384308099746704s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.353977918624878s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.1863176822662354s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.2778871059417725s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.9137117862701416s
5 requests
0 failed requests
5th percentile: 2.2046315670013428
10th percentile: 2.22294545173645
20th percentile: 2.259573221206665
30th percentile: 2.2931052684783935
40th percentile: 2.323541593551636
50th percentile: 2.353977918624878
60th percentile: 2.5778714656829833
70th percentile: 2.8017650127410887
80th percentile: 3.0078310489654543
90th percentile: 3.196069574356079
95th percentile: 3.2901888370513914
99th percentile: 3.3654842472076414
mean time: 2.623240518569946
Pipeline stage StressChecker completed in 19.58s
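
Note on the StressChecker summary above: the logged percentiles and mean appear consistent with linear interpolation between the sorted samples (the default behaviour of `numpy.percentile`), applied to the five response times reported for this stage. The sketch below is an illustration under that assumption, not the pipeline's actual code; the helper name `percentile` and the variable `latencies` are hypothetical.

```python
import math

# The five response times (seconds) reported by the StressChecker stage.
latencies = [
    3.384308099746704,
    2.353977918624878,
    2.1863176822662354,
    2.2778871059417725,
    2.9137117862701416,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumption: this mirrors the interpolation the logged values
    appear to follow; the real pipeline may use a library call instead.
    """
    ordered = sorted(samples)
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, every percentile other than the 0th, 25th, 50th, 75th and 100th is interpolated between two neighbouring measurements, so the tail values (95th, 99th) mostly reflect the single slowest request.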
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.64s
Shutdown handler de-registered
sao10k-mn-12b-lyra-v4a1_v9 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service sao10k-mn-12b-lyra-v4a1-v9-profiler
Waiting for inference service sao10k-mn-12b-lyra-v4a1-v9-profiler to be ready
Inference service sao10k-mn-12b-lyra-v4a1-v9-profiler ready after 240.53820180892944s
Pipeline stage MKMLProfilerDeployer completed in 240.90s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/sao10k-mn-12b-lyra-v4a1-v9-profiler-predictor-00001-deployfn9l9:/code/chaiverse_profiler_1727463691 --namespace tenant-chaiml-guanaco
kubectl exec -it sao10k-mn-12b-lyra-v4a1-v9-profiler-predictor-00001-deployfn9l9 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1727463691 && python profiles.py profile --best_of_n 8 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 1024 --output_tokens 64 --summary /code/chaiverse_profiler_1727463691/summary.json'
kubectl exec -it sao10k-mn-12b-lyra-v4a1-v9-profiler-predictor-00001-deployfn9l9 --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1727463691/summary.json'
Pipeline stage MKMLProfilerRunner completed in 1165.04s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service sao10k-mn-12b-lyra-v4a1-v9-profiler is running
Tearing down inference service sao10k-mn-12b-lyra-v4a1-v9-profiler
Service sao10k-mn-12b-lyra-v4a1-v9-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 2.30s
Shutdown handler de-registered
sao10k-mn-12b-lyra-v4a1_v9 status is now deployed due to admin request
sao10k-mn-12b-lyra-v4a1_v9 status is now torndown due to DeploymentManager action