Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name google-gemma-2-9b-it-v3-mkmlizer
Waiting for job on google-gemma-2-9b-it-v3-mkmlizer to finish
google-gemma-2-9b-it-v3-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
google-gemma-2-9b-it-v3-mkmlizer: ║ _____ __ __ ║
google-gemma-2-9b-it-v3-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
google-gemma-2-9b-it-v3-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
google-gemma-2-9b-it-v3-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
google-gemma-2-9b-it-v3-mkmlizer: ║ /___/ ║
google-gemma-2-9b-it-v3-mkmlizer: ║ ║
google-gemma-2-9b-it-v3-mkmlizer: ║ Version: 0.12.8 ║
google-gemma-2-9b-it-v3-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
google-gemma-2-9b-it-v3-mkmlizer: ║ https://mk1.ai ║
google-gemma-2-9b-it-v3-mkmlizer: ║ ║
google-gemma-2-9b-it-v3-mkmlizer: ║ The license key for the current software has been verified as ║
google-gemma-2-9b-it-v3-mkmlizer: ║ belonging to: ║
google-gemma-2-9b-it-v3-mkmlizer: ║ ║
google-gemma-2-9b-it-v3-mkmlizer: ║ Chai Research Corp. ║
google-gemma-2-9b-it-v3-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
google-gemma-2-9b-it-v3-mkmlizer: ║ Expiration: 2025-04-15 23:59:59 ║
google-gemma-2-9b-it-v3-mkmlizer: ║ ║
google-gemma-2-9b-it-v3-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
google-gemma-2-9b-it-v3-mkmlizer: Downloaded to shared memory in 23.074s
google-gemma-2-9b-it-v3-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpjrlf8iwd, device:0
google-gemma-2-9b-it-v3-mkmlizer: Saving flywheel model at /dev/shm/model_cache
google-gemma-2-9b-it-v3-mkmlizer: quantized model in 36.403s
google-gemma-2-9b-it-v3-mkmlizer: Processed model google/gemma-2-9b-it in 59.478s
google-gemma-2-9b-it-v3-mkmlizer: creating bucket guanaco-mkml-models
google-gemma-2-9b-it-v3-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
google-gemma-2-9b-it-v3-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/google-gemma-2-9b-it-v3
google-gemma-2-9b-it-v3-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/google-gemma-2-9b-it-v3/config.json
google-gemma-2-9b-it-v3-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/google-gemma-2-9b-it-v3/special_tokens_map.json
google-gemma-2-9b-it-v3-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/google-gemma-2-9b-it-v3/tokenizer_config.json
google-gemma-2-9b-it-v3-mkmlizer: cp /dev/shm/model_cache/tokenizer.model s3://guanaco-mkml-models/google-gemma-2-9b-it-v3/tokenizer.model
google-gemma-2-9b-it-v3-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/google-gemma-2-9b-it-v3/tokenizer.json
google-gemma-2-9b-it-v3-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/google-gemma-2-9b-it-v3/flywheel_model.0.safetensors
google-gemma-2-9b-it-v3-mkmlizer:
Loading 0: 100%|█████████▉| 463/464 [00:06<00:00, 77.83it/s]
Job google-gemma-2-9b-it-v3-mkmlizer completed after 83.81s with status: succeeded
Stopping job with name google-gemma-2-9b-it-v3-mkmlizer
Pipeline stage MKMLizer completed in 84.34s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.17s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service google-gemma-2-9b-it-v3
Waiting for inference service google-gemma-2-9b-it-v3 to be ready
Inference service google-gemma-2-9b-it-v3 ready after 90.3642327785492s
Pipeline stage MKMLDeployer completed in 90.91s
run pipeline stage %s
Running pipeline stage StressChecker
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 12.226308584213257
10th percentile: 12.226574897766113
20th percentile: 12.227107524871826
30th percentile: 12.22965612411499
40th percentile: 12.234220695495605
50th percentile: 12.23878526687622
60th percentile: 12.241985607147218
70th percentile: 12.245185947418213
80th percentile: 12.250767135620118
90th percentile: 12.258729171752929
95th percentile: 12.262710189819336
99th percentile: 12.26589500427246
mean time: 12.241135740280152
%s, retrying in %s seconds...
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 12.219215059280396
10th percentile: 12.222797441482545
20th percentile: 12.22996220588684
30th percentile: 12.2360408782959
40th percentile: 12.241033458709717
50th percentile: 12.246026039123535
60th percentile: 12.252625465393066
70th percentile: 12.259224891662598
80th percentile: 12.264357995986938
90th percentile: 12.268024778366089
95th percentile: 12.269858169555665
99th percentile: 12.271324882507324
mean time: 12.245883893966674
%s, retrying in %s seconds...
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
{"detail":"HTTPConnectionPool(host='google-gemma-2-9b-it-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)"}
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 12.224599885940552
10th percentile: 12.230258560180664
20th percentile: 12.241575908660888
30th percentile: 12.248252773284912
40th percentile: 12.250289154052734
50th percentile: 12.252325534820557
60th percentile: 12.279523181915284
70th percentile: 12.30672082901001
80th percentile: 12.33959527015686
90th percentile: 12.378146505355835
95th percentile: 12.397422122955323
99th percentile: 12.412842617034912
mean time: 12.291103744506836
clean up pipeline due to error=DeploymentChecksError('Unacceptable number of predict errors: 100.0%')
Shutdown handler de-registered
google-gemma-2-9b-it_v3 status is now failed due to DeploymentManager action