Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name delta-vector-odin-9b-v2-mkmlizer
Waiting for job on delta-vector-odin-9b-v2-mkmlizer to finish
delta-vector-odin-9b-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
delta-vector-odin-9b-v2-mkmlizer: ║ _____ __ __ ║
delta-vector-odin-9b-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
delta-vector-odin-9b-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
delta-vector-odin-9b-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
delta-vector-odin-9b-v2-mkmlizer: ║ /___/ ║
delta-vector-odin-9b-v2-mkmlizer: ║ ║
delta-vector-odin-9b-v2-mkmlizer: ║ Version: 0.11.12 ║
delta-vector-odin-9b-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
delta-vector-odin-9b-v2-mkmlizer: ║ https://mk1.ai ║
delta-vector-odin-9b-v2-mkmlizer: ║ ║
delta-vector-odin-9b-v2-mkmlizer: ║ The license key for the current software has been verified as ║
delta-vector-odin-9b-v2-mkmlizer: ║ belonging to: ║
delta-vector-odin-9b-v2-mkmlizer: ║ ║
delta-vector-odin-9b-v2-mkmlizer: ║ Chai Research Corp. ║
delta-vector-odin-9b-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
delta-vector-odin-9b-v2-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
delta-vector-odin-9b-v2-mkmlizer: ║ ║
delta-vector-odin-9b-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Connection pool is full, discarding connection: %s. Connection pool size: %s
delta-vector-odin-9b-v2-mkmlizer: Downloaded to shared memory in 37.786s
delta-vector-odin-9b-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp6uzk4fde, device:0
delta-vector-odin-9b-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
delta-vector-odin-9b-v2-mkmlizer: quantized model in 35.212s
delta-vector-odin-9b-v2-mkmlizer: Processed model Delta-Vector/Odin-9B in 72.998s
Connection pool is full, discarding connection: %s. Connection pool size: %s
delta-vector-odin-9b-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
delta-vector-odin-9b-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/delta-vector-odin-9b-v2
delta-vector-odin-9b-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/delta-vector-odin-9b-v2/config.json
delta-vector-odin-9b-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/delta-vector-odin-9b-v2/special_tokens_map.json
delta-vector-odin-9b-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/delta-vector-odin-9b-v2/tokenizer_config.json
delta-vector-odin-9b-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.model s3://guanaco-mkml-models/delta-vector-odin-9b-v2/tokenizer.model
delta-vector-odin-9b-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/delta-vector-odin-9b-v2/tokenizer.json
delta-vector-odin-9b-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/delta-vector-odin-9b-v2/flywheel_model.0.safetensors
delta-vector-odin-9b-v2-mkmlizer:
Loading 0: 100%|█████████▉| 463/464 [00:05<00:00, 87.12it/s]
Job delta-vector-odin-9b-v2-mkmlizer completed after 93.98s with status: succeeded
Stopping job with name delta-vector-odin-9b-v2-mkmlizer
Pipeline stage MKMLizer completed in 94.65s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.19s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service delta-vector-odin-9b-v2
Waiting for inference service delta-vector-odin-9b-v2 to be ready
Inference service delta-vector-odin-9b-v2 ready after 150.57488298416138s
Pipeline stage MKMLDeployer completed in 151.19s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.15876908302307
10th percentile: 20.165912055969237
20th percentile: 20.180198001861573
30th percentile: 20.189283990859984
40th percentile: 20.193170022964477
50th percentile: 20.19705605506897
60th percentile: 20.20684370994568
70th percentile: 20.216631364822387
80th percentile: 20.225272989273073
90th percentile: 20.23276858329773
95th percentile: 20.236516380310057
99th percentile: 20.239514617919923
mean time: 20.199562501907348
%s, retrying in %s seconds...
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.15786166191101
10th percentile: 20.16273703575134
20th percentile: 20.172487783432008
30th percentile: 20.18300094604492
40th percentile: 20.194276523590087
50th percentile: 20.205552101135254
60th percentile: 20.320332527160645
70th percentile: 20.435112953186035
80th percentile: 20.503349161148073
90th percentile: 20.525041151046754
95th percentile: 20.535887145996092
99th percentile: 20.544563941955566
mean time: 20.31502757072449
%s, retrying in %s seconds...
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.149541187286378
10th percentile: 20.151610231399538
20th percentile: 20.155748319625854
30th percentile: 20.1611665725708
40th percentile: 20.167864990234374
50th percentile: 20.17456340789795
60th percentile: 20.177322101593017
70th percentile: 20.180080795288085
80th percentile: 20.232873487472535
90th percentile: 20.33570017814636
95th percentile: 20.387113523483276
99th percentile: 20.428244199752807
mean time: 20.219967985153197
clean up pipeline due to error=DeploymentChecksError('Unacceptable number of predict errors: 100.0%')
Shutdown handler de-registered
delta-vector-odin-9b_v2 status is now failed due to DeploymentManager action
delta-vector-odin-9b_v2 status is now torndown due to DeploymentManager action