Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name hastagaras-f3btest-v1-mkmlizer
Waiting for job on hastagaras-f3btest-v1-mkmlizer to finish
hastagaras-f3btest-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
hastagaras-f3btest-v1-mkmlizer: ║ _____ __ __ ║
hastagaras-f3btest-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
hastagaras-f3btest-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
hastagaras-f3btest-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
hastagaras-f3btest-v1-mkmlizer: ║ /___/ ║
hastagaras-f3btest-v1-mkmlizer: ║ ║
hastagaras-f3btest-v1-mkmlizer: ║ Version: 0.11.12 ║
hastagaras-f3btest-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
hastagaras-f3btest-v1-mkmlizer: ║ https://mk1.ai ║
hastagaras-f3btest-v1-mkmlizer: ║ ║
hastagaras-f3btest-v1-mkmlizer: ║ The license key for the current software has been verified as ║
hastagaras-f3btest-v1-mkmlizer: ║ belonging to: ║
hastagaras-f3btest-v1-mkmlizer: ║ ║
hastagaras-f3btest-v1-mkmlizer: ║ Chai Research Corp. ║
hastagaras-f3btest-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
hastagaras-f3btest-v1-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
hastagaras-f3btest-v1-mkmlizer: ║ ║
hastagaras-f3btest-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
hastagaras-f3btest-v1-mkmlizer: Downloaded to shared memory in 18.034s
hastagaras-f3btest-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp9wahsa9k, device:0
hastagaras-f3btest-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
hastagaras-f3btest-v1-mkmlizer: quantized model in 15.787s
hastagaras-f3btest-v1-mkmlizer: Processed model Hastagaras/f3btest in 33.822s
hastagaras-f3btest-v1-mkmlizer: creating bucket guanaco-mkml-models
hastagaras-f3btest-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
hastagaras-f3btest-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/hastagaras-f3btest-v1
hastagaras-f3btest-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/hastagaras-f3btest-v1/config.json
hastagaras-f3btest-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/hastagaras-f3btest-v1/special_tokens_map.json
hastagaras-f3btest-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/hastagaras-f3btest-v1/tokenizer_config.json
hastagaras-f3btest-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/hastagaras-f3btest-v1/tokenizer.json
hastagaras-f3btest-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/hastagaras-f3btest-v1/flywheel_model.0.safetensors
hastagaras-f3btest-v1-mkmlizer:
Loading 0: 97%|█████████▋| 195/201 [00:07<00:00, 11.01it/s]
Job hastagaras-f3btest-v1-mkmlizer completed after 53.51s with status: succeeded
Stopping job with name hastagaras-f3btest-v1-mkmlizer
Pipeline stage MKMLizer completed in 54.00s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.18s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service hastagaras-f3btest-v1
Waiting for inference service hastagaras-f3btest-v1 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Failed to get response for submission blend_sojek_2024-11-29: ('http://chaiml-llama-8b-multihea-7878-v5-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:36210->127.0.0.1:8080: read: connection reset by peer\n')
Inference service hastagaras-f3btest-v1 ready after 270.91715240478516s
Pipeline stage MKMLDeployer completed in 271.45s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.12712154388428
10th percentile: 20.129341793060302
20th percentile: 20.133782291412352
30th percentile: 20.136930084228517
40th percentile: 20.13878517150879
50th percentile: 20.140640258789062
60th percentile: 20.14254446029663
70th percentile: 20.144448661804198
80th percentile: 20.146803426742554
90th percentile: 20.149608755111693
95th percentile: 20.151011419296264
99th percentile: 20.15213355064392
mean time: 20.1398717880249
%s, retrying in %s seconds...
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.13851704597473
10th percentile: 20.138899850845338
20th percentile: 20.13966546058655
30th percentile: 20.140058708190917
40th percentile: 20.14007959365845
50th percentile: 20.140100479125977
60th percentile: 20.143960571289064
70th percentile: 20.147820663452148
80th percentile: 20.153990650177
90th percentile: 20.162470531463622
95th percentile: 20.166710472106935
99th percentile: 20.170102424621582
mean time: 20.147796821594238
%s, retrying in %s seconds...
Failed to get response for submission function_pajeb_2024-11-26: ('http://chaiml-llama-8b-multihea-7878-v5-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:40702->127.0.0.1:8080: read: connection reset by peer\n')
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.155645990371703
10th percentile: 20.1564932346344
20th percentile: 20.15818772315979
30th percentile: 20.159131002426147
40th percentile: 20.159323072433473
50th percentile: 20.159515142440796
60th percentile: 20.163783979415893
70th percentile: 20.16805281639099
80th percentile: 20.170539331436157
90th percentile: 20.171243524551393
95th percentile: 20.17159562110901
99th percentile: 20.171877298355103
mean time: 20.163096761703493
clean up pipeline due to error=DeploymentChecksError('Unacceptable number of predict errors: 100.0%')
Shutdown handler de-registered
hastagaras-f3btest_v1 status is now failed due to DeploymentManager action
hastagaras-f3btest_v1 status is now torndown due to DeploymentManager action