Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-kimid-v12-mv1-to-30309-v2-uploader
Waiting for job on chaiml-kimid-v12-mv1-to-30309-v2-uploader to finish
chaiml-kimid-v12-mv1-to-30309-v2-uploader: Using quantization_mode: none
chaiml-kimid-v12-mv1-to-30309-v2-uploader: Downloading snapshot of ChaiML/kimid-v12-mv1-top2-q35b-lr5e6ep2g8...
chaiml-kimid-v12-mv1-to-30309-v2-uploader: Downloaded in 23.250s
2026-03-25T16:57:12.511102+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
chaiml-kimid-v12-mv1-to-30309-v2-uploader: Processed model ChaiML/kimid-v12-mv1-top2-q35b-lr5e6ep2g8 in 49.745s
chaiml-kimid-v12-mv1-to-30309-v2-uploader: creating bucket guanaco-vllm-models
chaiml-kimid-v12-mv1-to-30309-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-kimid-v12-mv1-to-30309-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-kimid-v12-mv1-to-30309-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-kimid-v12-mv1-to-30309-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-kimid-v12-mv1-to-30309-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-kimid-v12-mv1-to-30309-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-kimid-v12-mv1-to-30309-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-kimid-v12-mv1-to-30309-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-kimid-v12-mv1-to-30309-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-kimid-v12-mv1-to-30309-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-kimid-v12-mv1-to-30309-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-kimid-v12-mv1-to-30309-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-kimid-v12-mv1-to-30309-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-kimid-v12-mv1-to-30309-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-kimid-v12-mv1-to-30309-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-kimid-v12-mv1-to-30309-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-kimid-v12-mv1-to-30309-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-kimid-v12-mv1-to-30309-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/.gitattributes
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/added_tokens.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/preprocessor_config.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/special_tokens_map.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/processor_config.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/processor_config.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/tokenizer_config.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/args.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/README.md
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/chat_template.jinja
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/config.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/generation_config.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model.safetensors.index.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/vocab.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/merges.txt
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/tokenizer.json
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00016-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00016-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00010-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00010-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00013-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00013-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00007-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00007-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00004-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00004-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00001-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00001-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00005-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00005-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00002-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00002-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00014-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00014-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00011-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00011-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00008-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00008-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00003-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00003-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00006-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00006-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00012-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00012-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00009-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00009-of-00016.safetensors
chaiml-kimid-v12-mv1-to-30309-v2-uploader: cp /dev/shm/model_output/model-00015-of-00016.safetensors s3://guanaco-vllm-models/chaiml-kimid-v12-mv1-to-30309-v2/default/model-00015-of-00016.safetensors
Job chaiml-kimid-v12-mv1-to-30309-v2-uploader completed after 85.06s with status: succeeded
Stopping job with name chaiml-kimid-v12-mv1-to-30309-v2-uploader
Pipeline stage VLLMUploader completed in 85.73s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 9.35s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-kimid-v12-mv1-to-30309-v2
Waiting for inference service chaiml-kimid-v12-mv1-to-30309-v2 to be ready
2026-03-25T16:58:12.600992+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
2026-03-25T16:59:12.693413+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
Connection pool is full, discarding connection: %s. Connection pool size: %s
2026-03-25T17:00:13.101799+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
Inference service chaiml-kimid-v12-mv1-to-30309-v2 ready after 191.00750088691711s
Pipeline stage VLLMDeployer completed in 191.45s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-25T17:01:13.911665+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T17:02:14.112010+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Failed to get response for submission chaiml-pony-d3a-mv1-son_96936_v2: ('http://chaiml-pony-d3a-mv1-son-96936-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'request timeout')
Received unhealthy response to inference request!
Failed to get response for submission chaiml-pony-d3a-mv1-son_96936_v2: ('http://chaiml-pony-d3a-mv1-son-96936-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'request timeout')
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Failed to get response for submission chaiml-kimid-v8b-kimid_63800_v21: ('http://guanaco-model-mesh-load-balancer-v2.model-mesh.kchai-google-us-east4.chaiverse.com/models/chaiml-kimid-v8b-kimid_63800_v21/predict', '{"detail":"1 validation error for RuntimeResponse\\npredictions\\n Field required [type=missing, input_value={\'detail\': \\"Cannot connec...10.244.119.21\', 8080)]\\"}, input_type=dict]\\n For further information visit https://errors.pydantic.dev/2.12/v/missing"}')
Retrying (%r) after connection broken by '%r': %s
Received healthy response to inference request in 13.033589124679565s
2026-03-25T17:03:14.209210+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
Received healthy response to inference request in 3.534893751144409s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.991832971572876s
2026-03-25T17:04:14.795154+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 7.22472357749939s
Received healthy response to inference request in 6.615461826324463s
Received healthy response to inference request in 7.056492567062378s
Received healthy response to inference request in 2.061981678009033s
Received healthy response to inference request in 2.368145704269409s
Received healthy response to inference request in 1.544560194015503s
Received healthy response to inference request in 1.5167567729949951s
Received healthy response to inference request in 1.4689955711364746s
upstream connect error or disconnect/reset before headers. reset reason: connection termination
Received unhealthy response to inference request!
Received healthy response to inference request in 1.9442451000213623s
Received healthy response to inference request in 1.4681458473205566s
2026-03-25T17:05:15.007770+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
Received healthy response to inference request in 1.4711284637451172s
Received healthy response to inference request in 1.4394614696502686s
Received healthy response to inference request in 1.7658538818359375s
Received healthy response to inference request in 1.481515645980835s
Received healthy response to inference request in 1.579153299331665s
Received healthy response to inference request in 1.5068600177764893s
30 requests
11 failed requests
5th percentile: 1.4523694396018982
10th percentile: 1.4689105987548827
20th percentile: 1.5017911434173583
30th percentile: 1.5687753677368164
40th percentile: 1.9727978229522707
50th percentile: 2.951519727706909
60th percentile: 7.123784971237183
70th percentile: 20.120101737976075
80th percentile: 20.14041919708252
90th percentile: 20.159386682510377
95th percentile: 20.189042699337005
99th percentile: 20.214881477355956
mean time: 8.760543235143025
%s, retrying in %s seconds...
Received healthy response to inference request in 1.4005393981933594s
Received healthy response to inference request in 1.6230981349945068s
Received healthy response to inference request in 1.412132740020752s
Received healthy response to inference request in 1.4029784202575684s
Received healthy response to inference request in 1.3311705589294434s
Received healthy response to inference request in 1.3728435039520264s
Received healthy response to inference request in 1.6184966564178467s
Received healthy response to inference request in 1.3352687358856201s
Received healthy response to inference request in 1.6909785270690918s
Received healthy response to inference request in 1.9358294010162354s
Received healthy response to inference request in 1.5258660316467285s
Received healthy response to inference request in 1.442624568939209s
Received healthy response to inference request in 1.6238923072814941s
Received healthy response to inference request in 1.4569928646087646s
Received healthy response to inference request in 1.5930051803588867s
Received healthy response to inference request in 1.468843698501587s
Received healthy response to inference request in 1.4521663188934326s
Received healthy response to inference request in 1.3679101467132568s
Received healthy response to inference request in 1.5254652500152588s
Received healthy response to inference request in 1.7333853244781494s
Received healthy response to inference request in 1.39701247215271s
Received healthy response to inference request in 1.4447054862976074s
Received healthy response to inference request in 1.4398818016052246s
Received healthy response to inference request in 1.3610188961029053s
Received healthy response to inference request in 1.6313056945800781s
Received healthy response to inference request in 1.3170206546783447s
Received healthy response to inference request in 1.461535930633545s
Received healthy response to inference request in 1.5823938846588135s
Received healthy response to inference request in 1.4818217754364014s
Received healthy response to inference request in 1.4721124172210693s
30 requests
0 failed requests
5th percentile: 1.3330147385597229
10th percentile: 1.3584438800811767
20th percentile: 1.3921786785125732
30th percentile: 1.409386444091797
40th percentile: 1.443873119354248
50th percentile: 1.4592643976211548
60th percentile: 1.475996160507202
70th percentile: 1.5428243875503538
2026-03-25T17:06:15.103947+00:00 monitor updated for chaiml-kimid-v12-mv1-to_30309_v2
80th percentile: 1.6194169521331787
90th percentile: 1.6372729778289796
95th percentile: 1.7143022656440734
99th percentile: 1.8771206188201905
mean time: 1.4967432260513305
Pipeline stage StressChecker completed in 316.63s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.22s
Shutdown handler de-registered
chaiml-kimid-v12-mv1-to_30309_v2 status is now deployed due to DeploymentManager action
chaiml-kimid-v12-mv1-to_30309_v2 status is now inactive due to auto deactivation removed underperforming models
chaiml-kimid-v12-mv1-to_30309_v2 status is now torndown due to DeploymentManager action