Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-plc-30375-v1-uploader
Waiting for job on chaiml-pony-d3a-mv1-plc-30375-v1-uploader to finish
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: Using quantization_mode: none
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep1g8...
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: Downloaded in 33.716s
2026-03-25T14:56:01.867374+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: Processed model ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep1g8 in 59.613s
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/processor_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/processor_config.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/args.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/tokenizer_config.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/added_tokens.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/README.md
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/merges.txt
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/generation_config.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/preprocessor_config.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/chat_template.jinja
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/config.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/special_tokens_map.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/.gitattributes
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model.safetensors.index.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/vocab.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/tokenizer.json
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00016-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00016-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00007-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00007-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00013-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00013-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00010-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00010-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00004-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00004-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00008-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00008-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00001-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00001-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00014-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00014-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00011-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00011-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00005-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00005-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00002-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00002-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00003-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00003-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00012-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00012-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00009-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00009-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00015-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00015-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v1-uploader: cp /dev/shm/model_output/model-00006-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v1/default/model-00006-of-00016.safetensors
Job chaiml-pony-d3a-mv1-plc-30375-v1-uploader completed after 94.52s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-plc-30375-v1-uploader
Pipeline stage VLLMUploader completed in 95.16s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 3.20s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-plc-30375-v1
Waiting for inference service chaiml-pony-d3a-mv1-plc-30375-v1 to be ready
Failed to get response for submission chaiml-pony-d3a-mv1-son_96936_v1: ('http://chaiml-pony-d3a-mv1-son-96936-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'request timeout')
2026-03-25T14:57:01.959450+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
2026-03-25T14:58:02.046411+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
2026-03-25T14:59:02.134420+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
Inference service chaiml-pony-d3a-mv1-plc-30375-v1 ready after 150.34722089767456s
Pipeline stage VLLMDeployer completed in 150.81s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T15:00:02.228809+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 7.126107692718506s
Received healthy response to inference request in 2.584869384765625s
2026-03-25T15:01:02.314341+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
Received healthy response to inference request in 1.810486078262329s
Received healthy response to inference request in 6.278533697128296s
Received healthy response to inference request in 1.4470689296722412s
Received healthy response to inference request in 2.0002522468566895s
Received healthy response to inference request in 1.492002248764038s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T15:02:02.468534+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 13.870359420776367s
Received healthy response to inference request in 2.3564579486846924s
Received healthy response to inference request in 17.238901376724243s
Received healthy response to inference request in 1.4039618968963623s
Received healthy response to inference request in 2.2931079864501953s
Received healthy response to inference request in 1.8289482593536377s
Received healthy response to inference request in 2.1839439868927s
Received healthy response to inference request in 1.4388446807861328s
Received healthy response to inference request in 1.4783952236175537s
2026-03-25T15:03:02.560410+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
Received healthy response to inference request in 2.0462141036987305s
Received healthy response to inference request in 1.5312094688415527s
Received healthy response to inference request in 1.4762616157531738s
Received healthy response to inference request in 1.4536690711975098s
Received healthy response to inference request in 1.6669940948486328s
Received healthy response to inference request in 1.4428637027740479s
30 requests
8 failed requests
5th percentile: 1.4406532406806947
10th percentile: 1.446648406982422
20th percentile: 1.4779685020446778
30th percentile: 1.6262587070465087
40th percentile: 1.931730651855469
50th percentile: 2.2385259866714478
60th percentile: 4.062335109710688
70th percentile: 14.88092200756072
80th percentile: 20.11980471611023
90th percentile: 20.141295981407165
95th percentile: 20.203064250946046
99th percentile: 20.298557965755464
mean time: 7.926869527498881
%s, retrying in %s seconds...
Received healthy response to inference request in 1.715759515762329s
Received healthy response to inference request in 1.3389256000518799s
Received healthy response to inference request in 1.382767677307129s
Received healthy response to inference request in 1.5167365074157715s
Received healthy response to inference request in 1.345909833908081s
Received healthy response to inference request in 1.6975371837615967s
Received healthy response to inference request in 1.46641206741333s
Received healthy response to inference request in 1.3225831985473633s
Received healthy response to inference request in 1.5343396663665771s
Received healthy response to inference request in 1.6908562183380127s
Received healthy response to inference request in 1.3795440196990967s
Received healthy response to inference request in 1.373856782913208s
Received healthy response to inference request in 1.3798608779907227s
Received healthy response to inference request in 1.3865880966186523s
Received healthy response to inference request in 1.507150411605835s
Received healthy response to inference request in 2.247912883758545s
Received healthy response to inference request in 1.4258534908294678s
Received healthy response to inference request in 1.5272853374481201s
Received healthy response to inference request in 1.6015233993530273s
Received healthy response to inference request in 1.9341270923614502s
Received healthy response to inference request in 1.3897039890289307s
Received healthy response to inference request in 1.479701280593872s
Received healthy response to inference request in 1.5611133575439453s
Received healthy response to inference request in 1.5314726829528809s
Received healthy response to inference request in 1.4310023784637451s
Received healthy response to inference request in 1.4322996139526367s
Received healthy response to inference request in 1.414491891860962s
Received healthy response to inference request in 1.508617877960205s
Received healthy response to inference request in 1.4407994747161865s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 1.478912591934204s
30 requests
0 failed requests
5th percentile: 1.3420685052871704
10th percentile: 1.3710620880126954
20th percentile: 1.3821863174438476
30th percentile: 1.4070555210113525
40th percentile: 1.43178071975708
50th percentile: 1.472662329673767
60th percentile: 1.507737398147583
70th percentile: 1.5285415410995484
80th percentile: 1.5691953659057618
90th percentile: 1.69935941696167
95th percentile: 1.835861682891845
99th percentile: 2.1569150042533876
mean time: 1.5147881666819254
Pipeline stage StressChecker completed in 289.51s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
2026-03-25T15:04:02.659151+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v1
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.69s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-plc_30375_v1 status is now deployed due to DeploymentManager action
chaiml-pony-d3a-mv1-plc_30375_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3a-mv1-plc_30375_v1 status is now torndown due to DeploymentManager action