Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-plc-89556-v5-uploader
Waiting for job on chaiml-pony-d3a-mv1-plc-89556-v5-uploader to finish
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: Using quantization_mode: fp8
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: Checking if ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep2g8-FP8...
2026-03-28T14:49:17.762554+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: Downloaded in 34.507s
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: Processed model ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep2g8 in 36.977s
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default/generation_config.json
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default/config.json
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default/tokenizer_config.json
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default/.gitattributes
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default/recipe.yaml
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default/chat_template.jinja
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default/tokenizer.json
2026-03-28T14:50:17.854177+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
chaiml-pony-d3a-mv1-plc-89556-v5-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v5/default/model.safetensors
Job chaiml-pony-d3a-mv1-plc-89556-v5-uploader completed after 143.03s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-plc-89556-v5-uploader
Pipeline stage VLLMUploader completed in 143.59s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.87s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-plc-89556-v5
Waiting for inference service chaiml-pony-d3a-mv1-plc-89556-v5 to be ready
2026-03-28T14:51:18.013733+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
2026-03-28T14:52:18.110081+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
2026-03-28T14:53:18.201865+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
Inference service chaiml-pony-d3a-mv1-plc-89556-v5 ready after 191.1566150188446s
Pipeline stage VLLMDeployer completed in 191.77s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:54:18.304561+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:55:18.464869+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
Received healthy response to inference request in 5.731764078140259s
Received healthy response to inference request in 7.7516374588012695s
Received healthy response to inference request in 1.6246070861816406s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.5095582008361816s
2026-03-28T14:56:18.586175+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
Received healthy response to inference request in 6.128734350204468s
Received healthy response to inference request in 1.4814245700836182s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.4019415378570557s
Received healthy response to inference request in 6.40857458114624s
Received healthy response to inference request in 1.2617557048797607s
Received healthy response to inference request in 1.3817121982574463s
Received healthy response to inference request in 11.228424072265625s
Received healthy response to inference request in 1.8043506145477295s
Received healthy response to inference request in 4.065707683563232s
Received healthy response to inference request in 1.450188159942627s
Received healthy response to inference request in 1.4315550327301025s
Received healthy response to inference request in 1.3903169631958008s
Received healthy response to inference request in 1.9159486293792725s
Received healthy response to inference request in 1.6486263275146484s
2026-03-28T14:57:18.683383+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v5
Received healthy response to inference request in 1.2703168392181396s
Received healthy response to inference request in 1.50482177734375s
Received healthy response to inference request in 1.6629397869110107s
Received healthy response to inference request in 1.4455914497375488s
Received healthy response to inference request in 1.2998549938201904s
30 requests
7 failed requests
5th percentile: 1.2836090087890626
10th percentile: 1.3735264778137208
20th percentile: 1.4256323337554933
30th percentile: 1.4720536470413208
40th percentile: 1.5785875320434573
50th percentile: 1.7336452007293701
60th percentile: 4.7321302413940405
70th percentile: 6.811493444442745
80th percentile: 20.11309666633606
90th percentile: 20.127679705619812
95th percentile: 20.138679242134096
99th percentile: 20.40089287042618
mean time: 6.935514609018962
%s, retrying in %s seconds...
Received healthy response to inference request in 1.256171703338623s
Received healthy response to inference request in 1.4492449760437012s
Received healthy response to inference request in 1.2553398609161377s
Received healthy response to inference request in 1.233898639678955s
Received healthy response to inference request in 1.2169578075408936s
Received healthy response to inference request in 1.2721326351165771s
Received healthy response to inference request in 1.2489149570465088s
Received healthy response to inference request in 1.3715860843658447s
Received healthy response to inference request in 2.04227352142334s
Received healthy response to inference request in 1.2919645309448242s
Received healthy response to inference request in 1.427070140838623s
Received healthy response to inference request in 1.2903623580932617s
Received healthy response to inference request in 1.3122289180755615s
Received healthy response to inference request in 1.3243072032928467s
Received healthy response to inference request in 1.322685718536377s
Received healthy response to inference request in 1.3335297107696533s
Received healthy response to inference request in 1.2504868507385254s
Received healthy response to inference request in 1.2594211101531982s
Received healthy response to inference request in 1.3716208934783936s
Received healthy response to inference request in 1.3776166439056396s
Received healthy response to inference request in 1.3836760520935059s
Received healthy response to inference request in 1.2715754508972168s
Received healthy response to inference request in 1.4826128482818604s
Received healthy response to inference request in 1.4145796298980713s
Received healthy response to inference request in 1.2905595302581787s
Received healthy response to inference request in 1.7529394626617432s
Received healthy response to inference request in 1.3547084331512451s
Received healthy response to inference request in 1.3026173114776611s
Received healthy response to inference request in 1.304474115371704s
Received healthy response to inference request in 1.2973194122314453s
30 requests
0 failed requests
5th percentile: 1.2406559824943542
10th percentile: 1.2503296613693238
20th percentile: 1.2587712287902832
30th percentile: 1.2848934412002564
40th percentile: 1.2951774597167969
50th percentile: 1.3083515167236328
60th percentile: 1.3279962062835693
70th percentile: 1.3715965270996093
80th percentile: 1.3898567676544191
90th percentile: 1.452581763267517
95th percentile: 1.6312924861907951
99th percentile: 1.958366644382477
mean time: 1.358762550354004
Pipeline stage StressChecker completed in 254.65s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.29s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-plc_89556_v5 status is now deployed due to DeploymentManager action
chaiml-pony-d3a-mv1-plc_89556_v5 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3a-mv1-plc_89556_v5 status is now torndown due to DeploymentManager action