Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-plc-89556-v6-uploader
Waiting for job on chaiml-pony-d3a-mv1-plc-89556-v6-uploader to finish
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: Using quantization_mode: fp8
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: Checking if ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep2g8-FP8...
2026-03-28T17:14:43.521381+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: Downloaded in 35.420s
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: Processed model ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep2g8 in 38.217s
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default/.gitattributes
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default/generation_config.json
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default/tokenizer_config.json
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default/config.json
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default/recipe.yaml
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default/chat_template.jinja
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default/tokenizer.json
2026-03-28T17:15:43.785897+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
Failed to get request counts for guanaco-submitter. Falling back to default
chaiml-pony-d3a-mv1-plc-89556-v6-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-89556-v6/default/model.safetensors
Job chaiml-pony-d3a-mv1-plc-89556-v6-uploader completed after 153.23s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-plc-89556-v6-uploader
Pipeline stage VLLMUploader completed in 153.92s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.10s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.47s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-plc-89556-v6
Waiting for inference service chaiml-pony-d3a-mv1-plc-89556-v6 to be ready
2026-03-28T17:16:43.895165+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
2026-03-28T17:17:44.330391+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
2026-03-28T17:18:44.515172+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
2026-03-28T17:19:44.620735+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
Inference service chaiml-pony-d3a-mv1-plc-89556-v6 ready after 240.66518592834473s
Pipeline stage VLLMDeployer completed in 241.12s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T17:20:44.733891+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 13.211475849151611s
2026-03-28T17:21:44.834220+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.63421630859375s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 10.929325580596924s
Received healthy response to inference request in 2.621171712875366s
Received healthy response to inference request in 2.0824127197265625s
Received healthy response to inference request in 2.1591718196868896s
2026-03-28T17:22:44.931518+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 13.183998107910156s
Received healthy response to inference request in 2.6063055992126465s
Received healthy response to inference request in 2.220069646835327s
Received healthy response to inference request in 2.1374263763427734s
Received healthy response to inference request in 2.1806206703186035s
Received healthy response to inference request in 2.0984604358673096s
Received healthy response to inference request in 2.311516761779785s
2026-03-28T17:23:45.097197+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
Received healthy response to inference request in 2.623492479324341s
Received healthy response to inference request in 2.1885263919830322s
Received healthy response to inference request in 2.164534091949463s
Received healthy response to inference request in 2.3227126598358154s
Received healthy response to inference request in 5.804800033569336s
Received healthy response to inference request in 2.1141035556793213s
Received healthy response to inference request in 2.237966775894165s
Received healthy response to inference request in 2.615086793899536s
Received healthy response to inference request in 2.0997049808502197s
Received healthy response to inference request in 2.376051187515259s
30 requests
7 failed requests
5th percentile: 2.099020481109619
10th percentile: 2.1126636981964113
20th percentile: 2.1634616374969484
30th percentile: 2.2106066703796388
40th percentile: 2.3182343006134034
50th percentile: 2.6106961965560913
60th percentile: 2.6277820110321044
70th percentile: 11.605727338790887
80th percentile: 20.125398778915404
90th percentile: 20.14461271762848
95th percentile: 20.1572909116745
99th percentile: 20.169998149871827
mean time: 7.598096132278442
%s, retrying in %s seconds...
Received healthy response to inference request in 3.837265968322754s
Received healthy response to inference request in 2.083933115005493s
Received healthy response to inference request in 2.076904296875s
Received healthy response to inference request in 2.166717529296875s
Received healthy response to inference request in 2.09325909614563s
Received healthy response to inference request in 2.128898859024048s
Received healthy response to inference request in 2.1726248264312744s
Received healthy response to inference request in 2.0852901935577393s
Received healthy response to inference request in 2.160780191421509s
Received healthy response to inference request in 2.179211139678955s
Received healthy response to inference request in 2.228248357772827s
Received healthy response to inference request in 2.2498908042907715s
Received healthy response to inference request in 2.1426401138305664s
Received healthy response to inference request in 2.1776130199432373s
2026-03-28T17:24:45.199535+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_89556_v6
Received healthy response to inference request in 2.2423534393310547s
Received healthy response to inference request in 2.1039109230041504s
Received healthy response to inference request in 2.1610515117645264s
Received healthy response to inference request in 2.117616891860962s
Received healthy response to inference request in 2.5654938220977783s
Received healthy response to inference request in 2.09828782081604s
Received healthy response to inference request in 2.1449763774871826s
Received healthy response to inference request in 2.53306245803833s
Received healthy response to inference request in 2.256833553314209s
Received healthy response to inference request in 2.1909279823303223s
Received healthy response to inference request in 2.10416316986084s
Received healthy response to inference request in 2.3687057495117188s
Received healthy response to inference request in 2.117302894592285s
Received healthy response to inference request in 2.206192970275879s
Received healthy response to inference request in 2.103381872177124s
Received healthy response to inference request in 2.1122500896453857s
30 requests
0 failed requests
5th percentile: 2.084543800354004
10th percentile: 2.0924622058868407
20th percentile: 2.103805112838745
30th percentile: 2.1157870531082152
40th percentile: 2.137143611907959
50th percentile: 2.1609158515930176
60th percentile: 2.1746201038360597
70th percentile: 2.1955074787139894
80th percentile: 2.2438609123229982
90th percentile: 2.38514142036438
95th percentile: 2.5508997082710265
99th percentile: 3.468452045917512
mean time: 2.2403263012568155
Pipeline stage StressChecker completed in 312.13s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.79s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-plc_89556_v6 status is now deployed due to DeploymentManager action