Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-fp8-v3-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-fp8-v3-uploader to finish
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: Using quantization_mode: fp8
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: Repo Qwen/Qwen3.5-35B-A3B-FP8 already ends in FP8. Skipping...
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: Checking if Qwen/Qwen3.5-35B-A3B-FP8 already exists in ChaiML
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B-FP8...
2026-03-24T23:30:02.447679+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/config.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/README.md
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/LICENSE
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/generation_config.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/configuration.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/configuration.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/.gitattributes
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/merges.txt
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/vocab.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/tokenizer.json
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00014-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00014-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00003-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00001-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v3-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v3/default/model.safetensors-00009-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-fp8-v3-uploader completed after 78.6s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-fp8-v3-uploader
Pipeline stage VLLMUploader completed in 79.77s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.06s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-fp8-v3
Waiting for inference service qwen-qwen3-5-35b-a3b-fp8-v3 to be ready
2026-03-24T23:31:02.619809+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
2026-03-24T23:32:02.812600+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
2026-03-24T23:33:03.057380+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
Inference service qwen-qwen3-5-35b-a3b-fp8-v3 ready after 172.2009470462799s
Pipeline stage VLLMDeployer completed in 185.04s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-24T23:34:03.306540+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 11.53198766708374s
Received healthy response to inference request in 1.9787638187408447s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.1701860427856445s
Received healthy response to inference request in 3.8139591217041016s
2026-03-24T23:35:03.499993+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.133466958999634s
Received healthy response to inference request in 2.001633644104004s
Received healthy response to inference request in 1.9797310829162598s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.7511157989501953s
2026-03-24T23:36:03.696136+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.685474157333374s
Received healthy response to inference request in 2.2462880611419678s
Received healthy response to inference request in 2.9892380237579346s
Received healthy response to inference request in 2.2829980850219727s
Received healthy response to inference request in 20.42040753364563s
2026-03-24T23:37:03.941237+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
Received healthy response to inference request in 2.267498254776001s
Received healthy response to inference request in 2.0424816608428955s
Received healthy response to inference request in 2.2476577758789062s
Received healthy response to inference request in 2.9320054054260254s
Received healthy response to inference request in 2.421052932739258s
Received healthy response to inference request in 3.0177741050720215s
Received healthy response to inference request in 3.7247936725616455s
Received healthy response to inference request in 2.154242753982544s
Received healthy response to inference request in 3.752321481704712s
Received healthy response to inference request in 2.488884925842285s
30 requests
7 failed requests
5th percentile: 1.9895872354507447
10th percentile: 2.0383968591690063
20th percentile: 2.1669973850250246
30th percentile: 2.2615461111068726
40th percentile: 2.4617521286010744
50th percentile: 2.96062171459198
60th percentile: 3.7012019634246824
70th percentile: 6.1293676853179715
80th percentile: 20.456520748138427
90th percentile: 20.539837265014647
95th percentile: 20.68093957901001
99th percentile: 26.69395423889161
mean time: 7.981526025136312
%s, retrying in %s seconds...
Received healthy response to inference request in 3.340566873550415s
Received healthy response to inference request in 2.183466911315918s
Received healthy response to inference request in 2.225909471511841s
Received healthy response to inference request in 2.1376712322235107s
Received healthy response to inference request in 2.3726420402526855s
Received healthy response to inference request in 2.15351939201355s
Received healthy response to inference request in 2.015174388885498s
Received healthy response to inference request in 2.3750357627868652s
Received healthy response to inference request in 2.054288864135742s
Received healthy response to inference request in 1.896270513534546s
Received healthy response to inference request in 1.987912654876709s
2026-03-24T23:38:04.824185+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v3
Received healthy response to inference request in 1.951962947845459s
Received healthy response to inference request in 2.1156954765319824s
Received healthy response to inference request in 2.0874619483947754s
Received healthy response to inference request in 1.8836894035339355s
Received healthy response to inference request in 2.013960361480713s
Received healthy response to inference request in 2.1909515857696533s
Received healthy response to inference request in 2.1295993328094482s
Received healthy response to inference request in 1.921478509902954s
Received healthy response to inference request in 3.0376882553100586s
Received healthy response to inference request in 1.9512147903442383s
Received healthy response to inference request in 1.9579622745513916s
Received healthy response to inference request in 1.9635169506072998s
Received healthy response to inference request in 2.624441385269165s
Received healthy response to inference request in 2.0818891525268555s
Received healthy response to inference request in 1.9552767276763916s
Received healthy response to inference request in 2.256201982498169s
Received healthy response to inference request in 2.1391665935516357s
Received healthy response to inference request in 2.2656748294830322s
Received healthy response to inference request in 2.0410525798797607s
30 requests
0 failed requests
5th percentile: 1.9076141119003296
10th percentile: 1.94824116230011
20th percentile: 1.9574251651763916
30th percentile: 2.006146049499512
40th percentile: 2.0489943504333494
50th percentile: 2.101578712463379
60th percentile: 2.138269376754761
70th percentile: 2.1857123136520387
80th percentile: 2.2580965518951417
90th percentile: 2.3999763250350954
95th percentile: 2.8517271637916553
99th percentile: 3.2527320742607118
mean time: 2.1770447731018066
Pipeline stage StressChecker completed in 318.59s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.27s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b-fp8_v3 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b-fp8_v3 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b-fp8_v3 status is now torndown due to DeploymentManager action