Shutdown handler not registered because Python interpreter is not running in the main thread
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-fp8-v9-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-fp8-v9-uploader to finish
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: Using quantization_mode: fp8
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: Repo Qwen/Qwen3.5-35B-A3B-FP8 already ends in FP8. Skipping...
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: Checking if Qwen/Qwen3.5-35B-A3B-FP8 already exists in ChaiML
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: Model already exists. Downloading to /dev/shm/model_output...
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B-FP8...
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: Downloaded in 13.917s
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: Processed model Qwen/Qwen3.5-35B-A3B-FP8 in 16.487s
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/.gitattributes
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/generation_config.json
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/config.json
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/README.md
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/merges.txt
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/configuration.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/configuration.json
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/LICENSE
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/vocab.json
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00014-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00014-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00001-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00003-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v9-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v9/default/model.safetensors-00009-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-fp8-v9-uploader completed after 57.18s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-fp8-v9-uploader
Pipeline stage VLLMUploader completed in 58.29s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.97s
Running pipeline stage VLLMDeployer
2026-03-25T19:26:36.083475+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v9
Creating inference service qwen-qwen3-5-35b-a3b-fp8-v9
Waiting for inference service qwen-qwen3-5-35b-a3b-fp8-v9 to be ready
2026-03-25T19:27:36.366303+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v9
2026-03-25T19:28:36.550565+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v9
Inference service qwen-qwen3-5-35b-a3b-fp8-v9 ready after 161.76631546020508s
Pipeline stage VLLMDeployer completed in 162.87s
Running pipeline stage StressChecker
2026-03-25T19:29:36.895311+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v9
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v9-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v9-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v9-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-25T19:30:37.139962+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v9
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v9-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v9-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.4996533393859863s
Received healthy response to inference request in 7.954771041870117s
Received healthy response to inference request in 20.075941801071167s
Received healthy response to inference request in 1.8555400371551514s
2026-03-25T19:31:37.320343+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v9
Received healthy response to inference request in 3.506997585296631s
Received healthy response to inference request in 2.000943183898926s
Received healthy response to inference request in 2.035580635070801s
Received healthy response to inference request in 2.94850754737854s
Received healthy response to inference request in 2.0963077545166016s
Received healthy response to inference request in 2.8229479789733887s
Received healthy response to inference request in 2.2802398204803467s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v9-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.1224422454833984s
Received healthy response to inference request in 2.4637956619262695s
Received healthy response to inference request in 2.1745553016662598s
Received healthy response to inference request in 2.996955633163452s
2026-03-25T19:32:37.496472+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v9
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v9-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.0183517932891846s
Received healthy response to inference request in 2.1760129928588867s
Received healthy response to inference request in 1.8230109214782715s
Received healthy response to inference request in 2.0900533199310303s
Received healthy response to inference request in 1.9389407634735107s
Received healthy response to inference request in 2.288165330886841s
Received healthy response to inference request in 3.0662384033203125s
Received healthy response to inference request in 2.140580177307129s
30 requests
7 failed requests
5th percentile: 1.893070363998413
10th percentile: 1.9947429418563842
20th percentile: 2.0791587829589844
30th percentile: 2.1351387977600096
40th percentile: 2.238549089431763
50th percentile: 2.643371820449829
60th percentile: 3.024668741226196
70th percentile: 4.841329622268664
80th percentile: 20.410417890548707
90th percentile: 20.46089506149292
95th percentile: 20.559009540081025
99th percentile: 20.808336656093598
mean time: 7.404035131136577
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9905784130096436s
Received healthy response to inference request in 2.0155797004699707s
Received healthy response to inference request in 1.943835735321045s
Received healthy response to inference request in 1.8911807537078857s
Received healthy response to inference request in 2.131676435470581s
Received healthy response to inference request in 2.1604502201080322s
Received healthy response to inference request in 1.9476346969604492s
Received healthy response to inference request in 1.840334415435791s
Received healthy response to inference request in 2.093677520751953s
Received healthy response to inference request in 1.9620556831359863s
Received healthy response to inference request in 2.0036497116088867s
Received healthy response to inference request in 2.058459997177124s
Received healthy response to inference request in 1.8903191089630127s
Received healthy response to inference request in 2.225114583969116s
Received healthy response to inference request in 2.0072576999664307s
2026-03-25T19:33:37.985562+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v9
Received healthy response to inference request in 2.3341054916381836s
Received healthy response to inference request in 2.139350652694702s
Received healthy response to inference request in 2.5271291732788086s
Received healthy response to inference request in 2.053995132446289s
Received healthy response to inference request in 1.9822478294372559s
Received healthy response to inference request in 1.8722789287567139s
Received healthy response to inference request in 2.273447275161743s
Received healthy response to inference request in 2.314058303833008s
Received healthy response to inference request in 1.9898695945739746s
Received healthy response to inference request in 2.1287920475006104s
Received healthy response to inference request in 1.9215588569641113s
Received healthy response to inference request in 2.739072322845459s
Received healthy response to inference request in 1.4755220413208008s
Received healthy response to inference request in 1.474170207977295s
Received healthy response to inference request in 1.9630041122436523s
30 requests
0 failed requests
5th percentile: 1.6396876096725466
10th percentile: 1.8690844774246216
20th percentile: 1.9154832363128662
30th percentile: 1.9577293872833252
40th percentile: 1.986820888519287
50th percentile: 2.0054537057876587
60th percentile: 2.0557810783386232
70th percentile: 2.1296573638916017
80th percentile: 2.173383092880249
90th percentile: 2.3160630226135255
95th percentile: 2.440268516540527
99th percentile: 2.6776088094711303
mean time: 2.0450135548909505
Pipeline stage StressChecker completed in 294.14s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.29s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b-fp8_v9 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b-fp8_v9 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b-fp8_v9 status is now torndown due to DeploymentManager action