Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-v43-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-v43-uploader to finish
qwen-qwen3-5-35b-a3b-v43-uploader: Using quantization_mode: fp8
2026-03-24T01:09:28.012439+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
qwen-qwen3-5-35b-a3b-v43-uploader: Downloaded in 48.261s
qwen-qwen3-5-35b-a3b-v43-uploader: Processed model Qwen/Qwen3.5-35B-A3B in 48.823s
qwen-qwen3-5-35b-a3b-v43-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-v43-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v43-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-v43-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-v43-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-v43-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v43-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v43-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v43-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v43-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v43-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v43-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v43-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v43-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-v43-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-v43-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-v43-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-v43-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-v43-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default
qwen-qwen3-5-35b-a3b-v43-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-v43-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default/config.json
qwen-qwen3-5-35b-a3b-v43-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default/generation_config.json
qwen-qwen3-5-35b-a3b-v43-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default/.gitattributes
qwen-qwen3-5-35b-a3b-v43-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-v43-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default/recipe.yaml
qwen-qwen3-5-35b-a3b-v43-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default/tokenizer.json
2026-03-24T01:10:28.179376+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
qwen-qwen3-5-35b-a3b-v43-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v43/default/model.safetensors
Job qwen-qwen3-5-35b-a3b-v43-uploader completed after 127.64s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-v43-uploader
Pipeline stage VLLMUploader completed in 128.72s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.97s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-v43
Waiting for inference service qwen-qwen3-5-35b-a3b-v43 to be ready
2026-03-24T01:11:28.368061+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
Retrying (%r) after connection broken by '%r': %s
2026-03-24T01:12:28.535984+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
Inference service qwen-qwen3-5-35b-a3b-v43 ready after 162.6375858783722s
Pipeline stage VLLMDeployer completed in 164.03s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-24T01:13:28.717314+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v43-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v43-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v43-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-24T01:14:28.888756+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v43-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.414415121078491s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v43-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-24T01:15:29.063038+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v43-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v43-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.510065793991089s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v43-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.8941822052001953s
Received healthy response to inference request in 4.419306516647339s
Received healthy response to inference request in 2.4702277183532715s
Received healthy response to inference request in 3.5817959308624268s
2026-03-24T01:16:29.242361+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
Received healthy response to inference request in 2.190176486968994s
Received healthy response to inference request in 1.8431954383850098s
Received healthy response to inference request in 2.0131711959838867s
Received healthy response to inference request in 1.807645320892334s
Received healthy response to inference request in 1.8056669235229492s
Received healthy response to inference request in 1.5719945430755615s
Received healthy response to inference request in 1.8254189491271973s
Received healthy response to inference request in 2.0321192741394043s
Received healthy response to inference request in 1.9616143703460693s
Received healthy response to inference request in 1.3791134357452393s
Received healthy response to inference request in 2.271596908569336s
Received healthy response to inference request in 1.5820956230163574s
Received healthy response to inference request in 3.6041932106018066s
Received healthy response to inference request in 1.7584261894226074s
Received healthy response to inference request in 1.9522449970245361s
Received healthy response to inference request in 1.6263470649719238s
30 requests
8 failed requests
5th percentile: 1.5765400290489198
10th percentile: 1.621921920776367
20th percentile: 1.8072496414184571
30th percentile: 1.8788861751556396
40th percentile: 1.9925484657287598
50th percentile: 2.230886697769165
60th percentile: 3.538757848739624
70th percentile: 4.415882539749146
80th percentile: 20.3986056804657
90th percentile: 20.426310086250304
95th percentile: 20.641495645046234
99th percentile: 20.83609215259552
mean time: 7.188673226038615
%s, retrying in %s seconds...
Received healthy response to inference request in 2.146308422088623s
Received healthy response to inference request in 2.011796712875366s
Received healthy response to inference request in 1.9016072750091553s
Received healthy response to inference request in 2.2083821296691895s
Received healthy response to inference request in 2.0165185928344727s
Received healthy response to inference request in 2.250105142593384s
Received healthy response to inference request in 1.1661756038665771s
Received healthy response to inference request in 1.4048666954040527s
Received healthy response to inference request in 2.1413140296936035s
Received healthy response to inference request in 1.210907220840454s
Received healthy response to inference request in 2.1265342235565186s
Received healthy response to inference request in 1.8697304725646973s
Received healthy response to inference request in 1.8503360748291016s
2026-03-24T01:17:29.437953+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v43
Received healthy response to inference request in 2.001007080078125s
Received healthy response to inference request in 1.9635047912597656s
Received healthy response to inference request in 1.742356538772583s
Received healthy response to inference request in 1.7389330863952637s
Received healthy response to inference request in 2.038919448852539s
Received healthy response to inference request in 1.611372709274292s
Received healthy response to inference request in 1.5392286777496338s
Received healthy response to inference request in 2.1614198684692383s
Received healthy response to inference request in 1.8922216892242432s
Received healthy response to inference request in 1.8215444087982178s
Received healthy response to inference request in 2.278231620788574s
Received healthy response to inference request in 1.5482449531555176s
Received healthy response to inference request in 1.8107295036315918s
Received healthy response to inference request in 1.6643102169036865s
Received healthy response to inference request in 1.9732184410095215s
Received healthy response to inference request in 1.3835670948028564s
Received healthy response to inference request in 1.8743538856506348s
30 requests
0 failed requests
5th percentile: 1.2886041641235353
10th percentile: 1.402736735343933
20th percentile: 1.5987471580505372
30th percentile: 1.7413295030593872
40th percentile: 1.8388194084167482
50th percentile: 1.883287787437439
60th percentile: 1.967390251159668
70th percentile: 2.013213276863098
80th percentile: 2.1294901847839354
90th percentile: 2.1661160945892335
95th percentile: 2.231329786777496
99th percentile: 2.270074942111969
mean time: 1.8449248870213826
Pipeline stage StressChecker completed in 280.11s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.27s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b_v43 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b_v43 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b_v43 status is now torndown due to DeploymentManager action