Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-v1-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-v1-uploader to finish
qwen-qwen3-5-35b-a3b-v1-uploader: Using quantization_mode: none
qwen-qwen3-5-35b-a3b-v1-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B...
qwen-qwen3-5-35b-a3b-v1-uploader: Downloaded in 23.149s
qwen-qwen3-5-35b-a3b-v1-uploader: Processed model Qwen/Qwen3.5-35B-A3B in 50.350s
2026-03-20T16:15:20.914345+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
qwen-qwen3-5-35b-a3b-v1-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/config.json
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/LICENSE
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/README.md
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/.gitattributes
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/merges.txt
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/generation_config.json
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/tokenizer.json
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/vocab.json
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00003-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00001-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00009-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v1-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v1/default/model.safetensors-00004-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-v1-uploader completed after 91.22s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-v1-uploader
Pipeline stage VLLMUploader completed in 92.53s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.74s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-v1
Waiting for inference service qwen-qwen3-5-35b-a3b-v1 to be ready
2026-03-20T16:16:27.615308+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
2026-03-20T16:17:27.811146+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
2026-03-20T16:18:28.035505+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
2026-03-20T16:19:28.276759+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
2026-03-20T16:20:28.509858+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
2026-03-20T16:21:28.725448+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
Inference service qwen-qwen3-5-35b-a3b-v1 ready after 334.799880027771s
Pipeline stage VLLMDeployer completed in 336.61s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-20T16:22:28.933079+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.807224988937378s
Received healthy response to inference request in 2.0983798503875732s
2026-03-20T16:23:29.165475+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.390596389770508s
2026-03-20T16:24:29.391127+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
Received healthy response to inference request in 3.43447208404541s
Received healthy response to inference request in 3.882758617401123s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.33986234664917s
Received healthy response to inference request in 1.0635473728179932s
Received healthy response to inference request in 1.2284657955169678s
Received healthy response to inference request in 1.6369802951812744s
Received healthy response to inference request in 1.8735530376434326s
Received healthy response to inference request in 1.6854641437530518s
Received healthy response to inference request in 2.576777935028076s
Received healthy response to inference request in 1.8301949501037598s
Received healthy response to inference request in 2.763336420059204s
Received healthy response to inference request in 1.0414273738861084s
Received healthy response to inference request in 1.4073312282562256s
Received healthy response to inference request in 1.1914710998535156s
Received healthy response to inference request in 1.69303297996521s
Received healthy response to inference request in 1.3671574592590332s
Received healthy response to inference request in 1.1629068851470947s
Received healthy response to inference request in 1.1128778457641602s
30 requests
9 failed requests
5th percentile: 1.0857460856437684
10th percentile: 1.1579039812088012
20th percentile: 1.3175830364227297
30th percentile: 1.5680855751037597
40th percentile: 1.77533016204834
50th percentile: 2.3375788927078247
60th percentile: 3.5835732460021967
70th percentile: 9.143000125884964
80th percentile: 20.320705366134643
90th percentile: 20.351480031013487
95th percentile: 20.40867644548416
99th percentile: 20.533603060245515
mean time: 7.526637935638428
%s, retrying in %s seconds...
Received healthy response to inference request in 1.0002920627593994s
Received healthy response to inference request in 1.5293447971343994s
Received healthy response to inference request in 1.0464680194854736s
Received healthy response to inference request in 1.165653944015503s
2026-03-20T16:25:29.691054+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v1
Received healthy response to inference request in 1.3173606395721436s
Received healthy response to inference request in 1.152921199798584s
Received healthy response to inference request in 1.434645175933838s
Received healthy response to inference request in 1.133061170578003s
Received healthy response to inference request in 1.3107523918151855s
Received healthy response to inference request in 1.2185544967651367s
Received healthy response to inference request in 1.112774133682251s
Received healthy response to inference request in 1.2205758094787598s
Received healthy response to inference request in 1.2022907733917236s
Received healthy response to inference request in 1.1070854663848877s
Received healthy response to inference request in 1.4342231750488281s
Received healthy response to inference request in 1.1873488426208496s
Received healthy response to inference request in 1.2218539714813232s
Received healthy response to inference request in 1.2924425601959229s
Received healthy response to inference request in 1.5457000732421875s
Received healthy response to inference request in 1.3558375835418701s
Received healthy response to inference request in 1.478461742401123s
Received healthy response to inference request in 1.3196814060211182s
Received healthy response to inference request in 1.6210670471191406s
Received healthy response to inference request in 1.1477677822113037s
Received healthy response to inference request in 1.1904613971710205s
Received healthy response to inference request in 1.2858333587646484s
Received healthy response to inference request in 1.208235502243042s
Received healthy response to inference request in 2.2856101989746094s
Received healthy response to inference request in 1.0079395771026611s
Received healthy response to inference request in 1.2073850631713867s
30 requests
0 failed requests
5th percentile: 1.0252773761749268
10th percentile: 1.1010237216949463
20th percentile: 1.1448264598846436
30th percentile: 1.1808403730392456
40th percentile: 1.2053473472595215
50th percentile: 1.2195651531219482
60th percentile: 1.2884770393371583
70th percentile: 1.318056869506836
80th percentile: 1.43430757522583
90th percentile: 1.5309803247451783
95th percentile: 1.5871519088745114
99th percentile: 2.092892684936524
mean time: 1.291387645403544
Pipeline stage StressChecker completed in 274.91s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.43s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b_v1 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b_v1 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b_v1 status is now torndown due to DeploymentManager action