Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-v35-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-v35-uploader to finish
qwen-qwen3-5-35b-a3b-v35-uploader: Using quantization_mode: none
qwen-qwen3-5-35b-a3b-v35-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B...
qwen-qwen3-5-35b-a3b-v35-uploader: Downloaded in 22.893s
2026-03-23T23:06:23.774570+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v35
qwen-qwen3-5-35b-a3b-v35-uploader: Processed model Qwen/Qwen3.5-35B-A3B in 50.552s
qwen-qwen3-5-35b-a3b-v35-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-v35-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v35-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-v35-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-v35-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-v35-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v35-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v35-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v35-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v35-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v35-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v35-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v35-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v35-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-v35-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-v35-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-v35-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-v35-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-v35-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/config.json
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/.gitattributes
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/LICENSE
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/README.md
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/generation_config.json
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/merges.txt
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/vocab.json
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/tokenizer.json
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00014-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00014-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00009-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v35-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v35/default/model.safetensors-00001-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-v35-uploader completed after 83.16s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-v35-uploader
Pipeline stage VLLMUploader completed in 83.66s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.20s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-v35
Waiting for inference service qwen-qwen3-5-35b-a3b-v35 to be ready
2026-03-23T23:07:23.863677+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v35
2026-03-23T23:08:23.959414+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v35
2026-03-23T23:09:24.044938+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v35
Inference service qwen-qwen3-5-35b-a3b-v35 ready after 210.6608304977417s
Pipeline stage VLLMDeployer completed in 211.17s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-23T23:10:24.133808+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v35
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-23T23:11:30.445803+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v35
Received healthy response to inference request in 11.68342900276184s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.2789366245269775s
2026-03-23T23:12:30.539909+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v35
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.6684842109680176s
Received healthy response to inference request in 0.9123375415802002s
Received healthy response to inference request in 3.9541099071502686s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.8894989490509033s
Received healthy response to inference request in 0.8643753528594971s
Received healthy response to inference request in 1.9217228889465332s
Received healthy response to inference request in 1.086205244064331s
Received healthy response to inference request in 1.6181471347808838s
Received healthy response to inference request in 1.423630714416504s
Received healthy response to inference request in 1.0235106945037842s
Received healthy response to inference request in 0.9483671188354492s
Received healthy response to inference request in 2.2578256130218506s
Received healthy response to inference request in 1.0534307956695557s
Received healthy response to inference request in 2.1771602630615234s
Received healthy response to inference request in 1.7281665802001953s
Received healthy response to inference request in 1.115457534790039s
Received healthy response to inference request in 1.188908338546753s
Received healthy response to inference request in 1.1468114852905273s
Received healthy response to inference request in 1.0702502727508545s
2026-03-23T23:13:30.636900+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v35
Received healthy response to inference request in 1.2379333972930908s
Received healthy response to inference request in 1.6227209568023682s
30 requests
7 failed requests
5th percentile: 0.9285508513450622
10th percentile: 1.0159963369369507
20th percentile: 1.0830142498016357
30th percentile: 1.1762792825698851
40th percentile: 1.5403405666351322
50th percentile: 1.6983253955841064
60th percentile: 2.2094264030456543
70th percentile: 4.051557922363281
80th percentile: 20.113834857940674
90th percentile: 20.12287309169769
95th percentile: 20.131926774978638
99th percentile: 20.22377767562866
mean time: 6.362195452054341
%s, retrying in %s seconds...
Received healthy response to inference request in 1.1985175609588623s
Received healthy response to inference request in 0.7209184169769287s
Received healthy response to inference request in 1.3258533477783203s
Received healthy response to inference request in 1.0728349685668945s
Received healthy response to inference request in 1.4612250328063965s
Received healthy response to inference request in 1.0333380699157715s
Received healthy response to inference request in 1.0377304553985596s
Received healthy response to inference request in 1.2222189903259277s
Received healthy response to inference request in 1.2015628814697266s
Received healthy response to inference request in 0.9737739562988281s
Received healthy response to inference request in 0.9768922328948975s
Received healthy response to inference request in 1.3258740901947021s
Received healthy response to inference request in 0.7527964115142822s
Received healthy response to inference request in 0.7305617332458496s
Received healthy response to inference request in 1.3693671226501465s
Received healthy response to inference request in 0.781313419342041s
Received healthy response to inference request in 0.8962938785552979s
Received healthy response to inference request in 1.0623843669891357s
Received healthy response to inference request in 2.097100257873535s
Received healthy response to inference request in 1.2690379619598389s
Received healthy response to inference request in 1.3426499366760254s
Received healthy response to inference request in 1.0285236835479736s
Received healthy response to inference request in 1.5167710781097412s
Received healthy response to inference request in 1.0698881149291992s
Received healthy response to inference request in 1.4850671291351318s
Received healthy response to inference request in 0.9961650371551514s
Received healthy response to inference request in 1.224095344543457s
Received healthy response to inference request in 1.0290203094482422s
Received healthy response to inference request in 1.0217182636260986s
Received healthy response to inference request in 1.176774501800537s
30 requests
0 failed requests
5th percentile: 0.7405673384666442
10th percentile: 0.7784617185592652
20th percentile: 0.9762685775756836
30th percentile: 1.026482057571411
40th percentile: 1.0359735012054443
50th percentile: 1.0713615417480469
60th percentile: 1.199735689163208
70th percentile: 1.2375781297683714
80th percentile: 1.329229259490967
90th percentile: 1.4636092424392702
95th percentile: 1.502504301071167
99th percentile: 1.9288047957420353
mean time: 1.1466756184895834
Pipeline stage StressChecker completed in 230.66s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.28s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b_v35 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b_v35 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b_v35 status is now torndown due to DeploymentManager action