Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-v52-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-v52-uploader to finish
qwen-qwen3-5-35b-a3b-v52-uploader: Using quantization_mode: none
qwen-qwen3-5-35b-a3b-v52-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B...
qwen-qwen3-5-35b-a3b-v52-uploader: Downloaded in 25.553s
2026-03-25T18:52:58.294269+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
qwen-qwen3-5-35b-a3b-v52-uploader: Processed model Qwen/Qwen3.5-35B-A3B in 53.318s
qwen-qwen3-5-35b-a3b-v52-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-v52-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v52-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-v52-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-v52-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-v52-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v52-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v52-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v52-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v52-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v52-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v52-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v52-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v52-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-v52-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-v52-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-v52-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-v52-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-v52-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/.gitattributes
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/LICENSE
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/generation_config.json
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/README.md
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/config.json
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/merges.txt
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/vocab.json
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/tokenizer.json
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00014-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00014-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00009-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00001-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v52-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v52/default/model.safetensors-00003-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-v52-uploader completed after 83.35s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-v52-uploader
Pipeline stage VLLMUploader completed in 83.81s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.68s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-v52
Waiting for inference service qwen-qwen3-5-35b-a3b-v52 to be ready
2026-03-25T18:53:58.385322+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
2026-03-25T18:54:58.483632+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
2026-03-25T18:55:58.811735+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
Inference service qwen-qwen3-5-35b-a3b-v52 ready after 210.53461933135986s
Pipeline stage VLLMDeployer completed in 211.08s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-25T18:56:58.906263+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T18:57:59.000108+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T18:58:59.101090+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.0608575344085693s
Received healthy response to inference request in 1.94437575340271s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.917240858078003s
Received healthy response to inference request in 4.2583911418914795s
Received healthy response to inference request in 1.3410251140594482s
Received healthy response to inference request in 4.161833047866821s
2026-03-25T18:59:59.204939+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
Received healthy response to inference request in 11.72809910774231s
Received healthy response to inference request in 1.1737651824951172s
Received healthy response to inference request in 1.8245224952697754s
Received healthy response to inference request in 1.906099557876587s
Received healthy response to inference request in 1.7518563270568848s
Received healthy response to inference request in 1.702059030532837s
Received healthy response to inference request in 2.466306209564209s
Received healthy response to inference request in 1.814697027206421s
Received healthy response to inference request in 1.3005502223968506s
Received healthy response to inference request in 1.7995924949645996s
Received healthy response to inference request in 0.7158722877502441s
Received healthy response to inference request in 1.6417529582977295s
Received healthy response to inference request in 1.3176186084747314s
Received healthy response to inference request in 1.0061397552490234s
Received healthy response to inference request in 1.1464436054229736s
Received healthy response to inference request in 1.1123826503753662s
30 requests
8 failed requests
5th percentile: 1.0539490580558777
10th percentile: 1.1430375099182128
20th percentile: 1.3142049312591553
30th percentile: 1.6839672088623046
40th percentile: 1.8086552143096923
50th percentile: 1.9252376556396484
60th percentile: 3.4034108638763416
70th percentile: 6.499303531646707
80th percentile: 20.12763695716858
90th percentile: 20.143958973884583
95th percentile: 20.144960844516753
99th percentile: 20.146781497001648
mean time: 7.139397835731506
%s, retrying in %s seconds...
Received healthy response to inference request in 1.054192066192627s
Received healthy response to inference request in 1.0994324684143066s
Received healthy response to inference request in 1.4114980697631836s
Received healthy response to inference request in 0.6818499565124512s
Received healthy response to inference request in 1.037463665008545s
Received healthy response to inference request in 0.8813419342041016s
Received healthy response to inference request in 0.8635475635528564s
Received healthy response to inference request in 0.9708032608032227s
Received healthy response to inference request in 1.3103156089782715s
Received healthy response to inference request in 0.8669207096099854s
Received healthy response to inference request in 1.0934967994689941s
Received healthy response to inference request in 1.194584608078003s
Received healthy response to inference request in 1.0551295280456543s
Received healthy response to inference request in 1.014904499053955s
Received healthy response to inference request in 1.0035772323608398s
Received healthy response to inference request in 1.1759703159332275s
2026-03-25T19:00:59.299457+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v52
Received healthy response to inference request in 1.0136058330535889s
Received healthy response to inference request in 1.011448860168457s
Received healthy response to inference request in 1.0064451694488525s
Received healthy response to inference request in 1.0377237796783447s
Received healthy response to inference request in 1.2959074974060059s
Received healthy response to inference request in 1.2049365043640137s
Received healthy response to inference request in 1.21189284324646s
Received healthy response to inference request in 1.432260274887085s
Received healthy response to inference request in 1.193469762802124s
Received healthy response to inference request in 1.5815412998199463s
Received healthy response to inference request in 1.1845595836639404s
Received healthy response to inference request in 1.1388776302337646s
Received healthy response to inference request in 1.1056110858917236s
Received healthy response to inference request in 0.8935458660125732s
30 requests
0 failed requests
5th percentile: 0.8650654792785645
10th percentile: 0.8798998117446899
20th percentile: 0.9970224380493165
30th percentile: 1.0129587411880494
40th percentile: 1.0376197338104247
50th percentile: 1.0743131637573242
60th percentile: 1.11891770362854
70th percentile: 1.1872326374053954
80th percentile: 1.206327772140503
90th percentile: 1.320433855056763
95th percentile: 1.4229172825813292
99th percentile: 1.5382498025894167
mean time: 1.100895142555237
Pipeline stage StressChecker completed in 261.99s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.72s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b_v52 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b_v52 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b_v52 status is now torndown due to DeploymentManager action