Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-fp8-v6-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-fp8-v6-uploader to finish
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: Using quantization_mode: none
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B-FP8...
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: Downloaded in 14.860s
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: Processed model Qwen/Qwen3.5-35B-A3B-FP8 in 28.832s
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/configuration.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/configuration.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/LICENSE
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/README.md
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/generation_config.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/.gitattributes
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/config.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/merges.txt
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/vocab.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/tokenizer.json
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00014-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00014-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00003-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00001-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v6-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v6/default/model.safetensors-00009-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-fp8-v6-uploader completed after 53.16s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-fp8-v6-uploader
Pipeline stage VLLMUploader completed in 53.70s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.61s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-fp8-v6
Waiting for inference service qwen-qwen3-5-35b-a3b-fp8-v6 to be ready
2026-03-25T17:38:20.359561+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v6
2026-03-25T17:39:20.450997+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v6
2026-03-25T17:40:20.545667+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v6
Inference service qwen-qwen3-5-35b-a3b-fp8-v6 ready after 160.49681901931763s
Pipeline stage VLLMDeployer completed in 161.58s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T17:41:20.643876+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v6
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.9783523082733154s
2026-03-25T17:42:20.742418+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v6
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.9049434661865234s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T17:43:20.839772+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v6
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.5596935749053955s
Received healthy response to inference request in 4.215612411499023s
Received healthy response to inference request in 1.5903844833374023s
Received healthy response to inference request in 0.8509197235107422s
Received healthy response to inference request in 1.0624637603759766s
Received healthy response to inference request in 3.486661195755005s
Received healthy response to inference request in 1.280146837234497s
Received healthy response to inference request in 1.2438745498657227s
Received healthy response to inference request in 1.0817437171936035s
Received healthy response to inference request in 1.503507137298584s
Received healthy response to inference request in 13.58344030380249s
Received healthy response to inference request in 2.848235845565796s
Received healthy response to inference request in 1.5154049396514893s
Received healthy response to inference request in 1.2428598403930664s
Received healthy response to inference request in 1.1886215209960938s
Received healthy response to inference request in 1.7471725940704346s
Received healthy response to inference request in 1.6487300395965576s
Received healthy response to inference request in 1.2819135189056396s
Received healthy response to inference request in 1.0630178451538086s
Received healthy response to inference request in 1.1903703212738037s
Received healthy response to inference request in 1.4562149047851562s
30 requests
7 failed requests
5th percentile: 1.062713098526001
10th percentile: 1.079871129989624
20th percentile: 1.232361936569214
30th percentile: 1.2813835144042969
40th percentile: 1.510645818710327
50th percentile: 1.61955726146698
60th percentile: 2.9002824306488035
70th percentile: 3.9981441497802725
80th percentile: 20.108800077438353
90th percentile: 20.141145730018614
95th percentile: 20.15057430267334
99th percentile: 20.452657272815706
mean time: 6.49683526357015
%s, retrying in %s seconds...
Received healthy response to inference request in 1.109879970550537s
Received healthy response to inference request in 2.7390716075897217s
Received healthy response to inference request in 1.7460591793060303s
2026-03-25T17:44:20.936844+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v6
Received healthy response to inference request in 1.4788382053375244s
Received healthy response to inference request in 1.8378381729125977s
Received healthy response to inference request in 1.2160890102386475s
Received healthy response to inference request in 1.1795697212219238s
Received healthy response to inference request in 2.2310948371887207s
Received healthy response to inference request in 0.9338371753692627s
Received healthy response to inference request in 1.2033472061157227s
Received healthy response to inference request in 1.047969102859497s
Received healthy response to inference request in 1.6378111839294434s
Received healthy response to inference request in 1.6445529460906982s
Received healthy response to inference request in 1.2782728672027588s
Received healthy response to inference request in 1.2915618419647217s
Received healthy response to inference request in 1.2817306518554688s
Received healthy response to inference request in 0.9764719009399414s
Received healthy response to inference request in 1.5213713645935059s
Received healthy response to inference request in 1.7309014797210693s
Received healthy response to inference request in 1.7928450107574463s
Received healthy response to inference request in 1.2223970890045166s
Received healthy response to inference request in 1.4942638874053955s
Received healthy response to inference request in 1.278791904449463s
Received healthy response to inference request in 1.3362367153167725s
Received healthy response to inference request in 1.396347999572754s
Received healthy response to inference request in 1.3738062381744385s
Received healthy response to inference request in 1.2918262481689453s
Received healthy response to inference request in 1.5209453105926514s
Received healthy response to inference request in 1.8269150257110596s
Received healthy response to inference request in 1.409980297088623s
30 requests
0 failed requests
5th percentile: 1.0086456418037415
10th percentile: 1.1036888837814331
20th percentile: 1.2135406494140626
30th percentile: 1.2786361932754517
40th percentile: 1.291720485687256
50th percentile: 1.3850771188735962
60th percentile: 1.4850084781646729
70th percentile: 1.5563033103942867
80th percentile: 1.7339330196380616
90th percentile: 1.8280073404312134
95th percentile: 2.054129338264464
99th percentile: 2.591758344173432
mean time: 1.467687471707662
Pipeline stage StressChecker completed in 245.14s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b-fp8_v6 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b-fp8_v6 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b-fp8_v6 status is now torndown due to DeploymentManager action