Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-32b-v5-uploader
Waiting for job on qwen-qwen3-32b-v5-uploader to finish
qwen-qwen3-32b-v5-uploader: Using quantization_mode: none
qwen-qwen3-32b-v5-uploader: Downloading snapshot of Qwen/Qwen3-32B...
qwen-qwen3-32b-v5-uploader: Fetching 27 files: 100%|██████████| 27/27 [00:29<00:00, 1.10s/it]
qwen-qwen3-32b-v5-uploader: Downloaded in 29.873s
qwen-qwen3-32b-v5-uploader: Processed model Qwen/Qwen3-32B in 53.346s
qwen-qwen3-32b-v5-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-32b-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-32b-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-32b-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-32b-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-32b-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-32b-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-32b-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-32b-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-32b-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-32b-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-32b-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-32b-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-32b-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-32b-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-32b-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-32b-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-32b-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-32b-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-32b-v5
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-32b-v5/.gitattributes
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/tokenizer_config.json
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/config.json
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/generation_config.json
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-32b-v5/README.md
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-32b-v5/LICENSE
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model.safetensors.index.json
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-32b-v5/merges.txt
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/vocab.json
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/tokenizer.json
qwen-qwen3-32b-v5-uploader: DEBUG retryable error: RequestError: send request failed
qwen-qwen3-32b-v5-uploader: caused by: Put "https://object.ord1.coreweave.com/guanaco-vllm-models/qwen-qwen3-32b-v5/model-00006-of-00017.safetensors?partNumber=8&uploadId=2~asgHe6AOH-NNfNCqqtLHzOoejohveBm": write tcp 10.0.13.168:59618->216.153.53.63:443: write: connection reset by peer
qwen-qwen3-32b-v5-uploader: ERROR "cp /dev/shm/model_output/model-00006-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00006-of-00017.safetensors": MultipartUpload: upload multipart failed upload id: 2~asgHe6AOH-NNfNCqqtLHzOoejohveBm caused by: SignatureDoesNotMatch: status code: 403, request id: tx000001f4ed0a2e18c19e1-006987dd58-149d0ff886-default, host id:
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00017-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00017-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00009-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00009-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00007-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00007-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00003-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00003-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00015-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00015-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00008-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00008-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00005-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00005-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00016-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00016-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00011-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00011-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00001-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00001-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00012-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00012-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00002-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00002-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00010-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00010-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00014-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00014-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00004-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00004-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00013-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00013-of-00017.safetensors
qwen-qwen3-32b-v5-uploader: Retry 1/5 exited 1, retrying in 2 seconds...
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-32b-v5/.gitattributes": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-32b-v5/LICENSE": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-32b-v5/README.md": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/config.json": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/generation_config.json": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-32b-v5/merges.txt": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00001-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00001-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00002-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00002-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00003-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00003-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00004-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00004-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00005-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00005-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00007-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00007-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00008-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00008-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00009-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00009-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00010-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00010-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00011-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00011-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00012-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00012-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00013-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00013-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00014-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00014-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00015-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00015-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00016-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00016-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model-00017-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00017-of-00017.safetensors": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model.safetensors.index.json": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/tokenizer.json": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/tokenizer_config.json": object size matches
qwen-qwen3-32b-v5-uploader: DEBUG "sync /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-32b-v5/vocab.json": object size matches
HTTP Request: %s %s "%s %d %s"
qwen-qwen3-32b-v5-uploader: cp /dev/shm/model_output/model-00006-of-00017.safetensors s3://guanaco-vllm-models/qwen-qwen3-32b-v5/model-00006-of-00017.safetensors
Job qwen-qwen3-32b-v5-uploader completed after 187.58s with status: succeeded
Stopping job with name qwen-qwen3-32b-v5-uploader
Pipeline stage VLLMUploader completed in 188.39s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-32b-v5
Waiting for inference service qwen-qwen3-32b-v5 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service qwen-qwen3-32b-v5 ready after 231.03448009490967s
Pipeline stage VLLMDeployer completed in 231.59s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.497617244720459s
Received healthy response to inference request in 3.436899185180664s
Received healthy response to inference request in 3.7076189517974854s
Received healthy response to inference request in 3.8194494247436523s
Received healthy response to inference request in 3.1278305053710938s
Received healthy response to inference request in 3.2569215297698975s
Received healthy response to inference request in 3.4908533096313477s
Received healthy response to inference request in 3.093719005584717s
Received healthy response to inference request in 3.265122890472412s
Received healthy response to inference request in 3.185274600982666s
Received healthy response to inference request in 3.245758533477783s
Received healthy response to inference request in 3.2699780464172363s
Received healthy response to inference request in 3.5351741313934326s
Received healthy response to inference request in 3.1819396018981934s
Received healthy response to inference request in 3.404667615890503s
Received healthy response to inference request in 3.3936727046966553s
Received healthy response to inference request in 3.3779213428497314s
Received healthy response to inference request in 3.33815860748291s
Received healthy response to inference request in 3.1256954669952393s
Received healthy response to inference request in 3.2130579948425293s
Received healthy response to inference request in 3.199549913406372s
Received healthy response to inference request in 2.9991343021392822s
Received healthy response to inference request in 3.4800782203674316s
Received healthy response to inference request in 3.4613170623779297s
Received healthy response to inference request in 3.4850423336029053s
Received healthy response to inference request in 3.360374927520752s
Received healthy response to inference request in 3.8938794136047363s
Received healthy response to inference request in 3.440990447998047s
Received healthy response to inference request in 3.324428081512451s
Received healthy response to inference request in 3.4007327556610107s
30 requests
0 failed requests
5th percentile: 3.108108413219452
10th percentile: 3.1276170015335083
20th percentile: 3.196694850921631
30th percentile: 3.253572630882263
40th percentile: 3.3026480674743652
50th percentile: 3.3691481351852417
60th percentile: 3.4023066997528075
70th percentile: 3.4470884323120115
80th percentile: 3.4862045288085937
90th percentile: 3.5524186134338382
95th percentile: 3.769125711917877
99th percentile: 3.872294716835022
mean time: 3.3670952717463174
Pipeline stage StressChecker completed in 104.95s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.63s
Shutdown handler de-registered
qwen-qwen3-32b_v5 status is now deployed due to DeploymentManager action
qwen-qwen3-32b_v5 status is now inactive due to auto deactivation removed underperforming models