Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-4b-instruct-2507-v7-uploader
Waiting for job on qwen-qwen3-4b-instruct-2507-v7-uploader to finish
qwen-qwen3-4b-instruct-2507-v7-uploader: Using quantization_mode: none
qwen-qwen3-4b-instruct-2507-v7-uploader: Downloading snapshot of Qwen/Qwen3-4B-Instruct-2507...
qwen-qwen3-4b-instruct-2507-v7-uploader: Fetching 13 files: 100%|██████████| 13/13 [00:03<00:00, 3.53it/s]
qwen-qwen3-4b-instruct-2507-v7-uploader: Downloaded in 3.794s
qwen-qwen3-4b-instruct-2507-v7-uploader: Processed model Qwen/Qwen3-4B-Instruct-2507 in 6.800s
qwen-qwen3-4b-instruct-2507-v7-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-4b-instruct-2507-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-4b-instruct-2507-v7-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-4b-instruct-2507-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-4b-instruct-2507-v7-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-4b-instruct-2507-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-4b-instruct-2507-v7-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-4b-instruct-2507-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-4b-instruct-2507-v7-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-4b-instruct-2507-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-4b-instruct-2507-v7-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-4b-instruct-2507-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-4b-instruct-2507-v7-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-4b-instruct-2507-v7-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-4b-instruct-2507-v7-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-4b-instruct-2507-v7-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-4b-instruct-2507-v7-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-4b-instruct-2507-v7-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-4b-instruct-2507-v7-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/.gitattributes
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/LICENSE
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/README.md
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/tokenizer_config.json
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/generation_config.json
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/model.safetensors.index.json
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/config.json
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/vocab.json
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/tokenizer.json
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/merges.txt
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/model-00003-of-00003.safetensors
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/model-00001-of-00003.safetensors
qwen-qwen3-4b-instruct-2507-v7-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/qwen-qwen3-4b-instruct-2507-v7/model-00002-of-00003.safetensors
Job qwen-qwen3-4b-instruct-2507-v7-uploader completed after 83.31s with status: succeeded
Stopping job with name qwen-qwen3-4b-instruct-2507-v7-uploader
Pipeline stage VLLMUploader completed in 83.87s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-4b-instruct-2507-v7
Waiting for inference service qwen-qwen3-4b-instruct-2507-v7 to be ready
Inference service qwen-qwen3-4b-instruct-2507-v7 ready after 151.04247450828552s
Pipeline stage VLLMDeployer completed in 151.55s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 0.6692748069763184s
Received healthy response to inference request in 1.0185179710388184s
Received healthy response to inference request in 0.9724330902099609s
Received healthy response to inference request in 0.875903844833374s
Received healthy response to inference request in 0.7578108310699463s
Received healthy response to inference request in 0.8227589130401611s
Received healthy response to inference request in 0.7388122081756592s
Received healthy response to inference request in 0.6799750328063965s
Received healthy response to inference request in 0.767338752746582s
Received healthy response to inference request in 0.7093846797943115s
Received healthy response to inference request in 0.726111650466919s
Received healthy response to inference request in 1.0566530227661133s
Received healthy response to inference request in 1.0215399265289307s
Received healthy response to inference request in 0.8239214420318604s
Received healthy response to inference request in 0.8235831260681152s
Received healthy response to inference request in 0.7006492614746094s
Received healthy response to inference request in 0.7832703590393066s
Received healthy response to inference request in 0.9242165088653564s
Received healthy response to inference request in 0.7140152454376221s
Received healthy response to inference request in 0.9054539203643799s
Received healthy response to inference request in 0.8173801898956299s
Received healthy response to inference request in 0.8692364692687988s
Received healthy response to inference request in 0.8208611011505127s
Received healthy response to inference request in 0.7261025905609131s
Received healthy response to inference request in 0.8349273204803467s
Received healthy response to inference request in 0.9332425594329834s
Received healthy response to inference request in 0.7385845184326172s
Received healthy response to inference request in 0.8088521957397461s
Received healthy response to inference request in 0.780534029006958s
Received healthy response to inference request in 0.7473123073577881s
30 requests
0 failed requests
5th percentile: 0.6892784357070922
10th percentile: 0.7085111379623413
20th percentile: 0.7261098384857178
30th percentile: 0.7447622776031494
40th percentile: 0.7752559185028076
50th percentile: 0.813116192817688
60th percentile: 0.8230885982513427
70th percentile: 0.8452200651168822
80th percentile: 0.9092064380645752
90th percentile: 0.9770415782928468
95th percentile: 1.0201800465583801
99th percentile: 1.0464702248573303
mean time: 0.8189552625020345
Pipeline stage StressChecker completed in 27.99s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
qwen-qwen3-4b-instruct-2507_v7 status is now deployed due to DeploymentManager action
qwen-qwen3-4b-instruct-2507_v7 status is now inactive due to auto deactivation removed underperforming models