Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-27b-v2-uploader
Waiting for job on qwen-qwen3-5-27b-v2-uploader to finish
qwen-qwen3-5-27b-v2-uploader: Using quantization_mode: none
qwen-qwen3-5-27b-v2-uploader: Downloading snapshot of Qwen/Qwen3.5-27B...
qwen-qwen3-5-27b-v2-uploader: Downloaded in 17.963s
2026-03-10T15:52:17.394336+00:00 monitor updated for qwen-qwen3-5-27b_v2
qwen-qwen3-5-27b-v2-uploader: Processed model Qwen/Qwen3.5-27B in 38.943s
qwen-qwen3-5-27b-v2-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-27b-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-27b-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-27b-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-27b-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-27b-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-27b-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-27b-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-27b-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-27b-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-27b-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-27b-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-27b-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-27b-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors.index.json
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/README.md
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/video_preprocessor_config.json
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/preprocessor_config.json
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/tokenizer_config.json
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/generation_config.json
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/config.json
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/chat_template.jinja
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/merges.txt
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/.gitattributes
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/vocab.json
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/tokenizer.json
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00011-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00002-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00001-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00005-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00010-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00008-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00004-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00006-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00009-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00007-of-00011.safetensors
qwen-qwen3-5-27b-v2-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v2/default/model.safetensors-00003-of-00011.safetensors
Job qwen-qwen3-5-27b-v2-uploader completed after 94.05s with status: succeeded
Stopping job with name qwen-qwen3-5-27b-v2-uploader
Pipeline stage VLLMUploader completed in 95.40s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.91s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-27b-v2
Waiting for inference service qwen-qwen3-5-27b-v2 to be ready
2026-03-10T15:53:18.119573+00:00 monitor updated for qwen-qwen3-5-27b_v2
2026-03-10T15:54:18.221367+00:00 monitor updated for qwen-qwen3-5-27b_v2
2026-03-10T15:55:18.700249+00:00 monitor updated for qwen-qwen3-5-27b_v2
2026-03-10T15:56:18.814874+00:00 monitor updated for qwen-qwen3-5-27b_v2
Inference service qwen-qwen3-5-27b-v2 ready after 220.65916204452515s
Pipeline stage VLLMDeployer completed in 221.30s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-10T15:57:19.109258+00:00 monitor updated for qwen-qwen3-5-27b_v2
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 12.659321546554565s
Received healthy response to inference request in 2.7805957794189453s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-10T15:58:19.217641+00:00 monitor updated for qwen-qwen3-5-27b_v2
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.9619171619415283s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 9.559773683547974s
2026-03-10T15:59:19.324143+00:00 monitor updated for qwen-qwen3-5-27b_v2
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.917482852935791s
Received healthy response to inference request in 2.7253611087799072s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.1038403511047363s
Received healthy response to inference request in 2.7676353454589844s
Received healthy response to inference request in 2.8017797470092773s
Received healthy response to inference request in 1.4116804599761963s
Received healthy response to inference request in 4.28678560256958s
Received healthy response to inference request in 2.914236068725586s
Received healthy response to inference request in 2.652261257171631s
2026-03-10T16:00:19.453857+00:00 monitor updated for qwen-qwen3-5-27b_v2
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.9942193031311035s
Received healthy response to inference request in 2.9928181171417236s
Received healthy response to inference request in 3.031665802001953s
Received healthy response to inference request in 2.7925894260406494s
Received healthy response to inference request in 2.211193799972534s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.624497175216675s
Received healthy response to inference request in 2.9669137001037598s
30 requests
10 failed requests
5th percentile: 2.1521494030952453
10th percentile: 2.5831668376922607
20th percentile: 2.759180498123169
30th percentile: 2.799022650718689
40th percentile: 2.9441434383392333
50th percentile: 3.0122419595718384
60th percentile: 6.820441055297845
70th percentile: 20.119102716445923
80th percentile: 20.130286073684694
90th percentile: 20.14755721092224
95th percentile: 20.153887724876405
99th percentile: 20.156227748394013
mean time: 9.184694774945577
%s, retrying in %s seconds...
Received healthy response to inference request in 2.792971611022949s
2026-03-10T16:01:19.617571+00:00 monitor updated for qwen-qwen3-5-27b_v2
Received healthy response to inference request in 2.60329270362854s
Received healthy response to inference request in 2.2404184341430664s
Received healthy response to inference request in 2.6232166290283203s
Received healthy response to inference request in 2.728541135787964s
Received healthy response to inference request in 2.7004776000976562s
Received healthy response to inference request in 2.4513301849365234s
Received healthy response to inference request in 3.0727028846740723s
Received healthy response to inference request in 2.8702571392059326s
Received healthy response to inference request in 2.7890331745147705s
Received healthy response to inference request in 2.81533145904541s
Received healthy response to inference request in 2.8286235332489014s
Received healthy response to inference request in 5.20267128944397s
Received healthy response to inference request in 2.7302074432373047s
Received healthy response to inference request in 2.886345863342285s
Received healthy response to inference request in 3.4802393913269043s
Received healthy response to inference request in 2.778672695159912s
Received healthy response to inference request in 2.9627840518951416s
Received healthy response to inference request in 2.8922290802001953s
Received healthy response to inference request in 2.2978994846343994s
Received healthy response to inference request in 2.773064613342285s
2026-03-10T16:02:19.728732+00:00 monitor updated for qwen-qwen3-5-27b_v2
Received healthy response to inference request in 2.3613927364349365s
Received healthy response to inference request in 2.8599374294281006s
Received healthy response to inference request in 2.0871682167053223s
Received healthy response to inference request in 2.8822646141052246s
Received healthy response to inference request in 2.903998374938965s
Received healthy response to inference request in 2.2217960357666016s
Received healthy response to inference request in 3.2011733055114746s
Received healthy response to inference request in 2.700998067855835s
Received healthy response to inference request in 2.863269805908203s
30 requests
0 failed requests
5th percentile: 2.2301761150360107
10th percentile: 2.292151379585266
20th percentile: 2.5729001998901366
30th percentile: 2.7008419275283813
40th percentile: 2.755921745300293
50th percentile: 2.79100239276886
60th percentile: 2.841149091720581
70th percentile: 2.8738593816757203
80th percentile: 2.8945829391479494
90th percentile: 3.0855499267578126
95th percentile: 3.35465965270996
99th percentile: 4.703166038990022
mean time: 2.8200769662857055
Pipeline stage StressChecker completed in 372.52s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.41s
Shutdown handler de-registered
qwen-qwen3-5-27b_v2 status is now deployed due to DeploymentManager action
qwen-qwen3-5-27b_v2 status is now inactive due to system request
qwen-qwen3-5-27b_v2 status is now torndown due to DeploymentManager action