Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-27b-v5-uploader
Waiting for job on qwen-qwen3-5-27b-v5-uploader to finish
qwen-qwen3-5-27b-v5-uploader: Using quantization_mode: none
qwen-qwen3-5-27b-v5-uploader: Downloading snapshot of Qwen/Qwen3.5-27B...
qwen-qwen3-5-27b-v5-uploader: Downloaded in 18.219s
qwen-qwen3-5-27b-v5-uploader: Processed model Qwen/Qwen3.5-27B in 39.792s
qwen-qwen3-5-27b-v5-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-27b-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-27b-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-27b-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-27b-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-27b-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-27b-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-27b-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-27b-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-27b-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-27b-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-27b-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-27b-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-27b-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-27b-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/chat_template.jinja
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/vocab.json
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/tokenizer_config.json
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors.index.json
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/tokenizer.json
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/generation_config.json
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/preprocessor_config.json
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/config.json
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/video_preprocessor_config.json
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/.gitattributes
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/merges.txt
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/README.md
2026-03-17T05:56:39.928046+00:00 monitor updated for qwen-qwen3-5-27b_v5
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00011-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00005-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00009-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00007-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00004-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00003-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00006-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00008-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00001-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00002-of-00011.safetensors
qwen-qwen3-5-27b-v5-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00011.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-27b-v5/default/model.safetensors-00010-of-00011.safetensors
Job qwen-qwen3-5-27b-v5-uploader completed after 83.31s with status: succeeded
Stopping job with name qwen-qwen3-5-27b-v5-uploader
Pipeline stage VLLMUploader completed in 83.85s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.74s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-27b-v5
Waiting for inference service qwen-qwen3-5-27b-v5 to be ready
2026-03-17T05:57:40.053506+00:00 monitor updated for qwen-qwen3-5-27b_v5
2026-03-17T05:58:40.162268+00:00 monitor updated for qwen-qwen3-5-27b_v5
2026-03-17T05:59:40.294429+00:00 monitor updated for qwen-qwen3-5-27b_v5
Inference service qwen-qwen3-5-27b-v5 ready after 190.23181796073914s
Pipeline stage VLLMDeployer completed in 190.74s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-17T06:00:40.527727+00:00 monitor updated for qwen-qwen3-5-27b_v5
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-17T06:01:40.634219+00:00 monitor updated for qwen-qwen3-5-27b_v5
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.846677541732788s
Received healthy response to inference request in 7.656595706939697s
Received healthy response to inference request in 4.456328392028809s
2026-03-17T06:02:40.727235+00:00 monitor updated for qwen-qwen3-5-27b_v5
Received healthy response to inference request in 2.3435702323913574s
Received healthy response to inference request in 2.298595666885376s
Received healthy response to inference request in 2.3588457107543945s
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.5589687824249268s
Received healthy response to inference request in 2.3260550498962402s
Received healthy response to inference request in 2.4673352241516113s
Received healthy response to inference request in 2.3029632568359375s
Received healthy response to inference request in 2.4572064876556396s
Received healthy response to inference request in 2.3333895206451416s
Received healthy response to inference request in 2.5117263793945312s
Received healthy response to inference request in 2.3375368118286133s
2026-03-17T06:03:40.840302+00:00 monitor updated for qwen-qwen3-5-27b_v5
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.833181381225586s
Received healthy response to inference request in 2.337641954421997s
Received healthy response to inference request in 7.376272439956665s
Received healthy response to inference request in 2.516141653060913s
Received healthy response to inference request in 2.363041400909424s
Received healthy response to inference request in 2.353435754776001s
Received healthy response to inference request in 2.3660874366760254s
Received healthy response to inference request in 2.3820903301239014s
30 requests
8 failed requests
5th percentile: 2.313354563713074
10th percentile: 2.3326560735702513
20th percentile: 2.3423845767974854
30th percentile: 2.361782693862915
40th percentile: 2.4271600246429443
50th percentile: 2.513934016227722
60th percentile: 3.4824401855468725
70th percentile: 7.460369420051574
80th percentile: 20.132046699523926
90th percentile: 20.16559636592865
95th percentile: 20.168656885623932
99th percentile: 23.560419192314153
mean time: 7.7921686251958215
%s, retrying in %s seconds...
Received healthy response to inference request in 2.288910150527954s
Received healthy response to inference request in 2.423434257507324s
Received healthy response to inference request in 2.1735918521881104s
Received healthy response to inference request in 2.119861364364624s
Received healthy response to inference request in 2.3935399055480957s
Received healthy response to inference request in 2.2544682025909424s
Received healthy response to inference request in 2.360053062438965s
Received healthy response to inference request in 2.157510280609131s
Received healthy response to inference request in 2.3324711322784424s
Received healthy response to inference request in 2.312227249145508s
Received healthy response to inference request in 2.2773919105529785s
Received healthy response to inference request in 2.404081344604492s
2026-03-17T06:04:40.954142+00:00 monitor updated for qwen-qwen3-5-27b_v5
Received healthy response to inference request in 2.2500486373901367s
Received healthy response to inference request in 2.362760305404663s
Received healthy response to inference request in 2.359312057495117s
Received healthy response to inference request in 2.3385093212127686s
Received healthy response to inference request in 2.301424026489258s
Received healthy response to inference request in 2.282733201980591s
Received healthy response to inference request in 2.3259212970733643s
Received healthy response to inference request in 2.5194334983825684s
Received healthy response to inference request in 2.4067201614379883s
Received healthy response to inference request in 2.5138330459594727s
Received healthy response to inference request in 2.570192813873291s
Received healthy response to inference request in 2.3441762924194336s
Received healthy response to inference request in 2.290769100189209s
Received healthy response to inference request in 2.3520190715789795s
Received healthy response to inference request in 2.3809328079223633s
Received healthy response to inference request in 2.5591235160827637s
Received healthy response to inference request in 2.467294692993164s
Received healthy response to inference request in 2.451639175415039s
30 requests
0 failed requests
5th percentile: 2.1647469878196715
10th percentile: 2.242402958869934
20th percentile: 2.2816649436950684
30th percentile: 2.298227548599243
40th percentile: 2.329851198196411
50th percentile: 2.3480976819992065
60th percentile: 2.361135959625244
70th percentile: 2.3967023372650145
80th percentile: 2.429075241088867
90th percentile: 2.5143930912017822
95th percentile: 2.5412630081176757
99th percentile: 2.5669827175140383
mean time: 2.3524794578552246
Pipeline stage StressChecker completed in 309.60s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.94s
Shutdown handler de-registered
qwen-qwen3-5-27b_v5 status is now deployed due to DeploymentManager action
qwen-qwen3-5-27b_v5 status is now inactive due to auto deactivation removed underperforming models