Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-v53-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-v53-uploader to finish
qwen-qwen3-5-35b-a3b-v53-uploader: Using quantization_mode: none
qwen-qwen3-5-35b-a3b-v53-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B...
qwen-qwen3-5-35b-a3b-v53-uploader: Downloaded in 24.726s
2026-03-25T19:26:46.528972+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
qwen-qwen3-5-35b-a3b-v53-uploader: Processed model Qwen/Qwen3.5-35B-A3B in 54.902s
qwen-qwen3-5-35b-a3b-v53-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-v53-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v53-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-v53-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-v53-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-v53-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v53-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v53-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v53-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v53-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v53-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v53-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v53-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v53-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-v53-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-v53-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-v53-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-v53-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-v53-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/.gitattributes
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/README.md
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/LICENSE
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/config.json
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/generation_config.json
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/merges.txt
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/vocab.json
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/tokenizer.json
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00014-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00014-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00009-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00003-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-v53-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v53/default/model.safetensors-00001-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-v53-uploader completed after 93.71s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-v53-uploader
Pipeline stage VLLMUploader completed in 94.45s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 3.07s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-v53
Waiting for inference service qwen-qwen3-5-35b-a3b-v53 to be ready
2026-03-25T19:27:46.622297+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
2026-03-25T19:28:46.729515+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
2026-03-25T19:29:46.844967+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
2026-03-25T19:30:46.949206+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
Unable to record family friendly update due to error: ('http://chaiml-nemo-guard-merged-v3-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:53962->127.0.0.1:8080: read: connection reset by peer\n')
Inference service qwen-qwen3-5-35b-a3b-v53 ready after 211.00928688049316s
Pipeline stage VLLMDeployer completed in 214.74s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T19:31:47.107524+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.97137451171875s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T19:32:47.199849+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.250379323959351s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 7.036333799362183s
Received healthy response to inference request in 1.523592233657837s
Received healthy response to inference request in 2.5578866004943848s
Received healthy response to inference request in 1.3825984001159668s
Received healthy response to inference request in 2.335587978363037s
Received healthy response to inference request in 2.108550786972046s
2026-03-25T19:33:47.294541+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
Received healthy response to inference request in 2.0496652126312256s
Received healthy response to inference request in 4.3051934242248535s
Received healthy response to inference request in 1.3486440181732178s
Retrying (%r) after connection broken by '%r': %s
Received healthy response to inference request in 1.90248703956604s
Received healthy response to inference request in 1.4603824615478516s
Received healthy response to inference request in 2.030351161956787s
Received healthy response to inference request in 2.655691146850586s
Received healthy response to inference request in 1.746504545211792s
Received healthy response to inference request in 1.166224479675293s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.2033426761627197s
Received healthy response to inference request in 1.5503952503204346s
Received healthy response to inference request in 1.5322723388671875s
2026-03-25T19:34:47.388778+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.4873101711273193s
30 requests
9 failed requests
5th percentile: 1.3639234900474548
10th percentile: 1.452604055404663
20th percentile: 1.5305363178253173
30th percentile: 1.8556922912597655
40th percentile: 2.084996557235718
50th percentile: 2.446737289428711
60th percentile: 4.08297643661499
70th percentile: 10.959511876106225
80th percentile: 20.12296357154846
90th percentile: 20.153772377967833
95th percentile: 20.179435968399048
99th percentile: 20.27828973531723
mean time: 7.734905409812927
%s, retrying in %s seconds...
Received healthy response to inference request in 1.494767665863037s
Received healthy response to inference request in 1.3482451438903809s
Received healthy response to inference request in 1.369760274887085s
Received healthy response to inference request in 18.75153350830078s
Received healthy response to inference request in 1.3651123046875s
Received healthy response to inference request in 1.4969794750213623s
Received healthy response to inference request in 1.5223395824432373s
Received healthy response to inference request in 1.3867385387420654s
Received healthy response to inference request in 1.6065306663513184s
Received healthy response to inference request in 1.296405553817749s
Received healthy response to inference request in 1.1903626918792725s
Received healthy response to inference request in 2.004528045654297s
Received healthy response to inference request in 1.3187363147735596s
Received healthy response to inference request in 1.4395787715911865s
Received healthy response to inference request in 1.3939049243927002s
Received healthy response to inference request in 1.4591777324676514s
Received healthy response to inference request in 1.532991647720337s
Received healthy response to inference request in 1.4298572540283203s
Received healthy response to inference request in 1.4504938125610352s
Received healthy response to inference request in 1.4426236152648926s
Received healthy response to inference request in 1.335303544998169s
Received healthy response to inference request in 1.1955907344818115s
Received healthy response to inference request in 1.777630090713501s
2026-03-25T19:35:47.489206+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v53
Received healthy response to inference request in 1.2231862545013428s
Received healthy response to inference request in 1.3892974853515625s
Received healthy response to inference request in 1.3846728801727295s
Received healthy response to inference request in 1.2192111015319824s
Received healthy response to inference request in 1.9501192569732666s
Received healthy response to inference request in 1.3481354713439941s
Received healthy response to inference request in 1.351182460784912s
30 requests
0 failed requests
5th percentile: 1.2062198996543885
10th percentile: 1.2227887392044068
20th percentile: 1.3319900989532472
30th percentile: 1.3503012657165527
40th percentile: 1.3787078380584716
50th percentile: 1.3916012048721313
60th percentile: 1.440796709060669
70th percentile: 1.469854712486267
80th percentile: 1.5244699954986574
90th percentile: 1.7948790073394778
95th percentile: 1.9800440907478332
99th percentile: 13.894901924133315
mean time: 2.0158332268397015
Pipeline stage StressChecker completed in 298.39s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.49s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b_v53 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b_v53 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b_v53 status is now torndown due to DeploymentManager action