Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-v48-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-v48-uploader to finish
qwen-qwen3-5-35b-a3b-v48-uploader: Using quantization_mode: fp8
qwen-qwen3-5-35b-a3b-v48-uploader: Checking if ChaiML/Qwen3.5-35B-A3B-FP8 already exists in ChaiML
qwen-qwen3-5-35b-a3b-v48-uploader: Model already exists. Downloading to /dev/shm/model_output...
qwen-qwen3-5-35b-a3b-v48-uploader: Downloading snapshot of ChaiML/Qwen3.5-35B-A3B-FP8...
2026-03-24T19:28:23.959637+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
qwen-qwen3-5-35b-a3b-v48-uploader: Downloaded in 41.574s
qwen-qwen3-5-35b-a3b-v48-uploader: Processed model Qwen/Qwen3.5-35B-A3B in 44.096s
qwen-qwen3-5-35b-a3b-v48-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-v48-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v48-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-v48-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-v48-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-v48-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v48-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v48-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v48-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v48-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v48-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v48-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v48-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v48-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-v48-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-v48-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-v48-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-v48-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-v48-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default
qwen-qwen3-5-35b-a3b-v48-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default/.gitattributes
qwen-qwen3-5-35b-a3b-v48-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default/recipe.yaml
qwen-qwen3-5-35b-a3b-v48-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-v48-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default/generation_config.json
qwen-qwen3-5-35b-a3b-v48-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default/config.json
qwen-qwen3-5-35b-a3b-v48-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-v48-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default/tokenizer.json
2026-03-24T19:29:24.147270+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
qwen-qwen3-5-35b-a3b-v48-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v48/default/model.safetensors
Job qwen-qwen3-5-35b-a3b-v48-uploader completed after 128.71s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-v48-uploader
Pipeline stage VLLMUploader completed in 129.88s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.01s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-v48
Waiting for inference service qwen-qwen3-5-35b-a3b-v48 to be ready
2026-03-24T19:30:24.333334+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
2026-03-24T19:31:28.209179+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
Inference service qwen-qwen3-5-35b-a3b-v48 ready after 161.89047169685364s
Pipeline stage VLLMDeployer completed in 163.19s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-24T19:32:28.417586+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-24T19:33:28.605574+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
Received healthy response to inference request in 10.351577758789062s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 10.569565534591675s
Received healthy response to inference request in 3.8476178646087646s
Received healthy response to inference request in 1.6191844940185547s
2026-03-24T19:34:28.792178+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
Received healthy response to inference request in 1.3429632186889648s
Received healthy response to inference request in 1.3390300273895264s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.0209391117095947s
Received healthy response to inference request in 1.8842272758483887s
Received healthy response to inference request in 1.1394450664520264s
Received healthy response to inference request in 1.1492981910705566s
Received healthy response to inference request in 1.1878736019134521s
Received healthy response to inference request in 1.4643349647521973s
Received healthy response to inference request in 1.5229871273040771s
Received healthy response to inference request in 1.5089471340179443s
Received healthy response to inference request in 1.8288846015930176s
Received healthy response to inference request in 1.6323621273040771s
Received healthy response to inference request in 1.301555871963501s
2026-03-24T19:35:29.127237+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
Received healthy response to inference request in 1.1361663341522217s
Received healthy response to inference request in 1.110048770904541s
Received healthy response to inference request in 2.0332531929016113s
Received healthy response to inference request in 2.165717601776123s
Received healthy response to inference request in 1.3302807807922363s
Received healthy response to inference request in 2.819950819015503s
30 requests
7 failed requests
5th percentile: 1.1218016743659973
10th percentile: 1.1391171932220459
20th percentile: 1.2788194179534913
30th percentile: 1.3417832612991334
40th percentile: 1.517371129989624
50th percentile: 1.7306233644485474
60th percentile: 2.0862389564514157
70th percentile: 5.798805832862835
80th percentile: 20.405014085769654
90th percentile: 20.419068336486816
95th percentile: 20.498433649539948
99th percentile: 20.949473118782045
mean time: 6.634672808647156
%s, retrying in %s seconds...
Received healthy response to inference request in 1.6266181468963623s
Received healthy response to inference request in 1.7609786987304688s
Received healthy response to inference request in 1.3626418113708496s
Received healthy response to inference request in 1.444930076599121s
Received healthy response to inference request in 1.053419828414917s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.340193271636963s
Received healthy response to inference request in 1.7071030139923096s
2026-03-24T19:36:29.326576+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v48-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.9187228679656982s
Received healthy response to inference request in 1.4750971794128418s
Received healthy response to inference request in 1.2370843887329102s
Received healthy response to inference request in 1.0961735248565674s
Received healthy response to inference request in 1.6252529621124268s
Received healthy response to inference request in 1.1408977508544922s
Received healthy response to inference request in 1.029040813446045s
Received healthy response to inference request in 1.1107416152954102s
Received healthy response to inference request in 0.9045512676239014s
Received healthy response to inference request in 1.0673279762268066s
Received healthy response to inference request in 3.1016459465026855s
Received healthy response to inference request in 0.8555717468261719s
Received healthy response to inference request in 1.2690465450286865s
Received healthy response to inference request in 1.2679874897003174s
Received healthy response to inference request in 1.5403861999511719s
Received healthy response to inference request in 1.7083156108856201s
Received healthy response to inference request in 1.2472643852233887s
Received healthy response to inference request in 1.5268633365631104s
Received healthy response to inference request in 1.1458442211151123s
Received healthy response to inference request in 1.7390360832214355s
Received healthy response to inference request in 1.3384506702423096s
30 requests
2 failed requests
5th percentile: 0.960571563243866
10th percentile: 1.0509819269180298
20th percentile: 1.1078279972076417
30th percentile: 1.2097123384475708
40th percentile: 1.268622922897339
50th percentile: 1.3514175415039062
60th percentile: 1.495803642272949
70th percentile: 1.6256625175476074
80th percentile: 1.7144597053527832
90th percentile: 2.037015175819399
95th percentile: 12.631978976726483
99th percentile: 20.468728103637694
mean time: 2.685181752840678
%s, retrying in %s seconds...
Received healthy response to inference request in 1.509739875793457s
Received healthy response to inference request in 1.4631648063659668s
Received healthy response to inference request in 1.4908597469329834s
Received healthy response to inference request in 1.3924200534820557s
Received healthy response to inference request in 1.0358459949493408s
Received healthy response to inference request in 0.8162713050842285s
Received healthy response to inference request in 1.4570789337158203s
Received healthy response to inference request in 1.4772839546203613s
Received healthy response to inference request in 1.1350224018096924s
Received healthy response to inference request in 1.7227039337158203s
Received healthy response to inference request in 1.821660041809082s
Received healthy response to inference request in 1.7927606105804443s
Received healthy response to inference request in 1.3338406085968018s
Received healthy response to inference request in 1.424422264099121s
2026-03-24T19:37:29.524534+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v48
Received healthy response to inference request in 1.4906835556030273s
Received healthy response to inference request in 1.4053151607513428s
Received healthy response to inference request in 1.2381935119628906s
Received healthy response to inference request in 0.8282103538513184s
Received healthy response to inference request in 1.9013359546661377s
Received healthy response to inference request in 1.6739380359649658s
Received healthy response to inference request in 1.3919508457183838s
Received healthy response to inference request in 1.6929421424865723s
Received healthy response to inference request in 1.4638984203338623s
Received healthy response to inference request in 1.6676034927368164s
Received healthy response to inference request in 1.1485459804534912s
Received healthy response to inference request in 1.5158016681671143s
Received healthy response to inference request in 1.1879653930664062s
Received healthy response to inference request in 1.139843463897705s
Received healthy response to inference request in 1.1833631992340088s
Received healthy response to inference request in 1.5105860233306885s
30 requests
0 failed requests
5th percentile: 0.9216463923454286
10th percentile: 1.1251047611236573
20th percentile: 1.1763997554779053
30th percentile: 1.3051464796066283
40th percentile: 1.4001571178436278
50th percentile: 1.4601218700408936
60th percentile: 1.4826437950134277
70th percentile: 1.5099937200546265
80th percentile: 1.6688704013824462
90th percentile: 1.7297096014022828
95th percentile: 1.808655297756195
99th percentile: 1.8782299399375917
mean time: 1.4104417244593301
Pipeline stage StressChecker completed in 335.24s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.28s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b_v48 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b_v48 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b_v48 status is now torndown due to DeploymentManager action