Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-v49-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-v49-uploader to finish
qwen-qwen3-5-35b-a3b-v49-uploader: Using quantization_mode: fp8
2026-03-24T20:34:48.337573+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
qwen-qwen3-5-35b-a3b-v49-uploader: Downloaded in 42.980s
qwen-qwen3-5-35b-a3b-v49-uploader: Processed model Qwen/Qwen3.5-35B-A3B in 45.485s
qwen-qwen3-5-35b-a3b-v49-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-v49-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v49-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-v49-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-v49-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-v49-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v49-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v49-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v49-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-v49-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v49-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v49-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-v49-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-v49-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-v49-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-v49-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-v49-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-v49-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-v49-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default
qwen-qwen3-5-35b-a3b-v49-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default/.gitattributes
qwen-qwen3-5-35b-a3b-v49-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default/recipe.yaml
qwen-qwen3-5-35b-a3b-v49-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-v49-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default/generation_config.json
qwen-qwen3-5-35b-a3b-v49-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-v49-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default/config.json
qwen-qwen3-5-35b-a3b-v49-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default/tokenizer.json
2026-03-24T20:35:48.555065+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
qwen-qwen3-5-35b-a3b-v49-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-v49/default/model.safetensors
Job qwen-qwen3-5-35b-a3b-v49-uploader completed after 159.49s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-v49-uploader
Pipeline stage VLLMUploader completed in 160.76s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.98s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-v49
Waiting for inference service qwen-qwen3-5-35b-a3b-v49 to be ready
2026-03-24T20:36:48.734083+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
2026-03-24T20:37:48.914960+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
2026-03-24T20:38:49.087522+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
Inference service qwen-qwen3-5-35b-a3b-v49 ready after 192.35909962654114s
Pipeline stage VLLMDeployer completed in 193.41s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-24T20:39:49.265461+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v49-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v49-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v49-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.0296127796173096s
2026-03-24T20:40:49.454950+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
Received healthy response to inference request in 1.3321053981781006s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v49-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.4802162647247314s
Received healthy response to inference request in 1.2992703914642334s
Received healthy response to inference request in 2.9918646812438965s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v49-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-24T20:41:49.627296+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v49-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.650085210800171s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v49-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.5940990447998047s
Received healthy response to inference request in 1.5356011390686035s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-v49-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.2092230319976807s
2026-03-24T20:42:49.860595+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
Received healthy response to inference request in 1.6980340480804443s
Received healthy response to inference request in 2.275339365005493s
Received healthy response to inference request in 2.436965227127075s
Received healthy response to inference request in 3.850302219390869s
Received healthy response to inference request in 1.9355618953704834s
Received healthy response to inference request in 1.5326740741729736s
Received healthy response to inference request in 2.2633256912231445s
Received healthy response to inference request in 1.3066205978393555s
Received healthy response to inference request in 1.5727436542510986s
Received healthy response to inference request in 1.3164019584655762s
Received healthy response to inference request in 1.9060180187225342s
Received healthy response to inference request in 1.2685229778289795s
Received healthy response to inference request in 1.0645461082458496s
30 requests
8 failed requests
5th percentile: 1.2823593139648437
10th percentile: 1.3058855772018432
20th percentile: 1.4925603389739992
30th percentile: 1.587692427635193
40th percentile: 1.8228244304656984
50th percentile: 2.2362743616104126
60th percentile: 2.658925008773803
70th percentile: 3.5912420511245715
80th percentile: 20.419733142852785
90th percentile: 20.433528327941893
95th percentile: 20.440482568740844
99th percentile: 20.67028424501419
mean time: 6.9105612675348915
%s, retrying in %s seconds...
Received healthy response to inference request in 1.5961875915527344s
Received healthy response to inference request in 1.60270094871521s
Received healthy response to inference request in 1.6155872344970703s
Received healthy response to inference request in 1.700773000717163s
Received healthy response to inference request in 1.3156678676605225s
Received healthy response to inference request in 1.1560792922973633s
Received healthy response to inference request in 1.4043033123016357s
Received healthy response to inference request in 1.2142887115478516s
Received healthy response to inference request in 1.4502842426300049s
Received healthy response to inference request in 1.8148295879364014s
Received healthy response to inference request in 1.538421869277954s
2026-03-24T20:43:50.057631+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
Received healthy response to inference request in 61.18014073371887s
Received healthy response to inference request in 1.8028039932250977s
Received healthy response to inference request in 1.6426434516906738s
Received healthy response to inference request in 1.5643634796142578s
2026-03-24T20:44:50.240090+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
Received healthy response to inference request in 61.21157240867615s
Received healthy response to inference request in 4.03321647644043s
Received healthy response to inference request in 1.5163898468017578s
2026-03-24T20:45:50.446687+00:00 monitor updated for qwen-qwen3-5-35b-a3b_v49
Received healthy response to inference request in 1.8452041149139404s
Received healthy response to inference request in 1.3265972137451172s
Received healthy response to inference request in 1.8782291412353516s
Received healthy response to inference request in 1.7232770919799805s
Received healthy response to inference request in 1.6055772304534912s
Received healthy response to inference request in 1.6531085968017578s
Received healthy response to inference request in 2.2951390743255615s
Received healthy response to inference request in 1.7604544162750244s
Received healthy response to inference request in 1.818312168121338s
Received healthy response to inference request in 1.904552936553955s
Received healthy response to inference request in 2.4192724227905273s
Received healthy response to inference request in 1.3603532314300537s
30 requests
0 failed requests
5th percentile: 1.2599093317985535
10th percentile: 1.3255042791366578
20th percentile: 1.4410880565643311
30th percentile: 1.5565809965133668
40th percentile: 1.6044267177581788
50th percentile: 1.6478760242462158
60th percentile: 1.738148021697998
70th percentile: 1.8158743619918822
80th percentile: 1.8834939002990723
90th percentile: 2.58066682815552
95th percentile: 35.46402481794341
99th percentile: 61.202457222938534
mean time: 5.698344389597575
Pipeline stage StressChecker completed in 389.93s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.28s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b_v49 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b_v49 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b_v49 status is now torndown due to DeploymentManager action