Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-wi-84391-v10-uploader
Waiting for job on chaiml-pony-d3b-mv1-wi-84391-v10-uploader to finish
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: Checking if ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8...
2026-03-28T20:14:40.033481+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: Downloaded in 34.569s
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: Processed model ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8 in 37.352s
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default/chat_template.jinja
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default/tokenizer_config.json
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default/config.json
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default/.gitattributes
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default/recipe.yaml
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default/generation_config.json
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default/tokenizer.json
2026-03-28T20:15:40.125587+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
chaiml-pony-d3b-mv1-wi-84391-v10-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v10/default/model.safetensors
Job chaiml-pony-d3b-mv1-wi-84391-v10-uploader completed after 143.12s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-wi-84391-v10-uploader
Pipeline stage VLLMUploader completed in 143.57s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.56s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-wi-84391-v10
Waiting for inference service chaiml-pony-d3b-mv1-wi-84391-v10 to be ready
2026-03-28T20:16:40.211704+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
2026-03-28T20:17:40.302656+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
2026-03-28T20:18:40.402059+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
Inference service chaiml-pony-d3b-mv1-wi-84391-v10 ready after 210.60082936286926s
Pipeline stage VLLMDeployer completed in 211.09s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-28T20:19:40.494995+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 11.685126066207886s
2026-03-28T20:20:40.580504+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.339049339294434s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T20:21:40.679334+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
Received healthy response to inference request in 14.164968967437744s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.865381479263306s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.4774630069732666s
2026-03-28T20:22:40.774653+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
Received healthy response to inference request in 2.889397382736206s
Received healthy response to inference request in 2.3700153827667236s
Received healthy response to inference request in 2.24072003364563s
Received healthy response to inference request in 2.189911127090454s
Received healthy response to inference request in 2.7518746852874756s
Received healthy response to inference request in 2.2604565620422363s
Received healthy response to inference request in 6.51496148109436s
Received healthy response to inference request in 2.2772302627563477s
Received healthy response to inference request in 6.770885944366455s
Received healthy response to inference request in 2.2544660568237305s
Received healthy response to inference request in 2.7883529663085938s
Received healthy response to inference request in 2.908043384552002s
Received healthy response to inference request in 2.3290164470672607s
Received healthy response to inference request in 2.323636770248413s
Received healthy response to inference request in 2.4978790283203125s
Received healthy response to inference request in 2.34793758392334s
Received healthy response to inference request in 2.3706979751586914s
Received healthy response to inference request in 2.7399442195892334s
30 requests
7 failed requests
5th percentile: 2.246905744075775
10th percentile: 2.259857511520386
20th percentile: 2.327940511703491
30th percentile: 2.370493197441101
40th percentile: 2.7471024990081787
50th percentile: 2.898720383644104
60th percentile: 4.549582195281982
70th percentile: 8.24515798091887
80th percentile: 20.11183171272278
90th percentile: 20.136134147644043
95th percentile: 20.156346333026885
99th percentile: 20.20446546792984
mean time: 7.7456258058547975
%s, retrying in %s seconds...
Received healthy response to inference request in 2.1901185512542725s
Received healthy response to inference request in 2.2660398483276367s
Received healthy response to inference request in 2.1513004302978516s
2026-03-28T20:23:40.875267+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
Received healthy response to inference request in 2.3677473068237305s
Received healthy response to inference request in 2.1254374980926514s
Received healthy response to inference request in 2.1629817485809326s
Received healthy response to inference request in 2.145042896270752s
Received healthy response to inference request in 2.255835771560669s
Received healthy response to inference request in 2.1525769233703613s
Received healthy response to inference request in 2.2448463439941406s
Received healthy response to inference request in 2.6681456565856934s
Received healthy response to inference request in 2.1148838996887207s
Received healthy response to inference request in 2.2141778469085693s
Received healthy response to inference request in 2.1720364093780518s
Received healthy response to inference request in 2.1873044967651367s
Received healthy response to inference request in 2.1911444664001465s
Received healthy response to inference request in 2.182431221008301s
Received healthy response to inference request in 2.6473350524902344s
Received healthy response to inference request in 2.4689159393310547s
Received healthy response to inference request in 2.374211072921753s
Received healthy response to inference request in 2.3134617805480957s
Received healthy response to inference request in 2.4258594512939453s
Received healthy response to inference request in 2.3887977600097656s
Received healthy response to inference request in 2.535571575164795s
Received healthy response to inference request in 2.4236721992492676s
Received healthy response to inference request in 2.3227555751800537s
Received healthy response to inference request in 2.327958345413208s
Received healthy response to inference request in 2.345839738845825s
2026-03-28T20:24:40.980990+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v10
Received healthy response to inference request in 2.269763708114624s
Received healthy response to inference request in 2.4293487071990967s
30 requests
0 failed requests
5th percentile: 2.1342599272727965
10th percentile: 2.1506746768951417
20th percentile: 2.170225477218628
30th percentile: 2.1892743349075316
40th percentile: 2.2325789451599123
50th percentile: 2.2679017782211304
60th percentile: 2.3248366832733156
70th percentile: 2.3696864366531374
80th percentile: 2.4241096496582033
90th percentile: 2.475581502914429
95th percentile: 2.597041487693786
99th percentile: 2.6621105813980104
mean time: 2.302184740702311
Pipeline stage StressChecker completed in 307.02s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.82s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-wi_84391_v10 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-wi_84391_v10 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-wi_84391_v10 status is now torndown due to DeploymentManager action