Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-win-84391-v6-uploader
Waiting for job on chaiml-pony-d3b-mv1-win-84391-v6-uploader to finish
chaiml-pony-d3b-mv1-win-84391-v6-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-win-84391-v6-uploader: Checking if ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-win-84391-v6-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-win-84391-v6-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8...
2026-03-28T03:49:55.125736+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
chaiml-pony-d3b-mv1-win-84391-v6-uploader: Downloaded in 39.935s
chaiml-pony-d3b-mv1-win-84391-v6-uploader: Processed model ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8 in 42.441s
chaiml-pony-d3b-mv1-win-84391-v6-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-win-84391-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v6-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-win-84391-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-win-84391-v6-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-win-84391-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v6-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v6-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v6-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-win-84391-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v6-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-win-84391-v6-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-win-84391-v6-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v6-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-win-84391-v6-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-win-84391-v6-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-win-84391-v6-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default
chaiml-pony-d3b-mv1-win-84391-v6-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default/tokenizer_config.json
chaiml-pony-d3b-mv1-win-84391-v6-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default/config.json
chaiml-pony-d3b-mv1-win-84391-v6-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default/recipe.yaml
chaiml-pony-d3b-mv1-win-84391-v6-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default/generation_config.json
chaiml-pony-d3b-mv1-win-84391-v6-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default/.gitattributes
chaiml-pony-d3b-mv1-win-84391-v6-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default/chat_template.jinja
chaiml-pony-d3b-mv1-win-84391-v6-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default/tokenizer.json
2026-03-28T03:50:55.226378+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
chaiml-pony-d3b-mv1-win-84391-v6-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v6/default/model.safetensors
Job chaiml-pony-d3b-mv1-win-84391-v6-uploader completed after 163.62s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-win-84391-v6-uploader
Pipeline stage VLLMUploader completed in 164.10s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.79s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-win-84391-v6
Waiting for inference service chaiml-pony-d3b-mv1-win-84391-v6 to be ready
2026-03-28T03:51:55.366298+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
2026-03-28T03:52:55.469375+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
2026-03-28T03:53:55.560910+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
Inference service chaiml-pony-d3b-mv1-win-84391-v6 ready after 180.43097162246704s
Pipeline stage VLLMDeployer completed in 180.92s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-28T03:54:55.658426+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.881718635559082s
2026-03-28T03:55:55.752184+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
Received healthy response to inference request in 3.171421766281128s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.294853210449219s
Received healthy response to inference request in 2.4454312324523926s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.721825122833252s
Received healthy response to inference request in 2.108214855194092s
Received healthy response to inference request in 3.406998872756958s
Received healthy response to inference request in 2.2627785205841064s
2026-03-28T03:56:55.846921+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.1952085494995117s
Received healthy response to inference request in 1.6605682373046875s
Received healthy response to inference request in 1.664721965789795s
Received healthy response to inference request in 2.1910130977630615s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.273378849029541s
Received healthy response to inference request in 1.7197589874267578s
Received healthy response to inference request in 3.4449284076690674s
Received healthy response to inference request in 1.831468105316162s
Received healthy response to inference request in 2.0658271312713623s
2026-03-28T03:57:55.942948+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
Received healthy response to inference request in 2.1499621868133545s
Received healthy response to inference request in 2.0929808616638184s
Received healthy response to inference request in 1.7363932132720947s
Received healthy response to inference request in 1.8510138988494873s
Received healthy response to inference request in 1.9113032817840576s
Received healthy response to inference request in 1.8108196258544922s
30 requests
7 failed requests
5th percentile: 1.6894886255264283
10th percentile: 1.7216185092926026
20th percentile: 1.827338409423828
30th percentile: 2.019469976425171
40th percentile: 2.1332632541656493
50th percentile: 2.228993535041809
60th percentile: 2.735827445983886
70th percentile: 3.5759654760360706
80th percentile: 20.115349817276
90th percentile: 20.174841904640196
95th percentile: 20.284821140766145
99th percentile: 25.075208301544194
mean time: 6.732774162292481
%s, retrying in %s seconds...
Received healthy response to inference request in 1.880589485168457s
Received healthy response to inference request in 1.9359898567199707s
Received healthy response to inference request in 1.6149144172668457s
Received healthy response to inference request in 1.74796462059021s
Received healthy response to inference request in 1.6239185333251953s
Received healthy response to inference request in 1.636911392211914s
Received healthy response to inference request in 1.770951271057129s
Received healthy response to inference request in 1.748642921447754s
Received healthy response to inference request in 1.6850922107696533s
Received healthy response to inference request in 1.6384539604187012s
Received healthy response to inference request in 2.3425045013427734s
Received healthy response to inference request in 1.7801110744476318s
Received healthy response to inference request in 2.0874483585357666s
Received healthy response to inference request in 1.6508228778839111s
Received healthy response to inference request in 2.6211929321289062s
Received healthy response to inference request in 1.933781623840332s
Received healthy response to inference request in 2.034158945083618s
Received healthy response to inference request in 1.7288835048675537s
Received healthy response to inference request in 1.853259563446045s
Received healthy response to inference request in 1.7439682483673096s
Received healthy response to inference request in 1.6688671112060547s
Received healthy response to inference request in 2.2558305263519287s
Received healthy response to inference request in 1.853846549987793s
Received healthy response to inference request in 2.1926662921905518s
Received healthy response to inference request in 1.9570715427398682s
2026-03-28T03:58:56.033558+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v6
Received healthy response to inference request in 1.7995355129241943s
Received healthy response to inference request in 2.691068410873413s
Received healthy response to inference request in 1.7692797183990479s
Received healthy response to inference request in 1.7650012969970703s
Received healthy response to inference request in 2.108804702758789s
30 requests
0 failed requests
5th percentile: 1.6297653198242188
10th percentile: 1.6382997035980225
20th percentile: 1.6818471908569337
30th percentile: 1.74676570892334
40th percentile: 1.7675683498382568
50th percentile: 1.789823293685913
60th percentile: 1.8645437240600586
70th percentile: 1.9423143625259398
80th percentile: 2.091719627380371
90th percentile: 2.2644979238510135
95th percentile: 2.495783138275146
99th percentile: 2.670804522037506
mean time: 1.9040510654449463
Pipeline stage StressChecker completed in 265.17s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.76s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-win_84391_v6 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-win_84391_v6 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-win_84391_v6 status is now torndown due to DeploymentManager action