Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-plc-30375-v2-uploader
Waiting for job on chaiml-pony-d3a-mv1-plc-30375-v2-uploader to finish
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: Using quantization_mode: none
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep1g8...
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: Downloaded in 23.834s
2026-03-25T14:56:13.260083+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: Processed model ChaiML/pony-d3a-mv1-plc-q35b-lr5e6ep1g8 in 50.173s
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/chat_template.jinja
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/generation_config.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/README.md
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/args.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/processor_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/processor_config.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/config.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/added_tokens.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/preprocessor_config.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/.gitattributes
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/tokenizer_config.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model.safetensors.index.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/special_tokens_map.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/vocab.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/merges.txt
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/tokenizer.json
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00016-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00016-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00007-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00007-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00013-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00013-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00010-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00010-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00004-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00004-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00006-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00006-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00012-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00012-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00003-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00003-of-00016.safetensors
chaiml-pony-d3a-mv1-plc-30375-v2-uploader: cp /dev/shm/model_output/model-00015-of-00016.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-30375-v2/default/model-00015-of-00016.safetensors
Job chaiml-pony-d3a-mv1-plc-30375-v2-uploader completed after 82.87s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-plc-30375-v2-uploader
Pipeline stage VLLMUploader completed in 83.53s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 6.30s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-plc-30375-v2
Waiting for inference service chaiml-pony-d3a-mv1-plc-30375-v2 to be ready
2026-03-25T14:57:13.346816+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
2026-03-25T14:58:13.444478+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
2026-03-25T14:59:13.544045+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
Inference service chaiml-pony-d3a-mv1-plc-30375-v2 ready after 170.29208517074585s
Pipeline stage VLLMDeployer completed in 170.73s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T15:00:13.648493+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T15:01:13.749090+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 6.399854421615601s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.7767982482910156s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T15:02:13.862215+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 7.674105882644653s
Received healthy response to inference request in 6.815012454986572s
Received healthy response to inference request in 1.79490327835083s
Received healthy response to inference request in 7.097959041595459s
Received healthy response to inference request in 1.8703458309173584s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.166020154953003s
2026-03-25T15:03:13.956910+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
Received healthy response to inference request in 11.367148160934448s
Received healthy response to inference request in 1.5510406494140625s
Received healthy response to inference request in 2.1137592792510986s
Received healthy response to inference request in 2.203639030456543s
Received healthy response to inference request in 1.585148572921753s
Received healthy response to inference request in 1.4151520729064941s
Received healthy response to inference request in 2.5649354457855225s
Received healthy response to inference request in 1.4688184261322021s
Received healthy response to inference request in 1.8741233348846436s
Received healthy response to inference request in 1.5252902507781982s
Received healthy response to inference request in 1.7894916534423828s
Received healthy response to inference request in 1.6070823669433594s
Received healthy response to inference request in 1.5106515884399414s
30 requests
9 failed requests
5th percentile: 1.4876433491706849
10th percentile: 1.5238263845443725
20th percentile: 1.602695608139038
30th percentile: 1.7932797908782958
40th percentile: 2.017904901504517
50th percentile: 2.3842872381210327
60th percentile: 6.928191089630126
70th percentile: 13.992697334289526
80th percentile: 20.13433256149292
90th percentile: 20.223541951179506
95th percentile: 20.29946622848511
99th percentile: 20.557339942455293
mean time: 8.345533323287963
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9223644733428955s
Received healthy response to inference request in 1.9425773620605469s
Received healthy response to inference request in 1.3694615364074707s
Received healthy response to inference request in 1.37158203125s
Received healthy response to inference request in 1.4506216049194336s
Received healthy response to inference request in 1.4309110641479492s
Received healthy response to inference request in 1.522597074508667s
Received healthy response to inference request in 1.4209132194519043s
Received healthy response to inference request in 1.371183156967163s
Received healthy response to inference request in 1.5364813804626465s
Received healthy response to inference request in 1.878957986831665s
Received healthy response to inference request in 1.422633171081543s
Received healthy response to inference request in 1.4628455638885498s
Received healthy response to inference request in 1.4172475337982178s
Received healthy response to inference request in 1.3982527256011963s
Received healthy response to inference request in 1.4625213146209717s
2026-03-25T15:04:14.069354+00:00 monitor updated for chaiml-pony-d3a-mv1-plc_30375_v2
Received healthy response to inference request in 2.4705286026000977s
Received healthy response to inference request in 1.539738416671753s
Received healthy response to inference request in 1.603269100189209s
Received healthy response to inference request in 1.4029052257537842s
Received healthy response to inference request in 2.463808298110962s
Received healthy response to inference request in 2.1178722381591797s
Received healthy response to inference request in 1.8885612487792969s
Received healthy response to inference request in 1.6517894268035889s
Received healthy response to inference request in 1.4882230758666992s
Received healthy response to inference request in 1.473557472229004s
Received healthy response to inference request in 1.4195070266723633s
Received healthy response to inference request in 1.4483692646026611s
Received healthy response to inference request in 1.599362850189209s
Received healthy response to inference request in 1.7666101455688477s
30 requests
0 failed requests
5th percentile: 1.3713626503944396
10th percentile: 1.3955856561660767
20th percentile: 1.419055128097534
30th percentile: 1.4284276962280273
40th percentile: 1.4577614307403564
50th percentile: 1.4808902740478516
60th percentile: 1.5377841949462892
70th percentile: 1.6178251981735228
80th percentile: 1.8808786392211914
90th percentile: 1.9601068496704104
95th percentile: 2.3081370711326588
99th percentile: 2.468579714298248
mean time: 1.6238417863845824
Pipeline stage StressChecker completed in 304.79s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.08s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-plc_30375_v2 status is now deployed due to DeploymentManager action
chaiml-pony-d3a-mv1-plc_30375_v2 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3a-mv1-plc_30375_v2 status is now torndown due to DeploymentManager action