Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-wi-84391-v14-uploader
Waiting for job on chaiml-pony-d3b-mv1-wi-84391-v14-uploader to finish
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: Checking if ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8...
2026-03-29T00:02:00.726152+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: Downloaded in 33.358s
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: Processed model ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8 in 36.207s
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default/.gitattributes
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default/config.json
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default/chat_template.jinja
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default/tokenizer_config.json
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default/recipe.yaml
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default/generation_config.json
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default/tokenizer.json
2026-03-29T00:03:00.815448+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
chaiml-pony-d3b-mv1-wi-84391-v14-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v14/default/model.safetensors
Job chaiml-pony-d3b-mv1-wi-84391-v14-uploader completed after 143.81s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-wi-84391-v14-uploader
Pipeline stage VLLMUploader completed in 144.30s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.56s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-wi-84391-v14
Waiting for inference service chaiml-pony-d3b-mv1-wi-84391-v14 to be ready
2026-03-29T00:04:00.913586+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
2026-03-29T00:05:01.010526+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
2026-03-29T00:06:01.104901+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
2026-03-29T00:07:01.204516+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
Inference service chaiml-pony-d3b-mv1-wi-84391-v14 ready after 240.63795828819275s
Pipeline stage VLLMDeployer completed in 241.23s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-29T00:08:01.305895+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-29T00:09:01.399044+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.765625953674316s
Received healthy response to inference request in 7.044789552688599s
Received healthy response to inference request in 2.677455425262451s
Received healthy response to inference request in 6.1334922313690186s
Received healthy response to inference request in 6.410324335098267s
Received healthy response to inference request in 3.352574348449707s
2026-03-29T00:10:01.496727+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
Received healthy response to inference request in 5.596423625946045s
Received healthy response to inference request in 3.9219248294830322s
Received healthy response to inference request in 2.4112308025360107s
Received healthy response to inference request in 3.023050546646118s
Received healthy response to inference request in 2.6760098934173584s
Received healthy response to inference request in 2.4790265560150146s
Received healthy response to inference request in 2.8050119876861572s
Received healthy response to inference request in 2.242143154144287s
Received healthy response to inference request in 2.320197105407715s
Received healthy response to inference request in 2.5211400985717773s
Received healthy response to inference request in 2.2472169399261475s
Received healthy response to inference request in 2.4493377208709717s
Received healthy response to inference request in 2.406710624694824s
Received healthy response to inference request in 2.3526885509490967s
Received healthy response to inference request in 2.2254645824432373s
Received healthy response to inference request in 2.27340030670166s
Received healthy response to inference request in 2.2808680534362793s
2026-03-29T00:11:01.596127+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
Received healthy response to inference request in 2.482814311981201s
30 requests
6 failed requests
5th percentile: 2.244426357746124
10th percentile: 2.270781970024109
20th percentile: 2.34619026184082
30th percentile: 2.4379056453704835
40th percentile: 2.505809783935547
50th percentile: 2.741233706474304
60th percentile: 3.580314540863036
70th percentile: 5.875985836982726
80th percentile: 9.660415697097815
90th percentile: 20.12778151035309
95th percentile: 20.137190878391266
99th percentile: 20.188642501831055
mean time: 6.698571825027466
%s, retrying in %s seconds...
Received healthy response to inference request in 2.3843348026275635s
Received healthy response to inference request in 2.1193485260009766s
Received healthy response to inference request in 2.124582052230835s
Received healthy response to inference request in 2.1410727500915527s
Received healthy response to inference request in 2.1728668212890625s
Received healthy response to inference request in 2.160006046295166s
Received healthy response to inference request in 2.1157383918762207s
Received healthy response to inference request in 2.518129587173462s
Received healthy response to inference request in 2.2320566177368164s
Received healthy response to inference request in 2.2199347019195557s
Received healthy response to inference request in 2.4010934829711914s
Received healthy response to inference request in 2.2793638706207275s
Received healthy response to inference request in 2.359125852584839s
Received healthy response to inference request in 2.6602256298065186s
Received healthy response to inference request in 2.713758707046509s
Received healthy response to inference request in 2.449352741241455s
Received healthy response to inference request in 2.2921619415283203s
Received healthy response to inference request in 2.2021830081939697s
Received healthy response to inference request in 2.2902333736419678s
Received healthy response to inference request in 2.2347402572631836s
Received healthy response to inference request in 2.237457513809204s
Received healthy response to inference request in 2.333099842071533s
Received healthy response to inference request in 2.2712817192077637s
Received healthy response to inference request in 2.7170159816741943s
2026-03-29T00:12:01.692391+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v14
Received healthy response to inference request in 2.2321159839630127s
Received healthy response to inference request in 2.2996020317077637s
Received healthy response to inference request in 2.250608205795288s
Received healthy response to inference request in 2.2333381175994873s
Received healthy response to inference request in 2.2741410732269287s
Received healthy response to inference request in 2.2747676372528076s
30 requests
0 failed requests
5th percentile: 2.1217036128044127
10th percentile: 2.139423680305481
20th percentile: 2.1963197708129885
30th percentile: 2.232098174095154
40th percentile: 2.2363706111907957
50th percentile: 2.272711396217346
60th percentile: 2.2837116718292236
70th percentile: 2.3096513748168945
80th percentile: 2.3876865386962893
90th percentile: 2.5323391914367677
95th percentile: 2.689668822288513
99th percentile: 2.7160713720321654
mean time: 2.3064579089482624
Pipeline stage StressChecker completed in 287.71s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.75s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-wi_84391_v14 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-wi_84391_v14 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-wi_84391_v14 status is now torndown due to DeploymentManager action