Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-wi-84391-v11-uploader
Waiting for job on chaiml-pony-d3b-mv1-wi-84391-v11-uploader to finish
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: Checking if ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8...
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: Downloaded in 34.153s
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: Processed model ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8 in 36.961s
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default/chat_template.jinja
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default/recipe.yaml
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default/generation_config.json
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default/.gitattributes
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default/tokenizer_config.json
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default/config.json
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default/tokenizer.json
2026-03-28T20:53:42.034395+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
chaiml-pony-d3b-mv1-wi-84391-v11-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-wi-84391-v11/default/model.safetensors
Job chaiml-pony-d3b-mv1-wi-84391-v11-uploader completed after 112.51s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-wi-84391-v11-uploader
Pipeline stage VLLMUploader completed in 112.96s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 3.52s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-wi-84391-v11
Waiting for inference service chaiml-pony-d3b-mv1-wi-84391-v11 to be ready
2026-03-28T20:54:42.118463+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
2026-03-28T20:55:42.205239+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
2026-03-28T20:56:42.290117+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
Inference service chaiml-pony-d3b-mv1-wi-84391-v11 ready after 160.35422348976135s
Pipeline stage VLLMDeployer completed in 160.86s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T20:57:42.375139+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T20:58:42.714594+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
Received healthy response to inference request in 6.1554107666015625s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.995059251785278s
Received healthy response to inference request in 6.629352331161499s
Received healthy response to inference request in 5.9567482471466064s
Received healthy response to inference request in 2.787306070327759s
Received healthy response to inference request in 2.308591842651367s
Received healthy response to inference request in 2.430299758911133s
Received healthy response to inference request in 2.786400556564331s
Received healthy response to inference request in 2.310145616531372s
Received healthy response to inference request in 2.9910314083099365s
2026-03-28T20:59:42.802762+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
Received healthy response to inference request in 2.836399555206299s
Received healthy response to inference request in 2.4461328983306885s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 12.852012872695923s
2026-03-28T21:00:42.899557+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
Received healthy response to inference request in 3.0843167304992676s
Received healthy response to inference request in 2.2359931468963623s
Received healthy response to inference request in 2.7058675289154053s
Received healthy response to inference request in 2.5147504806518555s
Received healthy response to inference request in 2.469071865081787s
Received healthy response to inference request in 2.3550631999969482s
Received healthy response to inference request in 2.254240036010742s
Received healthy response to inference request in 2.294656753540039s
Received healthy response to inference request in 3.0077292919158936s
Received healthy response to inference request in 2.315053701400757s
30 requests
7 failed requests
5th percentile: 2.272427558898926
10th percentile: 2.307198333740234
20th percentile: 2.34706130027771
30th percentile: 2.4621901750564574
40th percentile: 2.7541873455047607
50th percentile: 2.9137154817581177
60th percentile: 4.233289337158199
70th percentile: 6.297593235969542
80th percentile: 20.106862545013428
90th percentile: 20.12515823841095
95th percentile: 20.134119617938996
99th percentile: 20.14654236793518
mean time: 7.486656427383423
%s, retrying in %s seconds...
Received healthy response to inference request in 2.1511991024017334s
Received healthy response to inference request in 2.0793659687042236s
Received healthy response to inference request in 2.247436046600342s
Received healthy response to inference request in 2.099492311477661s
Received healthy response to inference request in 2.14627742767334s
Received healthy response to inference request in 2.6997556686401367s
Received healthy response to inference request in 2.2065041065216064s
Received healthy response to inference request in 2.1704866886138916s
Received healthy response to inference request in 2.178297996520996s
Received healthy response to inference request in 2.8149361610412598s
Received healthy response to inference request in 2.1836142539978027s
Received healthy response to inference request in 2.183446168899536s
Received healthy response to inference request in 2.263045310974121s
Received healthy response to inference request in 2.1703004837036133s
Received healthy response to inference request in 2.1415622234344482s
2026-03-28T21:01:43.007193+00:00 monitor updated for chaiml-pony-d3b-mv1-wi_84391_v11
Received healthy response to inference request in 2.193063259124756s
Received healthy response to inference request in 2.2132182121276855s
Received healthy response to inference request in 2.271867275238037s
Received healthy response to inference request in 2.4651646614074707s
Received healthy response to inference request in 2.221964120864868s
Received healthy response to inference request in 2.2408230304718018s
Received healthy response to inference request in 2.3386881351470947s
Received healthy response to inference request in 2.211432456970215s
Received healthy response to inference request in 2.250056028366089s
Received healthy response to inference request in 2.4210219383239746s
Received healthy response to inference request in 2.2448225021362305s
Received healthy response to inference request in 2.2357051372528076s
Received healthy response to inference request in 2.285353660583496s
Received healthy response to inference request in 2.260219097137451s
Received healthy response to inference request in 2.4699552059173584s
30 requests
0 failed requests
5th percentile: 2.1184237718582155
10th percentile: 2.145805907249451
20th percentile: 2.170449447631836
30th percentile: 2.1835638284683228
40th percentile: 2.2094611167907714
50th percentile: 2.228834629058838
60th percentile: 2.245867919921875
70th percentile: 2.261066961288452
80th percentile: 2.296020555496216
90th percentile: 2.4656437158584597
95th percentile: 2.5963454604148857
99th percentile: 2.781533818244934
mean time: 2.2686358213424684
Pipeline stage StressChecker completed in 297.94s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.13s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-wi_84391_v11 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-wi_84391_v11 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-wi_84391_v11 status is now torndown due to DeploymentManager action