Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-win-84391-v7-uploader
Waiting for job on chaiml-pony-d3b-mv1-win-84391-v7-uploader to finish
chaiml-pony-d3b-mv1-win-84391-v7-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-win-84391-v7-uploader: Checking if ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-win-84391-v7-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-win-84391-v7-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8...
2026-03-28T14:49:39.657766+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
chaiml-pony-d3b-mv1-win-84391-v7-uploader: Downloaded in 32.973s
chaiml-pony-d3b-mv1-win-84391-v7-uploader: Processed model ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8 in 35.451s
chaiml-pony-d3b-mv1-win-84391-v7-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-win-84391-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v7-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-win-84391-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-win-84391-v7-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-win-84391-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v7-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v7-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v7-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-win-84391-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v7-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-win-84391-v7-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-win-84391-v7-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v7-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-win-84391-v7-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-win-84391-v7-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-win-84391-v7-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default
chaiml-pony-d3b-mv1-win-84391-v7-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default/generation_config.json
chaiml-pony-d3b-mv1-win-84391-v7-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default/chat_template.jinja
chaiml-pony-d3b-mv1-win-84391-v7-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default/config.json
chaiml-pony-d3b-mv1-win-84391-v7-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default/tokenizer_config.json
chaiml-pony-d3b-mv1-win-84391-v7-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default/.gitattributes
chaiml-pony-d3b-mv1-win-84391-v7-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default/recipe.yaml
chaiml-pony-d3b-mv1-win-84391-v7-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default/tokenizer.json
2026-03-28T14:50:39.743255+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
chaiml-pony-d3b-mv1-win-84391-v7-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v7/default/model.safetensors
Job chaiml-pony-d3b-mv1-win-84391-v7-uploader completed after 123.26s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-win-84391-v7-uploader
Pipeline stage VLLMUploader completed in 123.71s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.88s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-win-84391-v7
Waiting for inference service chaiml-pony-d3b-mv1-win-84391-v7 to be ready
Failed to get request counts for guanaco-submitter. Falling back to default
Failed to get response for submission chaiml-pony-d3-g46-pv2-l_7830_v2: ('http://chaiml-pony-d3-g46-pv2-l-7830-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-28T14:51:39.838764+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
2026-03-28T14:52:39.932393+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
Failed to get response for submission chaiml-pony-d3a-mv1-plc-_5598_v2: ('http://chaiml-pony-d3a-mv1-plc-5598-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'request timeout')
2026-03-28T14:53:40.033409+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
Inference service chaiml-pony-d3b-mv1-win-84391-v7 ready after 190.61823081970215s
Pipeline stage VLLMDeployer completed in 191.11s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:54:40.122118+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:55:40.210991+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
Received healthy response to inference request in 5.8474109172821045s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.724545001983643s
Received healthy response to inference request in 1.5832059383392334s
Received healthy response to inference request in 1.313965082168579s
Received healthy response to inference request in 5.922857999801636s
Received healthy response to inference request in 1.3518242835998535s
2026-03-28T14:56:40.298107+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.5764825344085693s
Received healthy response to inference request in 1.2778284549713135s
Received healthy response to inference request in 5.806933403015137s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.3618943691253662s
Received healthy response to inference request in 1.2650210857391357s
Received healthy response to inference request in 1.7848143577575684s
Received healthy response to inference request in 1.3083581924438477s
Received healthy response to inference request in 1.8279039859771729s
2026-03-28T14:57:40.391850+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
Received healthy response to inference request in 17.56202793121338s
Received healthy response to inference request in 1.423954725265503s
Received healthy response to inference request in 1.3210692405700684s
Received healthy response to inference request in 1.4432611465454102s
Received healthy response to inference request in 1.3018138408660889s
Received healthy response to inference request in 1.3079710006713867s
Received healthy response to inference request in 1.4168367385864258s
30 requests
9 failed requests
5th percentile: 1.2886218786239625
10th percentile: 1.307355284690857
20th percentile: 1.3196484088897704
30th percentile: 1.400354027748108
40th percentile: 1.5231939792633058
50th percentile: 1.8063591718673706
60th percentile: 5.823124408721924
70th percentile: 18.323200440406794
80th percentile: 20.112390279769897
90th percentile: 20.129192662239074
95th percentile: 20.158578395843506
99th percentile: 20.25429950475693
mean time: 8.16720274289449
%s, retrying in %s seconds...
Received healthy response to inference request in 1.8425140380859375s
Received healthy response to inference request in 1.390730381011963s
Received healthy response to inference request in 1.5476391315460205s
Received healthy response to inference request in 1.5580737590789795s
Received healthy response to inference request in 1.3399691581726074s
Received healthy response to inference request in 1.3204357624053955s
Received healthy response to inference request in 1.5115668773651123s
Received healthy response to inference request in 1.2691478729248047s
Received healthy response to inference request in 1.2679777145385742s
Received healthy response to inference request in 2.164940118789673s
Received healthy response to inference request in 1.3413524627685547s
Received healthy response to inference request in 1.238349437713623s
Received healthy response to inference request in 1.3210208415985107s
Received healthy response to inference request in 1.3370895385742188s
Received healthy response to inference request in 1.3521523475646973s
Received healthy response to inference request in 1.604283094406128s
Received healthy response to inference request in 1.297940731048584s
Received healthy response to inference request in 1.3217222690582275s
Received healthy response to inference request in 1.3779816627502441s
Received healthy response to inference request in 2.056396484375s
Received healthy response to inference request in 1.4359962940216064s
Received healthy response to inference request in 1.3199448585510254s
Received healthy response to inference request in 1.5777921676635742s
2026-03-28T14:58:40.485151+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v7
Received healthy response to inference request in 1.2775990962982178s
Received healthy response to inference request in 1.5218837261199951s
Received healthy response to inference request in 1.4032366275787354s
Received healthy response to inference request in 1.3371589183807373s
Received healthy response to inference request in 1.3211479187011719s
Received healthy response to inference request in 1.369128942489624s
Received healthy response to inference request in 1.2940313816070557s
30 requests
0 failed requests
5th percentile: 1.2685042858123778
10th percentile: 1.2767539739608764
20th percentile: 1.315544033050537
30th percentile: 1.3211097955703734
40th percentile: 1.3371311664581298
50th percentile: 1.346752405166626
60th percentile: 1.3830811500549316
70th percentile: 1.458667469024658
80th percentile: 1.5497260570526123
90th percentile: 1.6281061887741093
95th percentile: 1.9601493835449213
99th percentile: 2.133462464809418
mean time: 1.44397345383962
Pipeline stage StressChecker completed in 293.43s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.98s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-win_84391_v7 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-win_84391_v7 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-win_84391_v7 status is now torndown due to DeploymentManager action