Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-top-9386-v12-uploader
Waiting for job on chaiml-pony-d3b-mv1-top-9386-v12-uploader to finish
chaiml-pony-d3b-mv1-top-9386-v12-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-top-9386-v12-uploader: Checking if ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-top-9386-v12-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-top-9386-v12-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8-FP8...
2026-03-29T00:02:01.803496+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
chaiml-pony-d3b-mv1-top-9386-v12-uploader: Downloaded in 33.771s
chaiml-pony-d3b-mv1-top-9386-v12-uploader: Processed model ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8 in 36.583s
chaiml-pony-d3b-mv1-top-9386-v12-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-top-9386-v12-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top-9386-v12-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-top-9386-v12-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-top-9386-v12-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-top-9386-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top-9386-v12-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-top-9386-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top-9386-v12-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-top-9386-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top-9386-v12-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-top-9386-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top-9386-v12-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-top-9386-v12-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-top-9386-v12-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-top-9386-v12-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-top-9386-v12-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-top-9386-v12-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-top-9386-v12-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default
chaiml-pony-d3b-mv1-top-9386-v12-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default/config.json
chaiml-pony-d3b-mv1-top-9386-v12-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default/.gitattributes
chaiml-pony-d3b-mv1-top-9386-v12-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default/recipe.yaml
chaiml-pony-d3b-mv1-top-9386-v12-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default/chat_template.jinja
chaiml-pony-d3b-mv1-top-9386-v12-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default/generation_config.json
chaiml-pony-d3b-mv1-top-9386-v12-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default/tokenizer_config.json
chaiml-pony-d3b-mv1-top-9386-v12-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default/tokenizer.json
2026-03-29T00:03:01.895682+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
chaiml-pony-d3b-mv1-top-9386-v12-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top-9386-v12/default/model.safetensors
Job chaiml-pony-d3b-mv1-top-9386-v12-uploader completed after 143.41s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-top-9386-v12-uploader
Pipeline stage VLLMUploader completed in 143.87s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.65s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-top-9386-v12
Waiting for inference service chaiml-pony-d3b-mv1-top-9386-v12 to be ready
2026-03-29T00:04:01.994726+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
2026-03-29T00:05:02.088175+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
2026-03-29T00:06:02.301357+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
2026-03-29T00:07:02.400910+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
Inference service chaiml-pony-d3b-mv1-top-9386-v12 ready after 230.6931290626526s
Pipeline stage VLLMDeployer completed in 231.16s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-29T00:08:02.515422+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.9624969959259033s
2026-03-29T00:09:02.605936+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.23984432220459s
Received healthy response to inference request in 3.057446002960205s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.7810485363006592s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.331671953201294s
Received healthy response to inference request in 4.050152063369751s
Received healthy response to inference request in 3.065455436706543s
2026-03-29T00:10:02.698008+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
Received healthy response to inference request in 1.7084059715270996s
Received healthy response to inference request in 2.216404676437378s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.413785219192505s
Received healthy response to inference request in 1.7154664993286133s
Received healthy response to inference request in 2.2972521781921387s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.9086973667144775s
Received healthy response to inference request in 2.2517881393432617s
2026-03-29T00:11:02.793570+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
Received healthy response to inference request in 1.9604461193084717s
Received healthy response to inference request in 1.7847907543182373s
Received healthy response to inference request in 2.303366184234619s
Received healthy response to inference request in 1.7122249603271484s
Received healthy response to inference request in 1.7898566722869873s
Received healthy response to inference request in 1.777561902999878s
Received healthy response to inference request in 2.3669967651367188s
30 requests
9 failed requests
5th percentile: 1.7136836528778077
10th percentile: 1.7713523626327514
20th percentile: 1.7888434886932374
30th percentile: 2.139617109298706
40th percentile: 2.300920581817627
50th percentile: 2.390390992164612
60th percentile: 3.4242720603942858
70th percentile: 8.9986706256866
80th percentile: 20.117841958999634
90th percentile: 20.12256932258606
95th percentile: 20.125707268714905
99th percentile: 23.42556733131409
mean time: 7.880367151896158
%s, retrying in %s seconds...
Received healthy response to inference request in 1.6679694652557373s
Received healthy response to inference request in 1.6533336639404297s
Received healthy response to inference request in 1.745863676071167s
Received healthy response to inference request in 1.8049402236938477s
Received healthy response to inference request in 1.812978982925415s
Received healthy response to inference request in 1.9840381145477295s
Received healthy response to inference request in 1.746314287185669s
Received healthy response to inference request in 1.8443598747253418s
Received healthy response to inference request in 1.7464642524719238s
Received healthy response to inference request in 1.7187156677246094s
Received healthy response to inference request in 1.7548625469207764s
Received healthy response to inference request in 1.8211758136749268s
Received healthy response to inference request in 1.6966931819915771s
Received healthy response to inference request in 1.8366177082061768s
Received healthy response to inference request in 1.704411506652832s
Received healthy response to inference request in 2.1884605884552s
Received healthy response to inference request in 1.778768539428711s
Received healthy response to inference request in 1.7035462856292725s
Received healthy response to inference request in 1.6957204341888428s
Received healthy response to inference request in 1.9626195430755615s
Received healthy response to inference request in 1.8248450756072998s
Received healthy response to inference request in 1.6834006309509277s
Received healthy response to inference request in 2.227463960647583s
Received healthy response to inference request in 2.152616262435913s
2026-03-29T00:12:02.893922+00:00 monitor updated for chaiml-pony-d3b-mv1-top_9386_v12
Received healthy response to inference request in 1.658785104751587s
Received healthy response to inference request in 1.6869268417358398s
Received healthy response to inference request in 1.666210412979126s
Received healthy response to inference request in 1.8952834606170654s
Received healthy response to inference request in 1.7191221714019775s
Received healthy response to inference request in 1.8324966430664062s
30 requests
0 failed requests
5th percentile: 1.6621264934539794
10th percentile: 1.6677935600280762
20th percentile: 1.6939617156982423
30th percentile: 1.7041519403457641
40th percentile: 1.7351670742034913
50th percentile: 1.75066339969635
60th percentile: 1.8081557273864746
70th percentile: 1.8271405458450318
80th percentile: 1.8545445919036867
90th percentile: 2.000895929336548
95th percentile: 2.172330641746521
99th percentile: 2.216152982711792
mean time: 1.807166830698649
Pipeline stage StressChecker completed in 295.88s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.46s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-top_9386_v12 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-top_9386_v12 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-top_9386_v12 status is now torndown due to DeploymentManager action