Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-win-84391-v5-uploader
Waiting for job on chaiml-pony-d3b-mv1-win-84391-v5-uploader to finish
chaiml-pony-d3b-mv1-win-84391-v5-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-win-84391-v5-uploader: Checking if ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-win-84391-v5-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-win-84391-v5-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8-FP8...
chaiml-pony-d3b-mv1-win-84391-v5-uploader: Downloaded in 37.951s
chaiml-pony-d3b-mv1-win-84391-v5-uploader: Processed model ChaiML/pony-d3b-mv1-winall-q35b-lr5e6ep2g8 in 40.436s
chaiml-pony-d3b-mv1-win-84391-v5-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-win-84391-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-win-84391-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-win-84391-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-win-84391-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-win-84391-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-win-84391-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-win-84391-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-win-84391-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-win-84391-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-win-84391-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-win-84391-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-win-84391-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v5/default
chaiml-pony-d3b-mv1-win-84391-v5-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v5/default/.gitattributes
chaiml-pony-d3b-mv1-win-84391-v5-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v5/default/chat_template.jinja
chaiml-pony-d3b-mv1-win-84391-v5-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v5/default/generation_config.json
chaiml-pony-d3b-mv1-win-84391-v5-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v5/default/tokenizer_config.json
chaiml-pony-d3b-mv1-win-84391-v5-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v5/default/recipe.yaml
chaiml-pony-d3b-mv1-win-84391-v5-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v5/default/config.json
2026-03-28T03:49:33.218487+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
chaiml-pony-d3b-mv1-win-84391-v5-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-win-84391-v5/default/model.safetensors
Job chaiml-pony-d3b-mv1-win-84391-v5-uploader completed after 112.42s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-win-84391-v5-uploader
Pipeline stage VLLMUploader completed in 112.85s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.12s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-win-84391-v5
Waiting for inference service chaiml-pony-d3b-mv1-win-84391-v5 to be ready
2026-03-28T03:50:33.352681+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
2026-03-28T03:51:34.060441+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
2026-03-28T03:52:34.160981+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
Inference service chaiml-pony-d3b-mv1-win-84391-v5 ready after 180.4045958518982s
Pipeline stage VLLMDeployer completed in 180.93s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-28T03:53:34.255068+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T03:54:34.344130+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 11.826644897460938s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 3.737553834915161s
2026-03-28T03:55:34.433716+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
Received healthy response to inference request in 1.5074939727783203s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.1658782958984375s
Received healthy response to inference request in 3.6143171787261963s
Received healthy response to inference request in 1.2418339252471924s
Received healthy response to inference request in 1.46793532371521s
Received healthy response to inference request in 1.093752145767212s
Received healthy response to inference request in 1.855555772781372s
Received healthy response to inference request in 1.1004970073699951s
Received healthy response to inference request in 3.5784971714019775s
Received healthy response to inference request in 1.1029698848724365s
2026-03-28T03:56:34.531381+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
Received healthy response to inference request in 15.0485999584198s
Received healthy response to inference request in 1.1178040504455566s
Received healthy response to inference request in 1.252697229385376s
Received healthy response to inference request in 1.0170416831970215s
Received healthy response to inference request in 1.4298481941223145s
Received healthy response to inference request in 1.0784103870391846s
Received healthy response to inference request in 1.0725462436676025s
Received healthy response to inference request in 1.088587999343872s
Received healthy response to inference request in 1.0685200691223145s
Received healthy response to inference request in 1.2346854209899902s
Received healthy response to inference request in 1.221001386642456s
30 requests
7 failed requests
5th percentile: 1.0703318476676942
10th percentile: 1.0778239727020265
20th percentile: 1.0991480350494385
30th percentile: 1.1514560222625732
40th percentile: 1.2389745235443115
50th percentile: 1.4488917589187622
60th percentile: 2.5447323322296116
70th percentile: 6.164281153678871
80th percentile: 20.117645502090454
90th percentile: 20.142646050453187
95th percentile: 20.15635907649994
99th percentile: 25.46093934297562
mean time: 6.946200021107992
%s, retrying in %s seconds...
Received healthy response to inference request in 1.1085155010223389s
Received healthy response to inference request in 1.149404525756836s
Received healthy response to inference request in 1.0200812816619873s
Received healthy response to inference request in 1.042571783065796s
Received healthy response to inference request in 2.3038508892059326s
Received healthy response to inference request in 1.1457812786102295s
Received healthy response to inference request in 1.1002402305603027s
Received healthy response to inference request in 1.2939414978027344s
Received healthy response to inference request in 1.1537635326385498s
Received healthy response to inference request in 1.2525908946990967s
Received healthy response to inference request in 1.0888926982879639s
Received healthy response to inference request in 1.6895265579223633s
Received healthy response to inference request in 1.2418475151062012s
Received healthy response to inference request in 1.1816747188568115s
Received healthy response to inference request in 1.1980321407318115s
Received healthy response to inference request in 1.4838058948516846s
Received healthy response to inference request in 1.0716512203216553s
Received healthy response to inference request in 1.078859806060791s
Received healthy response to inference request in 1.4956395626068115s
Received healthy response to inference request in 1.3582875728607178s
Received healthy response to inference request in 1.1504151821136475s
Received healthy response to inference request in 1.0677189826965332s
Received healthy response to inference request in 1.2635648250579834s
Received healthy response to inference request in 1.5760629177093506s
Received healthy response to inference request in 1.1428322792053223s
2026-03-28T03:57:34.620885+00:00 monitor updated for chaiml-pony-d3b-mv1-win_84391_v5
Received healthy response to inference request in 1.2528495788574219s
Received healthy response to inference request in 1.3173480033874512s
Received healthy response to inference request in 1.0679121017456055s
Received healthy response to inference request in 1.116783857345581s
Received healthy response to inference request in 1.5212123394012451s
30 requests
0 failed requests
5th percentile: 1.0538880228996277
10th percentile: 1.0678927898406982
20th percentile: 1.0868861198425293
30th percentile: 1.1143033504486084
40th percentile: 1.1479552268981934
50th percentile: 1.1677191257476807
60th percentile: 1.2461448669433595
70th percentile: 1.2726778268814085
80th percentile: 1.3833912372589114
90th percentile: 1.5266973972320557
95th percentile: 1.6384679198265073
99th percentile: 2.125696833133698
mean time: 1.2645219723383585
Pipeline stage StressChecker completed in 253.13s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.37s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-win_84391_v5 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-win_84391_v5 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-win_84391_v5 status is now torndown due to DeploymentManager action