Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3a-q27b-lr-21575-v4-uploader
Waiting for job on chaiml-pony-v3a-q27b-lr-21575-v4-uploader to finish
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: Using quantization_mode: fp8
2026-03-31T05:01:02.180055+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: Downloaded in 40.591s
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: Processed model ChaiML/pony-v3a-q27b-lr5e6ep2g8 in 43.551s
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default/.gitattributes
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default/chat_template.jinja
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default/tokenizer_config.json
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default/config.json
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default/generation_config.json
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default/recipe.yaml
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default/tokenizer.json
chaiml-pony-v3a-q27b-lr-21575-v4-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v4/default/model.safetensors
Job chaiml-pony-v3a-q27b-lr-21575-v4-uploader completed after 113.48s with status: succeeded
Stopping job with name chaiml-pony-v3a-q27b-lr-21575-v4-uploader
Pipeline stage VLLMUploader completed in 113.91s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.40s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3a-q27b-lr-21575-v4
Waiting for inference service chaiml-pony-v3a-q27b-lr-21575-v4 to be ready
2026-03-31T05:02:04.379856+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
2026-03-31T05:03:04.487574+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
2026-03-31T05:04:04.577259+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
Inference service chaiml-pony-v3a-q27b-lr-21575-v4 ready after 160.21761322021484s
Pipeline stage VLLMDeployer completed in 160.64s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-31T05:05:04.668176+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-31T05:06:04.763046+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
Received healthy response to inference request in 19.678868055343628s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 16.323834896087646s
2026-03-31T05:07:04.882201+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.9895412921905518s
Received healthy response to inference request in 4.1283605098724365s
Received healthy response to inference request in 4.056355953216553s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.9490370750427246s
Received healthy response to inference request in 2.2434589862823486s
2026-03-31T05:08:04.994157+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
Received healthy response to inference request in 2.052698850631714s
Received healthy response to inference request in 1.908337116241455s
Received healthy response to inference request in 2.123020648956299s
Received healthy response to inference request in 1.950202465057373s
Received healthy response to inference request in 1.913586139678955s
Received healthy response to inference request in 1.9312539100646973s
Received healthy response to inference request in 2.229335308074951s
Received healthy response to inference request in 2.4077749252319336s
Received healthy response to inference request in 1.9490950107574463s
Received healthy response to inference request in 2.5561139583587646s
Received healthy response to inference request in 2.120229959487915s
Received healthy response to inference request in 2.041001558303833s
Received healthy response to inference request in 2.4647719860076904s
Received healthy response to inference request in 2.107038974761963s
Received healthy response to inference request in 1.9655699729919434s
Received healthy response to inference request in 2.078338861465454s
30 requests
7 failed requests
5th percentile: 1.921536636352539
10th percentile: 1.947258758544922
20th percentile: 1.9624964714050293
30th percentile: 2.0491896629333497
40th percentile: 2.114953565597534
50th percentile: 2.23639714717865
60th percentile: 2.5013087749481198
70th percentile: 7.787002825736964
80th percentile: 20.104914140701293
90th percentile: 20.14835307598114
95th percentile: 20.156640923023225
99th percentile: 29.173619325160992
mean time: 7.927771131197612
%s, retrying in %s seconds...
Received healthy response to inference request in 2.289978504180908s
Received healthy response to inference request in 1.825779914855957s
Received healthy response to inference request in 1.8187687397003174s
Received healthy response to inference request in 1.8467211723327637s
Received healthy response to inference request in 1.6966824531555176s
Received healthy response to inference request in 2.0548908710479736s
Received healthy response to inference request in 1.9934072494506836s
Received healthy response to inference request in 1.8633642196655273s
Received healthy response to inference request in 1.9503142833709717s
Received healthy response to inference request in 1.9215199947357178s
Received healthy response to inference request in 1.9691145420074463s
Received healthy response to inference request in 2.454536199569702s
2026-03-31T05:09:05.111664+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v4
Received healthy response to inference request in 1.8565223217010498s
Received healthy response to inference request in 2.2022252082824707s
Received healthy response to inference request in 1.965461015701294s
Received healthy response to inference request in 1.995314359664917s
Received healthy response to inference request in 1.9098563194274902s
Received healthy response to inference request in 1.8608062267303467s
Received healthy response to inference request in 1.8952281475067139s
Received healthy response to inference request in 1.8524601459503174s
Received healthy response to inference request in 2.241959571838379s
Received healthy response to inference request in 2.040260076522827s
Received healthy response to inference request in 1.9191100597381592s
Received healthy response to inference request in 2.136960983276367s
Received healthy response to inference request in 2.2641727924346924s
Received healthy response to inference request in 1.9941444396972656s
Received healthy response to inference request in 1.9298017024993896s
Received healthy response to inference request in 2.1305413246154785s
Received healthy response to inference request in 2.1055896282196045s
Received healthy response to inference request in 2.0738115310668945s
30 requests
0 failed requests
5th percentile: 1.8219237685203553
10th percentile: 1.844627046585083
20th percentile: 1.8599494457244874
30th percentile: 1.9054678678512573
40th percentile: 1.9264890193939208
50th percentile: 1.9672877788543701
60th percentile: 1.9946124076843261
70th percentile: 2.0605670690536497
80th percentile: 2.131825256347656
90th percentile: 2.2441808938980103
95th percentile: 2.278365933895111
99th percentile: 2.406814467906952
mean time: 2.001976799964905
Pipeline stage StressChecker completed in 303.42s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.79s
Shutdown handler de-registered
chaiml-pony-v3a-q27b-lr_21575_v4 status is now deployed due to DeploymentManager action
chaiml-pony-v3a-q27b-lr_21575_v4 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v3a-q27b-lr_21575_v4 status is now torndown due to DeploymentManager action