Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-son-75599-v3-uploader
Waiting for job on chaiml-pony-d3a-mv1-son-75599-v3-uploader to finish
chaiml-pony-d3a-mv1-son-75599-v3-uploader: Using quantization_mode: fp8
chaiml-pony-d3a-mv1-son-75599-v3-uploader: Checking if ChaiML/pony-d3a-mv1-sonnetwintop2-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-son-75599-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3a-mv1-son-75599-v3-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-sonnetwintop2-q35b-lr5e6ep2g8-FP8...
2026-03-28T06:29:53.080743+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
chaiml-pony-d3a-mv1-son-75599-v3-uploader: Downloaded in 36.582s
chaiml-pony-d3a-mv1-son-75599-v3-uploader: Processed model ChaiML/pony-d3a-mv1-sonnetwintop2-q35b-lr5e6ep2g8 in 39.112s
chaiml-pony-d3a-mv1-son-75599-v3-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-son-75599-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-son-75599-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-son-75599-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-son-75599-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-son-75599-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-son-75599-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-son-75599-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-son-75599-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-son-75599-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-son-75599-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-son-75599-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-son-75599-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-son-75599-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default
chaiml-pony-d3a-mv1-son-75599-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default/.gitattributes
chaiml-pony-d3a-mv1-son-75599-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default/generation_config.json
chaiml-pony-d3a-mv1-son-75599-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default/config.json
chaiml-pony-d3a-mv1-son-75599-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default/tokenizer_config.json
chaiml-pony-d3a-mv1-son-75599-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default/chat_template.jinja
chaiml-pony-d3a-mv1-son-75599-v3-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default/recipe.yaml
chaiml-pony-d3a-mv1-son-75599-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default/tokenizer.json
2026-03-28T06:30:53.167449+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
chaiml-pony-d3a-mv1-son-75599-v3-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v3/default/model.safetensors
Job chaiml-pony-d3a-mv1-son-75599-v3-uploader completed after 163.7s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-son-75599-v3-uploader
Pipeline stage VLLMUploader completed in 164.20s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.65s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 3.99s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-son-75599-v3
Waiting for inference service chaiml-pony-d3a-mv1-son-75599-v3 to be ready
2026-03-28T06:31:53.356057+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
2026-03-28T06:32:53.479937+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
2026-03-28T06:33:53.570708+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
2026-03-28T06:34:53.675913+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
Inference service chaiml-pony-d3a-mv1-son-75599-v3 ready after 201.49155497550964s
Pipeline stage VLLMDeployer completed in 204.31s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T06:35:53.778840+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T06:36:53.871782+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.8635928630828857s
Received healthy response to inference request in 7.855092525482178s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.310424566268921s
Received healthy response to inference request in 1.2636373043060303s
2026-03-28T06:37:53.989280+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.1159634590148926s
Received healthy response to inference request in 1.116868019104004s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.2857568264007568s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.219907522201538s
2026-03-28T06:38:54.457126+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
Received healthy response to inference request in 5.427231550216675s
Received healthy response to inference request in 1.211944341659546s
Received healthy response to inference request in 1.0893526077270508s
Received healthy response to inference request in 1.1373653411865234s
Received healthy response to inference request in 1.190887689590454s
Received healthy response to inference request in 1.2940638065338135s
Received healthy response to inference request in 1.0807428359985352s
Received healthy response to inference request in 1.1631531715393066s
Received healthy response to inference request in 1.08024263381958s
Received healthy response to inference request in 1.0695645809173584s
Received healthy response to inference request in 1.1603028774261475s
Received healthy response to inference request in 1.1252057552337646s
30 requests
10 failed requests
5th percentile: 1.0804677248001098
10th percentile: 1.0884916305541992
20th percentile: 1.1235382080078125
30th percentile: 1.1622980833053589
40th percentile: 1.2167222499847412
50th percentile: 1.2899103164672852
60th percentile: 5.357147359848022
70th percentile: 20.12297942638397
80th percentile: 20.13007879257202
90th percentile: 20.135573863983154
95th percentile: 20.14250245094299
99th percentile: 20.19291339635849
mean time: 8.082138975461325
%s, retrying in %s seconds...
Received healthy response to inference request in 1.4487090110778809s
Received healthy response to inference request in 1.4162414073944092s
Received healthy response to inference request in 1.026552438735962s
Received healthy response to inference request in 1.225508213043213s
Received healthy response to inference request in 1.029144287109375s
Received healthy response to inference request in 1.1009509563446045s
Received healthy response to inference request in 1.1584641933441162s
Received healthy response to inference request in 1.074366807937622s
Received healthy response to inference request in 1.4046297073364258s
Received healthy response to inference request in 1.2134058475494385s
Received healthy response to inference request in 1.1688313484191895s
Received healthy response to inference request in 1.11698317527771s
Received healthy response to inference request in 1.5948200225830078s
Received healthy response to inference request in 1.2648422718048096s
Received healthy response to inference request in 1.0835719108581543s
Received healthy response to inference request in 1.0716371536254883s
Received healthy response to inference request in 1.1543574333190918s
Received healthy response to inference request in 1.0796818733215332s
Received healthy response to inference request in 1.2212836742401123s
Received healthy response to inference request in 1.5781562328338623s
Received healthy response to inference request in 1.0759923458099365s
Received healthy response to inference request in 1.1336643695831299s
Received healthy response to inference request in 1.497248888015747s
Received healthy response to inference request in 1.1707453727722168s
Received healthy response to inference request in 1.3212542533874512s
Received healthy response to inference request in 1.0937349796295166s
Received healthy response to inference request in 1.6942694187164307s
Received healthy response to inference request in 1.2532720565795898s
Received healthy response to inference request in 1.2870330810546875s
Received healthy response to inference request in 1.1919512748718262s
30 requests
0 failed requests
5th percentile: 1.0482660770416259
10th percentile: 1.0740938425064086
20th percentile: 1.08279390335083
30th percentile: 1.1121735095977783
40th percentile: 1.1568214893341064
50th percentile: 1.1813483238220215
60th percentile: 1.2229734897613525
70th percentile: 1.271499514579773
80th percentile: 1.4069520473480226
90th percentile: 1.5053396224975588
95th percentile: 1.5873213171958922
99th percentile: 1.665429093837738
mean time: 1.238376800219218
Pipeline stage StressChecker completed in 286.38s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
2026-03-28T06:39:54.917682+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v3
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.16s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-son_75599_v3 status is now deployed due to DeploymentManager action
chaiml-pony-d3a-mv1-son_75599_v3 status is now inactive due to admin request