Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-top2-9386-v3-uploader
Waiting for job on chaiml-pony-d3b-mv1-top2-9386-v3-uploader to finish
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: Using quantization_mode: fp8
Failed to get request counts for guanaco-submitter. Falling back to default
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: Checking if ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8-FP8...
2026-03-28T01:23:41.133252+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: Downloaded in 45.258s
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: Processed model ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8 in 47.738s
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default/.gitattributes
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default/config.json
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default/chat_template.jinja
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default/recipe.yaml
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default/generation_config.json
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default/tokenizer_config.json
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default/tokenizer.json
2026-03-28T01:24:41.227507+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
chaiml-pony-d3b-mv1-top2-9386-v3-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v3/default/model.safetensors
Job chaiml-pony-d3b-mv1-top2-9386-v3-uploader completed after 153.69s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-top2-9386-v3-uploader
Pipeline stage VLLMUploader completed in 154.16s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.82s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-top2-9386-v3
Waiting for inference service chaiml-pony-d3b-mv1-top2-9386-v3 to be ready
2026-03-28T01:25:41.755772+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
2026-03-28T01:26:42.011918+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
2026-03-28T01:27:42.188233+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
Inference service chaiml-pony-d3b-mv1-top2-9386-v3 ready after 150.17271733283997s
Pipeline stage VLLMDeployer completed in 150.64s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T01:28:42.300908+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.583249807357788s
Received healthy response to inference request in 1.257978916168213s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T01:29:42.391641+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 10.421149253845215s
Received healthy response to inference request in 1.6713314056396484s
2026-03-28T01:30:42.494408+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
Received healthy response to inference request in 3.4151899814605713s
Received healthy response to inference request in 6.36253809928894s
Received healthy response to inference request in 5.975501537322998s
Received healthy response to inference request in 3.7599117755889893s
Received healthy response to inference request in 1.2716584205627441s
Received healthy response to inference request in 1.2759137153625488s
Received healthy response to inference request in 1.5634765625s
Received healthy response to inference request in 1.292484998703003s
Received healthy response to inference request in 1.2886133193969727s
Received healthy response to inference request in 1.2901811599731445s
Received healthy response to inference request in 1.279212474822998s
Received healthy response to inference request in 1.6350924968719482s
Received healthy response to inference request in 1.3848443031311035s
Received healthy response to inference request in 1.3443078994750977s
Received healthy response to inference request in 1.2940528392791748s
Received healthy response to inference request in 1.3060111999511719s
Received healthy response to inference request in 1.4506206512451172s
Received healthy response to inference request in 1.3189435005187988s
30 requests
8 failed requests
5th percentile: 1.2735733032226562
10th percentile: 1.2788825988769532
20th percentile: 1.2920242309570313
30th percentile: 1.3150638103485108
40th percentile: 1.4243101119995119
50th percentile: 1.6532119512557983
60th percentile: 4.4892469882965065
70th percentile: 7.580121445655811
80th percentile: 20.12186269760132
90th percentile: 20.138047671318056
95th percentile: 20.144449877738953
99th percentile: 20.176133596897124
mean time: 7.284737928708394
%s, retrying in %s seconds...
Received healthy response to inference request in 1.178952693939209s
Received healthy response to inference request in 1.2406220436096191s
Received healthy response to inference request in 1.3429012298583984s
Received healthy response to inference request in 1.5177781581878662s
Received healthy response to inference request in 1.3582022190093994s
Received healthy response to inference request in 1.2594807147979736s
Received healthy response to inference request in 1.3290040493011475s
Received healthy response to inference request in 1.420081615447998s
Received healthy response to inference request in 1.326174259185791s
Received healthy response to inference request in 1.2155041694641113s
Received healthy response to inference request in 1.2462470531463623s
Received healthy response to inference request in 1.4631316661834717s
Received healthy response to inference request in 1.2847840785980225s
Received healthy response to inference request in 1.392824411392212s
2026-03-28T01:31:53.729622+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v3
Received healthy response to inference request in 1.2853279113769531s
Received healthy response to inference request in 1.398407220840454s
Received healthy response to inference request in 1.3843646049499512s
Received healthy response to inference request in 1.2668702602386475s
Received healthy response to inference request in 1.4576666355133057s
Received healthy response to inference request in 2.405714750289917s
Received healthy response to inference request in 1.8329362869262695s
Received healthy response to inference request in 1.2811181545257568s
Received healthy response to inference request in 1.2932662963867188s
Received healthy response to inference request in 1.2972052097320557s
Received healthy response to inference request in 1.2983729839324951s
Received healthy response to inference request in 1.3600077629089355s
Received healthy response to inference request in 1.365433692932129s
Received healthy response to inference request in 1.4033217430114746s
Received healthy response to inference request in 1.8908448219299316s
Received healthy response to inference request in 1.3126790523529053s
30 requests
0 failed requests
5th percentile: 1.22680721282959
10th percentile: 1.245684552192688
20th percentile: 1.278268575668335
30th percentile: 1.290884780883789
40th percentile: 1.3069566249847413
50th percentile: 1.335952639579773
60th percentile: 1.362178134918213
70th percentile: 1.3944992542266845
80th percentile: 1.4275986194610597
90th percentile: 1.549293971061707
95th percentile: 1.8647859811782834
99th percentile: 2.2564024710655217
mean time: 1.403640858332316
Pipeline stage StressChecker completed in 269.81s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.80s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-top2_9386_v3 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-top2_9386_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-top2_9386_v3 status is now torndown due to DeploymentManager action