Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3-q27b-lr5-22882-v3-uploader
Waiting for job on chaiml-pony-v3-q27b-lr5-22882-v3-uploader to finish
Unable to record family friendly update due to error: Invalid JSON input: Expecting value: line 1 column 1 (char 0)
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: Using quantization_mode: fp8
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: Checking if ChaiML/pony-v3-q27b-lr5e6ep1g8-FP8 already exists in ChaiML
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: Downloading snapshot of ChaiML/pony-v3-q27b-lr5e6ep1g8-FP8...
2026-03-28T10:49:01.547120+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: Downloaded in 31.823s
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: Processed model ChaiML/pony-v3-q27b-lr5e6ep1g8 in 34.290s
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default/generation_config.json
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default/.gitattributes
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default/recipe.yaml
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default/tokenizer_config.json
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default/chat_template.jinja
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default/config.json
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default/tokenizer.json
2026-03-28T10:50:01.643632+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
chaiml-pony-v3-q27b-lr5-22882-v3-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v3/default/model.safetensors
Job chaiml-pony-v3-q27b-lr5-22882-v3-uploader completed after 144.0s with status: succeeded
Stopping job with name chaiml-pony-v3-q27b-lr5-22882-v3-uploader
Pipeline stage VLLMUploader completed in 144.51s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.31s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3-q27b-lr5-22882-v3
Waiting for inference service chaiml-pony-v3-q27b-lr5-22882-v3 to be ready
2026-03-28T10:51:01.733091+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
2026-03-28T10:52:01.879226+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
2026-03-28T10:53:01.994761+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
Inference service chaiml-pony-v3-q27b-lr5-22882-v3 ready after 190.6836655139923s
Pipeline stage VLLMDeployer completed in 191.11s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T10:54:02.091731+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Failed to get response for submission chaiml-gspo-glm47-chai-_76408_v1: ('http://chaiml-gspo-glm47-chai-76408-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T10:55:02.196112+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
Received healthy response to inference request in 11.796586275100708s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.5890262126922607s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.0079615116119385s
2026-03-28T10:56:02.344979+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.0293657779693604s
Received healthy response to inference request in 2.039592742919922s
Received healthy response to inference request in 2.4617316722869873s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.969226837158203s
Received healthy response to inference request in 1.9911823272705078s
Received healthy response to inference request in 1.8246238231658936s
Received healthy response to inference request in 2.0673670768737793s
Received healthy response to inference request in 1.9074325561523438s
Received healthy response to inference request in 2.2193682193756104s
Received healthy response to inference request in 2.0418548583984375s
Received healthy response to inference request in 2.6643481254577637s
2026-03-28T10:57:02.467134+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
Received healthy response to inference request in 1.9537549018859863s
Received healthy response to inference request in 1.9696934223175049s
Received healthy response to inference request in 2.0460402965545654s
Received healthy response to inference request in 2.3210554122924805s
Received healthy response to inference request in 2.184772253036499s
Received healthy response to inference request in 2.121777057647705s
Received healthy response to inference request in 2.0974133014678955s
Received healthy response to inference request in 2.0508899688720703s
30 requests
8 failed requests
5th percentile: 1.928277611732483
10th percentile: 1.9680995702743531
20th percentile: 2.025084924697876
30th percentile: 2.044784665107727
40th percentile: 2.085394811630249
50th percentile: 2.2020702362060547
60th percentile: 2.5126494884490964
70th percentile: 5.6174346685409295
80th percentile: 20.140903520584107
90th percentile: 20.15069122314453
95th percentile: 20.151762545108795
99th percentile: 20.252611815929413
mean time: 7.288442047437032
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9279260635375977s
Received healthy response to inference request in 1.7688655853271484s
Received healthy response to inference request in 2.067963123321533s
Received healthy response to inference request in 1.8454663753509521s
Received healthy response to inference request in 1.8947186470031738s
Received healthy response to inference request in 1.8728857040405273s
Received healthy response to inference request in 1.8420562744140625s
Received healthy response to inference request in 1.9577767848968506s
Received healthy response to inference request in 2.315826177597046s
Received healthy response to inference request in 1.9314327239990234s
Received healthy response to inference request in 1.9603817462921143s
Received healthy response to inference request in 2.111626386642456s
Failed to get request counts for guanaco-submitter. Falling back to default
Received healthy response to inference request in 1.9080407619476318s
Received healthy response to inference request in 1.8255267143249512s
Received healthy response to inference request in 1.963179111480713s
Received healthy response to inference request in 1.9437310695648193s
Received healthy response to inference request in 1.8770036697387695s
Received healthy response to inference request in 2.379216194152832s
Received healthy response to inference request in 1.9142444133758545s
Received healthy response to inference request in 2.0744571685791016s
Received healthy response to inference request in 2.0652546882629395s
2026-03-28T10:58:02.568381+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v3
Received healthy response to inference request in 2.0948832035064697s
Received healthy response to inference request in 1.9313597679138184s
Received healthy response to inference request in 1.9967775344848633s
Received healthy response to inference request in 2.063298463821411s
Received healthy response to inference request in 2.025479316711426s
Received healthy response to inference request in 1.972175121307373s
Received healthy response to inference request in 2.222087860107422s
Received healthy response to inference request in 2.610930919647217s
Received healthy response to inference request in 1.998084306716919s
30 requests
0 failed requests
5th percentile: 1.8329650163650513
10th percentile: 1.8451253652572632
20th percentile: 1.8911756515502929
30th percentile: 1.9238215684890747
40th percentile: 1.938811731338501
50th percentile: 1.9617804288864136
60th percentile: 1.9973002433776856
70th percentile: 2.0638853311538696
80th percentile: 2.078542375564575
90th percentile: 2.2314616918563845
95th percentile: 2.350690686702728
99th percentile: 2.5437336492538454
mean time: 2.0120885292689006
Pipeline stage StressChecker completed in 285.02s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.74s
Shutdown handler de-registered
chaiml-pony-v3-q27b-lr5_22882_v3 status is now deployed due to DeploymentManager action
chaiml-pony-v3-q27b-lr5_22882_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v3-q27b-lr5_22882_v3 status is now torndown due to DeploymentManager action