Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3-q27b-lr5-64169-v2-uploader
Waiting for job on chaiml-pony-v3-q27b-lr5-64169-v2-uploader to finish
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: Using quantization_mode: fp8
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: Checking if ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle-FP8 already exists in ChaiML
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: Downloading snapshot of ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle-FP8...
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: Downloaded in 35.959s
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: Processed model ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle in 39.098s
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default/.gitattributes
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default/generation_config.json
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default/tokenizer_config.json
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default/config.json
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default/chat_template.jinja
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default/recipe.yaml
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default/tokenizer.json
2026-03-31T05:00:38.065442+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
chaiml-pony-v3-q27b-lr5-64169-v2-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v2/default/model.safetensors
Job chaiml-pony-v3-q27b-lr5-64169-v2-uploader completed after 112.76s with status: succeeded
Stopping job with name chaiml-pony-v3-q27b-lr5-64169-v2-uploader
Pipeline stage VLLMUploader completed in 113.20s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.10s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Failed to get request counts for guanaco-submitter. Falling back to default
Pipeline stage VLLMTemplater completed in 4.74s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3-q27b-lr5-64169-v2
Waiting for inference service chaiml-pony-v3-q27b-lr5-64169-v2 to be ready
2026-03-31T05:01:38.153765+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
2026-03-31T05:02:38.244317+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
2026-03-31T05:03:38.336527+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
Inference service chaiml-pony-v3-q27b-lr5-64169-v2 ready after 181.20902252197266s
Pipeline stage VLLMDeployer completed in 181.64s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-31T05:04:38.434333+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-31T05:05:38.637361+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.808162450790405s
Received healthy response to inference request in 2.124208450317383s
2026-03-31T05:06:38.736840+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.152536869049072s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.130697011947632s
Received healthy response to inference request in 1.9007904529571533s
Received healthy response to inference request in 1.8603804111480713s
Received healthy response to inference request in 2.03655743598938s
Received healthy response to inference request in 1.908104658126831s
2026-03-31T05:07:38.826845+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
{"detail":"('http://chaiml-pony-v3-q27b-lr5-64169-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'upstream connect error or disconnect/reset before headers. reset reason: connection termination')"}
Received unhealthy response to inference request!
Received healthy response to inference request in 4.196499586105347s
Received healthy response to inference request in 1.966113805770874s
Received healthy response to inference request in 2.142585515975952s
Received healthy response to inference request in 1.9689738750457764s
Received healthy response to inference request in 2.070551872253418s
Received healthy response to inference request in 2.0822362899780273s
Received healthy response to inference request in 2.2362921237945557s
Received healthy response to inference request in 2.5340049266815186s
Received healthy response to inference request in 2.0085556507110596s
Received healthy response to inference request in 2.0055251121520996s
Received healthy response to inference request in 2.026087999343872s
Received healthy response to inference request in 2.020052194595337s
Received healthy response to inference request in 2.0381786823272705s
Received healthy response to inference request in 2.3064327239990234s
30 requests
8 failed requests
5th percentile: 1.9040818452835082
10th percentile: 1.9603128910064698
20th percentile: 2.0079495429992678
30th percentile: 2.0334166049957276
40th percentile: 2.0775625228881838
50th percentile: 2.136641263961792
60th percentile: 2.397461605072021
70th percentile: 4.3799984455108625
80th percentile: 20.1185001373291
90th percentile: 20.134312558174134
95th percentile: 20.476947844028473
99th percentile: 27.97474704027177
mean time: 7.261261494954427
%s, retrying in %s seconds...
Received healthy response to inference request in 1.6665396690368652s
Received healthy response to inference request in 1.732008934020996s
Received healthy response to inference request in 1.8361563682556152s
Received healthy response to inference request in 1.9227089881896973s
Received healthy response to inference request in 2.0798397064208984s
Received healthy response to inference request in 1.8866767883300781s
Received healthy response to inference request in 2.379333734512329s
Received healthy response to inference request in 1.8548145294189453s
Received healthy response to inference request in 2.155494213104248s
2026-03-31T05:08:39.085194+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v2
Received healthy response to inference request in 2.3126413822174072s
Received healthy response to inference request in 1.8912620544433594s
Received healthy response to inference request in 1.8693022727966309s
Received healthy response to inference request in 2.098677158355713s
Received healthy response to inference request in 1.8646461963653564s
Received healthy response to inference request in 2.3853094577789307s
Received healthy response to inference request in 1.891453742980957s
Received healthy response to inference request in 2.6345913410186768s
Received healthy response to inference request in 1.8793134689331055s
Received healthy response to inference request in 2.623957395553589s
Received healthy response to inference request in 2.1595983505249023s
Received healthy response to inference request in 1.9889979362487793s
Received healthy response to inference request in 1.8184361457824707s
Received healthy response to inference request in 2.629451274871826s
Received healthy response to inference request in 1.9867956638336182s
Received healthy response to inference request in 2.0594630241394043s
Received healthy response to inference request in 2.4538564682006836s
Received healthy response to inference request in 2.1377744674682617s
Received healthy response to inference request in 1.9857792854309082s
Received healthy response to inference request in 2.1750190258026123s
Received healthy response to inference request in 1.9818952083587646s
30 requests
0 failed requests
5th percentile: 1.7709011793136598
10th percentile: 1.8343843460083007
20th percentile: 1.868371057510376
30th percentile: 1.889886474609375
40th percentile: 1.9582207202911377
50th percentile: 1.9878968000411987
60th percentile: 2.087374687194824
70th percentile: 2.156725454330444
80th percentile: 2.325979852676392
90th percentile: 2.4708665609359746
95th percentile: 2.6269790291786195
99th percentile: 2.6331007218360902
mean time: 2.0780598084131876
Pipeline stage StressChecker completed in 287.01s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.83s
Shutdown handler de-registered
chaiml-pony-v3-q27b-lr5_64169_v2 status is now deployed due to DeploymentManager action
chaiml-pony-v3-q27b-lr5_64169_v2 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v3-q27b-lr5_64169_v2 status is now torndown due to DeploymentManager action