Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3-q27b-lr5-22882-v5-uploader
Waiting for job on chaiml-pony-v3-q27b-lr5-22882-v5-uploader to finish
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: Using quantization_mode: fp8
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: Checking if ChaiML/pony-v3-q27b-lr5e6ep1g8-FP8 already exists in ChaiML
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: Downloading snapshot of ChaiML/pony-v3-q27b-lr5e6ep1g8-FP8...
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: Downloaded in 25.162s
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: Processed model ChaiML/pony-v3-q27b-lr5e6ep1g8 in 27.633s
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default/.gitattributes
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default/recipe.yaml
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default/generation_config.json
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default/config.json
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default/chat_template.jinja
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default/tokenizer_config.json
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default/tokenizer.json
2026-03-28T10:49:26.609697+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
chaiml-pony-v3-q27b-lr5-22882-v5-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v5/default/model.safetensors
Job chaiml-pony-v3-q27b-lr5-22882-v5-uploader completed after 113.24s with status: succeeded
Stopping job with name chaiml-pony-v3-q27b-lr5-22882-v5-uploader
Pipeline stage VLLMUploader completed in 113.92s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.19s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.19s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3-q27b-lr5-22882-v5
Waiting for inference service chaiml-pony-v3-q27b-lr5-22882-v5 to be ready
2026-03-28T10:50:26.701090+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
2026-03-28T10:51:26.798445+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
Failed to get response for submission chaiml-gspo-glm47-combi_10268_v1: ('http://chaiml-gspo-glm47-combi-10268-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-28T10:52:26.904154+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
Inference service chaiml-pony-v3-q27b-lr5-22882-v5 ready after 180.49115109443665s
Pipeline stage VLLMDeployer completed in 180.90s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-28T10:53:27.048923+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T10:54:27.156906+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.159894943237305s
Received healthy response to inference request in 2.259557008743286s
2026-03-28T10:55:27.244077+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 7.505593299865723s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.132624387741089s
{"detail":"('http://chaiml-pony-v3-q27b-lr5-22882-v5-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'upstream connect error or disconnect/reset before headers. reset reason: connection termination')"}
Received unhealthy response to inference request!
Received healthy response to inference request in 1.8746552467346191s
Received healthy response to inference request in 4.253622770309448s
Received healthy response to inference request in 2.055887460708618s
Received healthy response to inference request in 2.121082067489624s
Received healthy response to inference request in 1.8573651313781738s
Received healthy response to inference request in 1.8962016105651855s
Received healthy response to inference request in 2.269896984100342s
2026-03-28T10:56:27.360788+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
Received healthy response to inference request in 1.845470905303955s
Received healthy response to inference request in 2.5011305809020996s
Received healthy response to inference request in 3.0774219036102295s
Received healthy response to inference request in 1.9901297092437744s
Received healthy response to inference request in 2.545943260192871s
Received healthy response to inference request in 2.554232597351074s
Received healthy response to inference request in 1.9477813243865967s
Received healthy response to inference request in 2.348682165145874s
Received healthy response to inference request in 1.9526877403259277s
Received healthy response to inference request in 2.004051685333252s
Received healthy response to inference request in 2.138624429702759s
30 requests
8 failed requests
5th percentile: 1.8651456832885742
10th percentile: 1.8940469741821289
20th percentile: 1.9826413154602052
30th percentile: 2.101523685455322
40th percentile: 2.2657609939575196
50th percentile: 2.5235369205474854
60th percentile: 3.499502897262572
70th percentile: 4.941198468208307
80th percentile: 20.12583236694336
90th percentile: 20.23867349624634
95th percentile: 20.312811970710754
99th percentile: 20.669794726371766
mean time: 6.931692576408386
%s, retrying in %s seconds...
Received healthy response to inference request in 1.7100417613983154s
Received healthy response to inference request in 1.777595043182373s
Received healthy response to inference request in 2.3917524814605713s
Received healthy response to inference request in 2.031334161758423s
Received healthy response to inference request in 2.001732349395752s
Received healthy response to inference request in 1.8591227531433105s
Received healthy response to inference request in 1.9281666278839111s
Received healthy response to inference request in 2.2791783809661865s
Received healthy response to inference request in 2.0028650760650635s
Received healthy response to inference request in 2.1868178844451904s
Received healthy response to inference request in 1.8404722213745117s
Received healthy response to inference request in 1.838785171508789s
Received healthy response to inference request in 1.4972419738769531s
Received healthy response to inference request in 1.9044215679168701s
Received healthy response to inference request in 2.1820218563079834s
Received healthy response to inference request in 2.2255051136016846s
2026-03-28T10:57:27.588376+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v5
Received healthy response to inference request in 2.195605516433716s
Received healthy response to inference request in 1.9661498069763184s
Received healthy response to inference request in 1.9812273979187012s
Received healthy response to inference request in 2.145644187927246s
Received healthy response to inference request in 2.1185922622680664s
Received healthy response to inference request in 1.9519686698913574s
Received healthy response to inference request in 2.2736082077026367s
Received healthy response to inference request in 1.9841969013214111s
Received healthy response to inference request in 1.8092048168182373s
Received healthy response to inference request in 2.027086019515991s
Received healthy response to inference request in 2.04327130317688s
Received healthy response to inference request in 1.9929206371307373s
Received healthy response to inference request in 2.9129014015197754s
Received healthy response to inference request in 2.3202908039093018s
30 requests
0 failed requests
5th percentile: 1.7404407382011413
10th percentile: 1.806043839454651
20th percentile: 1.8553926467895507
30th percentile: 1.9448280572891234
40th percentile: 1.9830090999603271
50th percentile: 2.0022987127304077
60th percentile: 2.0361090183258055
70th percentile: 2.156557488441467
80th percentile: 2.2015854358673095
90th percentile: 2.2832896232604982
95th percentile: 2.3595947265625
99th percentile: 2.761768214702607
mean time: 2.045990745226542
Pipeline stage StressChecker completed in 275.70s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.35s
Shutdown handler de-registered
chaiml-pony-v3-q27b-lr5_22882_v5 status is now deployed due to DeploymentManager action
chaiml-pony-v3-q27b-lr5_22882_v5 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v3-q27b-lr5_22882_v5 status is now torndown due to DeploymentManager action