Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v4-gm4-31b-30120-v1-uploader
Waiting for job on chaiml-pony-v4-gm4-31b-30120-v1-uploader to finish
chaiml-pony-v4-gm4-31b-30120-v1-uploader: Using quantization_mode: none
chaiml-pony-v4-gm4-31b-30120-v1-uploader: Downloading snapshot of ChaiML/pony-v4-gm4-31b-lr5e6ep2g16...
chaiml-pony-v4-gm4-31b-30120-v1-uploader: Downloaded in 33.754s
chaiml-pony-v4-gm4-31b-30120-v1-uploader: Processed model ChaiML/pony-v4-gm4-31b-lr5e6ep2g16 in 36.025s
chaiml-pony-v4-gm4-31b-30120-v1-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v4-gm4-31b-30120-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v4-gm4-31b-30120-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v4-gm4-31b-30120-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v4-gm4-31b-30120-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v4-gm4-31b-30120-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v4-gm4-31b-30120-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v4-gm4-31b-30120-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v4-gm4-31b-30120-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v4-gm4-31b-30120-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v4-gm4-31b-30120-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v4-gm4-31b-30120-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v4-gm4-31b-30120-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v4-gm4-31b-30120-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v4-gm4-31b-30120-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v4-gm4-31b-30120-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v4-gm4-31b-30120-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
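Note on the SyntaxWarning lines above: they are harmless to the upload. Python 3.12 and later emit a SyntaxWarning when an ordinary (non-raw) string literal contains an unrecognized escape such as `\.` or `\w`. The usual remedy is a raw-string literal. The following is a minimal sketch of that remedy, reusing three of the patterns quoted in the warnings; the helper name `find_bucket_name_problems` is hypothetical and is not part of the S3 package shown in the log.

```python
import re

# Raw-string versions of patterns quoted in the warnings above.
# The r"" prefix passes the backslash through to the regex engine unchanged,
# so the compiler no longer sees an invalid string escape.
INVALID_CHARS = re.compile(r"([^a-z0-9\.-])")
DASH_DOT = re.compile(r"-\.")
DOUBLE_DOT = re.compile(r"\.\.")


def find_bucket_name_problems(bucket: str) -> list[str]:
    """Hypothetical helper: list the pattern-based problems found in a bucket name."""
    problems = []
    if INVALID_CHARS.search(bucket):
        problems.append("invalid character")
    if DASH_DOT.search(bucket):
        problems.append("dash followed by dot")
    if DOUBLE_DOT.search(bucket):
        problems.append("two consecutive dots")
    return problems
```

With these definitions, `find_bucket_name_problems("guanaco-vllm-models")` returns an empty list, and the module compiles without escape-sequence warnings.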
chaiml-pony-v4-gm4-31b-30120-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v4-gm4-31b-30120-v1-uploader: uploading /tmp/model_output to s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/chat_template.jinja
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/tokenizer_config.json
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/args.json s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/args.json
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model.safetensors.index.json
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/generation_config.json
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/.gitattributes
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/config.json
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/processor_config.json s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/processor_config.json
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/tokenizer.json
2026-04-14T15:09:03.921167+00:00 monitor updated for chaiml-pony-v4-gm4-31b-_30120_v1
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00013-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00013-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00004-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00004-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00007-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00007-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00011-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00011-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00010-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00010-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00005-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00005-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00009-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00009-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00003-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00003-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00012-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00012-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00001-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00001-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00002-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00002-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00006-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00006-of-00013.safetensors
chaiml-pony-v4-gm4-31b-30120-v1-uploader: cp /tmp/model_output/model-00008-of-00013.safetensors s3://guanaco-vllm-models/chaiml-pony-v4-gm4-31b-30120-v1/default/model-00008-of-00013.safetensors
Job chaiml-pony-v4-gm4-31b-30120-v1-uploader completed after 63.16s with status: succeeded
Stopping job with name chaiml-pony-v4-gm4-31b-30120-v1-uploader
Pipeline stage VLLMUploader completed in 63.89s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.10s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 5.02s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v4-gm4-31b-30120-v1
Waiting for inference service chaiml-pony-v4-gm4-31b-30120-v1 to be ready
2026-04-14T15:10:04.066125+00:00 monitor updated for chaiml-pony-v4-gm4-31b-_30120_v1
2026-04-14T15:11:04.199980+00:00 monitor updated for chaiml-pony-v4-gm4-31b-_30120_v1
2026-04-14T15:12:04.305883+00:00 monitor updated for chaiml-pony-v4-gm4-31b-_30120_v1
Inference service chaiml-pony-v4-gm4-31b-30120-v1 ready after 210.29472303390503s
Pipeline stage VLLMDeployer completed in 210.84s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 11.447071313858032s
Received healthy response to inference request in 2.3362209796905518s
2026-04-14T15:13:04.401804+00:00 monitor updated for chaiml-pony-v4-gm4-31b-_30120_v1
Received healthy response to inference request in 10.216003179550171s
Received healthy response to inference request in 10.632049083709717s
Received healthy response to inference request in 2.2654573917388916s
Received healthy response to inference request in 11.808109045028687s
Received healthy response to inference request in 2.449577808380127s
Received healthy response to inference request in 9.420896291732788s
Received healthy response to inference request in 2.448676109313965s
Received healthy response to inference request in 2.2697579860687256s
Received healthy response to inference request in 2.749323606491089s
Received healthy response to inference request in 2.3039727210998535s
Received healthy response to inference request in 2.55428409576416s
Received healthy response to inference request in 3.6641194820404053s
2026-04-14T15:14:04.532754+00:00 monitor updated for chaiml-pony-v4-gm4-31b-_30120_v1
Received healthy response to inference request in 2.294494867324829s
Received healthy response to inference request in 2.4095206260681152s
Received healthy response to inference request in 2.556274652481079s
Received healthy response to inference request in 3.561695098876953s
Received healthy response to inference request in 2.3602283000946045s
Received healthy response to inference request in 2.555020570755005s
Received healthy response to inference request in 2.5291788578033447s
Received healthy response to inference request in 2.5091028213500977s
Received healthy response to inference request in 2.3328282833099365s
Received healthy response to inference request in 2.3139824867248535s
Received healthy response to inference request in 2.6970179080963135s
Received healthy response to inference request in 2.3704612255096436s
Received healthy response to inference request in 2.4725663661956787s
Received healthy response to inference request in 2.309882640838623s
Received healthy response to inference request in 2.417517900466919s
Received healthy response to inference request in 2.378901243209839s
30 requests
0 failed requests
5th percentile: 2.2808895826339723
10th percentile: 2.303024935722351
20th percentile: 2.32905912399292
30th percentile: 2.367391347885132
40th percentile: 2.4143189907073976
50th percentile: 2.461072087287903
60th percentile: 2.539220952987671
70th percentile: 2.598497629165649
80th percentile: 3.5821799755096437
90th percentile: 10.257607769966127
95th percentile: 11.080311310291288
99th percentile: 11.703408102989197
mean time: 3.887806431452433
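Note on the summary statistics above: the log does not say how StressChecker computes its percentiles. The printed values are consistent with linear interpolation between sorted ranks (the default behaviour of `numpy.percentile`). The sketch below assumes that method and recomputes a few of the figures from the 30 latencies logged above; the names `LATENCIES` and `percentile` are illustrative and are not taken from the pipeline's code.

```python
import math

# Latencies in seconds, copied from the "Received healthy response" lines above.
LATENCIES = [
    11.447071313858032, 2.3362209796905518, 10.216003179550171,
    10.632049083709717, 2.2654573917388916, 11.808109045028687,
    2.449577808380127, 9.420896291732788, 2.448676109313965,
    2.2697579860687256, 2.749323606491089, 2.3039727210998535,
    2.55428409576416, 3.6641194820404053, 2.294494867324829,
    2.4095206260681152, 2.556274652481079, 3.561695098876953,
    2.3602283000946045, 2.555020570755005, 2.5291788578033447,
    2.5091028213500977, 2.3328282833099365, 2.3139824867248535,
    2.6970179080963135, 2.3704612255096436, 2.4725663661956787,
    2.309882640838623, 2.417517900466919, 2.378901243209839,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) by linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean = sum(LATENCIES) / len(LATENCIES)
for q in (5, 50, 90, 95, 99):
    print(f"{q}th percentile: {percentile(LATENCIES, q)}")
print(f"mean time: {mean}")
```

The gap between the median (about 2.46 s) and the 90th percentile (about 10.26 s) comes from the five requests that took more than 9 s. The same five requests pull the mean (about 3.89 s) well above the median.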
Pipeline stage StressChecker completed in 119.21s
Shutdown handler de-registered
chaiml-pony-v4-gm4-31b-_30120_v1 status is now deployed due to DeploymentManager action
chaiml-pony-v4-gm4-31b-_30120_v1 status is now inactive due to auto deactivation removed underperforming models