Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d2-q235b-pv-18495-v3-uploader
Waiting for job on chaiml-pony-d2-q235b-pv-18495-v3-uploader to finish
chaiml-pony-d2-q235b-pv-18495-v3-uploader: Using quantization_mode: w4a16
chaiml-pony-d2-q235b-pv-18495-v3-uploader: Checking if ChaiML/pony-d2-q235b-pv1-lr5e6ep2r64g4-W4A16 already exists in ChaiML
chaiml-pony-d2-q235b-pv-18495-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d2-q235b-pv-18495-v3-uploader: Downloading snapshot of ChaiML/pony-d2-q235b-pv1-lr5e6ep2r64g4-W4A16...
chaiml-pony-d2-q235b-pv-18495-v3-uploader: Downloaded in 48.398s
chaiml-pony-d2-q235b-pv-18495-v3-uploader: Processed model ChaiML/pony-d2-q235b-pv1-lr5e6ep2r64g4 in 48.930s
chaiml-pony-d2-q235b-pv-18495-v3-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d2-q235b-pv-18495-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-18495-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d2-q235b-pv-18495-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d2-q235b-pv-18495-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d2-q235b-pv-18495-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-18495-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d2-q235b-pv-18495-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-18495-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d2-q235b-pv-18495-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-18495-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d2-q235b-pv-18495-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-18495-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d2-q235b-pv-18495-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d2-q235b-pv-18495-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d2-q235b-pv-18495-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d2-q235b-pv-18495-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
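Note on the SyntaxWarning lines above: they come from regex patterns in the installed S3 package (apparently s3cmd) that are written as ordinary string literals, so Python flags backslash sequences such as `\.` and `\w` as invalid string escapes. The warnings do not stop the upload, as the following lines show. The usual remedy is a raw string literal. The snippet below is a minimal sketch of that remedy using two patterns quoted in the warnings; the constant names are hypothetical and this is not a patch to the package itself.

```python
import re

# Raw-string equivalents of two patterns quoted in the warnings above.
# The r"" prefix keeps the backslashes for the regex engine, so Python
# no longer tries to interpret "\." or "\w" as string escapes.
# Constant names are illustrative only, not taken from the S3 package.
BUCKET_INVALID_CHAR = re.compile(r"([^a-z0-9\.-])", re.UNICODE)
URI_SCHEME = re.compile(r"^(\w+://)?(.*)", re.UNICODE)

# The bucket name from this log contains only lowercase letters and hyphens.
print(BUCKET_INVALID_CHAR.search("guanaco-vllm-models"))

# Split the bucket URI from this log into scheme and remainder.
print(URI_SCHEME.match("s3://guanaco-vllm-models/").groups())
```

The compiled patterns behave the same as the originals; only the literal syntax changes.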
chaiml-pony-d2-q235b-pv-18495-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d2-q235b-pv-18495-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/.gitattributes
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/chat_template.jinja
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/added_tokens.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/tokenizer_config.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/quantization_config.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/generation_config.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/config.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/special_tokens_map.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/merges.txt
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/vocab.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model.safetensors.index.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/tokenizer.json
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00027-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00017-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00004-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00001-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00021-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00015-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00010-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00014-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00011-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00022-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00013-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00003-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00019-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00005-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00008-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00026-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00023-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00024-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00006-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00025-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00009-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00020-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00018-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00012-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00016-of-00027.safetensors
chaiml-pony-d2-q235b-pv-18495-v3-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-18495-v3/default/model-00007-of-00027.safetensors
Job chaiml-pony-d2-q235b-pv-18495-v3-uploader completed after 148.44s with status: succeeded
Stopping job with name chaiml-pony-d2-q235b-pv-18495-v3-uploader
Pipeline stage VLLMUploader completed in 150.11s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.56s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d2-q235b-pv-18495-v3
Waiting for inference service chaiml-pony-d2-q235b-pv-18495-v3 to be ready
Inference service chaiml-pony-d2-q235b-pv-18495-v3 ready after 501.49575328826904s
Pipeline stage VLLMDeployer completed in 502.05s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.2553155422210693s
Received healthy response to inference request in 1.8654444217681885s
Received healthy response to inference request in 2.254574775695801s
Received healthy response to inference request in 2.232745409011841s
Received healthy response to inference request in 1.9736294746398926s
Received healthy response to inference request in 1.8665504455566406s
Received healthy response to inference request in 1.9592671394348145s
Received healthy response to inference request in 1.8647475242614746s
Received healthy response to inference request in 1.9421536922454834s
Received healthy response to inference request in 1.922365427017212s
Received healthy response to inference request in 1.8713932037353516s
Received healthy response to inference request in 2.001941204071045s
Received healthy response to inference request in 2.063560724258423s
Received healthy response to inference request in 1.980093002319336s
Received healthy response to inference request in 1.8433096408843994s
Received healthy response to inference request in 1.984461784362793s
Received healthy response to inference request in 2.2784881591796875s
Received healthy response to inference request in 2.1441197395324707s
Received healthy response to inference request in 1.9457509517669678s
Received healthy response to inference request in 1.9888396263122559s
Received healthy response to inference request in 1.912337303161621s
Received healthy response to inference request in 1.9669759273529053s
Received healthy response to inference request in 2.085549831390381s
Received healthy response to inference request in 1.9108097553253174s
Received healthy response to inference request in 1.9291839599609375s
Received healthy response to inference request in 1.9678840637207031s
Received healthy response to inference request in 1.8995976448059082s
Received healthy response to inference request in 2.147432565689087s
Received healthy response to inference request in 1.9183194637298584s
Received healthy response to inference request in 1.9695155620574951s
30 requests
0 failed requests
5th percentile: 1.8650611281394958
10th percentile: 1.8664398431777953
20th percentile: 1.9085673332214355
30th percentile: 1.9211516380310059
40th percentile: 1.944312047958374
50th percentile: 1.9674299955368042
60th percentile: 1.97621488571167
70th percentile: 1.9927700996398925
80th percentile: 2.097263813018799
90th percentile: 2.234928345680237
95th percentile: 2.2549821972846984
99th percentile: 2.271768100261688
mean time: 1.998211932182312
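Note on the percentile figures above: they are consistent with linear interpolation between the sorted latency samples, which is the default behaviour of `numpy.percentile`. The log does not say how StressChecker computes them, so the following is a sketch under that assumption, not the pipeline's own code. The `percentile` helper is hypothetical and the `latencies` list is copied from the 30 response times logged above.

```python
import math

def percentile(samples, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Fractional rank into the sorted list; q=0 is the minimum, q=100 the maximum.
    rank = (q / 100) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])

# Response times in seconds, copied from the StressChecker lines in this log.
latencies = [
    2.2553155422210693, 1.8654444217681885, 2.254574775695801,
    2.232745409011841, 1.9736294746398926, 1.8665504455566406,
    1.9592671394348145, 1.8647475242614746, 1.9421536922454834,
    1.922365427017212, 1.8713932037353516, 2.001941204071045,
    2.063560724258423, 1.980093002319336, 1.8433096408843994,
    1.984461784362793, 2.2784881591796875, 2.1441197395324707,
    1.9457509517669678, 1.9888396263122559, 1.912337303161621,
    1.9669759273529053, 2.085549831390381, 1.9108097553253174,
    1.9291839599609375, 1.9678840637207031, 1.8995976448059082,
    2.147432565689087, 1.9183194637298584, 1.9695155620574951,
]

for q in (5, 10, 50):
    print(f"{q}th percentile: {percentile(latencies, q)}")
```

Under this assumption the helper reproduces the logged 5th, 10th and 50th percentile values to within floating-point rounding.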
Pipeline stage StressChecker completed in 65.38s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.86s
Shutdown handler de-registered
chaiml-pony-d2-q235b-pv_18495_v3 status is now deployed due to DeploymentManager action
chaiml-pony-d2-q235b-pv_18495_v3 status is now inactive due to auto deactivation removed underperforming models