Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-02f4-69d4-linear-76375-v6-uploader
Waiting for job on chaiml-02f4-69d4-linear-76375-v6-uploader to finish
chaiml-02f4-69d4-linear-76375-v6-uploader: Using quantization_mode: none
chaiml-02f4-69d4-linear-76375-v6-uploader: Downloading snapshot of ChaiML/02f4-69d4-linear-w01-W4A16-G128-AutoRound...
chaiml-02f4-69d4-linear-76375-v6-uploader:
Fetching 12 files: 100%|██████████| 12/12 [00:08<00:00, 1.40it/s]
chaiml-02f4-69d4-linear-76375-v6-uploader: Downloaded in 8.695s
chaiml-02f4-69d4-linear-76375-v6-uploader: Processed model ChaiML/02f4-69d4-linear-w01-W4A16-G128-AutoRound in 14.286s
chaiml-02f4-69d4-linear-76375-v6-uploader: creating bucket guanaco-vllm-models
chaiml-02f4-69d4-linear-76375-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v6-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-02f4-69d4-linear-76375-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-02f4-69d4-linear-76375-v6-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-02f4-69d4-linear-76375-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v6-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-76375-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v6-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-76375-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v6-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-76375-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v6-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-76375-v6-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-02f4-69d4-linear-76375-v6-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-02f4-69d4-linear-76375-v6-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-02f4-69d4-linear-76375-v6-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-02f4-69d4-linear-76375-v6-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-02f4-69d4-linear-76375-v6-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v6
chaiml-02f4-69d4-linear-76375-v6-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v6/model-00003-of-00003.safetensors
chaiml-02f4-69d4-linear-76375-v6-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v6/model-00001-of-00003.safetensors
Job chaiml-02f4-69d4-linear-76375-v6-uploader completed after 166.34s with status: succeeded
Stopping job with name chaiml-02f4-69d4-linear-76375-v6-uploader
Pipeline stage VLLMUploader completed in 167.10s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-02f4-69d4-linear-76375-v6
Waiting for inference service chaiml-02f4-69d4-linear-76375-v6 to be ready
Inference service chaiml-02f4-69d4-linear-76375-v6 ready after 223.34654068946838s
Pipeline stage VLLMDeployer completed in 224.69s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.2158761024475098s
Received healthy response to inference request in 1.248687744140625s
Received healthy response to inference request in 1.6595609188079834s
Received healthy response to inference request in 1.008162260055542s
Received healthy response to inference request in 0.9464771747589111s
Received healthy response to inference request in 1.1450564861297607s
Received healthy response to inference request in 0.9626116752624512s
Received healthy response to inference request in 1.611250877380371s
Received healthy response to inference request in 1.0348353385925293s
Received healthy response to inference request in 1.12453031539917s
Received healthy response to inference request in 1.249610424041748s
Received healthy response to inference request in 1.0233635902404785s
Received healthy response to inference request in 0.9661505222320557s
Received healthy response to inference request in 1.246039628982544s
Received healthy response to inference request in 1.1747722625732422s
Received healthy response to inference request in 1.2346744537353516s
Received healthy response to inference request in 1.4873971939086914s
Received healthy response to inference request in 1.2075355052947998s
Received healthy response to inference request in 1.0124778747558594s
Received healthy response to inference request in 1.127213478088379s
Received healthy response to inference request in 1.207911491394043s
Received healthy response to inference request in 2.030205249786377s
Received healthy response to inference request in 1.300689458847046s
Received healthy response to inference request in 1.3109548091888428s
Received healthy response to inference request in 1.2482171058654785s
Received healthy response to inference request in 1.457101821899414s
Received healthy response to inference request in 1.254692792892456s
Received healthy response to inference request in 1.0299830436706543s
Received healthy response to inference request in 1.2255799770355225s
Received healthy response to inference request in 1.5028290748596191s
30 requests
0 failed requests
5th percentile: 0.9642041563987732
10th percentile: 1.0039610862731934
20th percentile: 1.0286591529846192
30th percentile: 1.1264085292816162
40th percentile: 1.1944302082061768
50th percentile: 1.2207280397415161
60th percentile: 1.2469106197357178
70th percentile: 1.2511351346969604
80th percentile: 1.3401842117309575
90th percentile: 1.5136712551116944
95th percentile: 1.6378214001655578
99th percentile: 1.922718393802643
mean time: 1.241814955075582
Pipeline stage StressChecker completed in 41.06s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.67s
Shutdown handler de-registered
chaiml-02f4-69d4-linear_76375_v6 status is now deployed due to DeploymentManager action
chaiml-02f4-69d4-linear_76375_v6 status is now inactive due to auto deactivation removed underperforming models
chaiml-02f4-69d4-linear_76375_v6 status is now torndown due to DeploymentManager action