Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-2fe5-c13f-linear-w01-v43-uploader
Waiting for job on chaiml-2fe5-c13f-linear-w01-v43-uploader to finish
chaiml-2fe5-c13f-linear-w01-v43-uploader: Using quantization_mode: none
chaiml-2fe5-c13f-linear-w01-v43-uploader: Downloading snapshot of ChaiML/2fe5-c13f-linear-w01...
chaiml-2fe5-c13f-linear-w01-v43-uploader:
Fetching 14 files: 0%| | 0/14 [00:00<?, ?it/s]
Fetching 14 files: 7%|▋ | 1/14 [00:00<00:04, 3.21it/s]
Fetching 14 files: 43%|████▎ | 6/14 [00:13<00:18, 2.31s/it]
Fetching 14 files: 50%|█████ | 7/14 [00:13<00:13, 1.89s/it]
Fetching 14 files: 57%|█████▋ | 8/14 [00:13<00:09, 1.51s/it]
Fetching 14 files: 64%|██████▍ | 9/14 [00:14<00:06, 1.23s/it]
Fetching 14 files: 100%|██████████| 14/14 [00:14<00:00, 1.00s/it]
chaiml-2fe5-c13f-linear-w01-v43-uploader: Downloaded in 14.148s
chaiml-2fe5-c13f-linear-w01-v43-uploader: Processed model ChaiML/2fe5-c13f-linear-w01 in 25.382s
chaiml-2fe5-c13f-linear-w01-v43-uploader: creating bucket guanaco-vllm-models
chaiml-2fe5-c13f-linear-w01-v43-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v43-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-2fe5-c13f-linear-w01-v43-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-2fe5-c13f-linear-w01-v43-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-2fe5-c13f-linear-w01-v43-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v43-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-w01-v43-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v43-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-w01-v43-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v43-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-w01-v43-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v43-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-w01-v43-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-2fe5-c13f-linear-w01-v43-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-2fe5-c13f-linear-w01-v43-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-2fe5-c13f-linear-w01-v43-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
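Note on the SyntaxWarning lines above: they are emitted while Python compiles the S3 package under /usr/lib/python3/dist-packages and do not fail the upload (the job goes on to report `succeeded`). Each one points at a regular expression written as an ordinary string literal containing a backslash sequence such as `\.` that Python does not recognize as a string escape. The usual remedy is a raw string literal. The sketch below is illustrative only: the pattern is copied from the warning for S3/Utils.py:257, while `has_double_dot` and `warns_on_compile` are hypothetical helper names, not part of that package.

```python
import re
import warnings

# Pattern taken from the warning at S3/Utils.py:257 in the log above.
# Written as a raw string, the backslashes reach the regex engine unchanged
# and the compiler has nothing to warn about.
DOUBLE_DOT = re.compile(r"\.\.", re.UNICODE)


def has_double_dot(bucket: str) -> bool:
    """Hypothetical helper: True if the name contains two consecutive dots."""
    return DOUBLE_DOT.search(bucket) is not None


def warns_on_compile(source: str) -> list:
    """Compile a snippet of Python source and return any warnings raised."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        compile(source, "<demo>", "exec")
    return list(caught)


# Non-raw literal, as in the warned-about lines: the compiler flags '\.'.
non_raw = warns_on_compile(r"pattern = '\.\.'")
# Raw literal: same regex text, no warning.
raw = warns_on_compile(r"pattern = r'\.\.'")

print(len(non_raw) > 0, len(raw) == 0)
print(has_double_dot("guanaco..models"), has_double_dot("guanaco-vllm-models"))
```

The warning category depends on the interpreter version (DeprecationWarning on older releases, SyntaxWarning on newer ones), which is why the sketch only checks that some warning was recorded.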
chaiml-2fe5-c13f-linear-w01-v43-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-2fe5-c13f-linear-w01-v43-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/config.json
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/README.md
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/mergekit_config.yaml s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/mergekit_config.yaml
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/special_tokens_map.json
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/.gitattributes
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/mergekit_config.yml s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/mergekit_config.yml
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/model.safetensors.index.json
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/tokenizer_config.json
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/model-00001-of-00005.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/model-00001-of-00005.safetensors
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/model-00003-of-00005.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/model-00003-of-00005.safetensors
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/model-00004-of-00005.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/model-00004-of-00005.safetensors
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/model-00002-of-00005.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/model-00002-of-00005.safetensors
chaiml-2fe5-c13f-linear-w01-v43-uploader: cp /dev/shm/model_output/model-00005-of-00005.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v43/model-00005-of-00005.safetensors
Job chaiml-2fe5-c13f-linear-w01-v43-uploader completed after 470.86s with status: succeeded
Stopping job with name chaiml-2fe5-c13f-linear-w01-v43-uploader
Pipeline stage VLLMUploader completed in 471.70s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2fe5-c13f-linear-w01-v43
Waiting for inference service chaiml-2fe5-c13f-linear-w01-v43 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-2fe5-c13f-linear-w01-v43 ready after 241.34261417388916s
Pipeline stage VLLMDeployer completed in 241.85s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.4513349533081055s
Received healthy response to inference request in 2.6607322692871094s
Received healthy response to inference request in 2.805098056793213s
Received healthy response to inference request in 2.5220186710357666s
Received healthy response to inference request in 3.030937910079956s
Received healthy response to inference request in 2.837183713912964s
Received healthy response to inference request in 3.0030930042266846s
Received healthy response to inference request in 2.635551691055298s
Received healthy response to inference request in 2.249591827392578s
Received healthy response to inference request in 2.7273032665252686s
Received healthy response to inference request in 2.5106356143951416s
Received healthy response to inference request in 2.503021478652954s
Received healthy response to inference request in 2.622650384902954s
Received healthy response to inference request in 2.3845255374908447s
Received healthy response to inference request in 2.65041446685791s
Received healthy response to inference request in 2.758033275604248s
Received healthy response to inference request in 2.7140932083129883s
Received healthy response to inference request in 2.7663984298706055s
Received healthy response to inference request in 2.3924612998962402s
Received healthy response to inference request in 2.50882887840271s
Received healthy response to inference request in 2.678577184677124s
Received healthy response to inference request in 2.751471757888794s
Received healthy response to inference request in 2.47756290435791s
Received healthy response to inference request in 2.54744291305542s
Received healthy response to inference request in 2.4053542613983154s
Received healthy response to inference request in 2.6755850315093994s
Received healthy response to inference request in 2.4524381160736084s
Received healthy response to inference request in 2.646585702896118s
Received healthy response to inference request in 2.6474709510803223s
Received healthy response to inference request in 2.769224166870117s
30 requests
0 failed requests
5th percentile: 2.3880966305732727
10th percentile: 2.404064965248108
20th percentile: 2.47253794670105
30th percentile: 2.510093593597412
40th percentile: 2.5925673961639406
50th percentile: 2.64702832698822
60th percentile: 2.6666733741760256
70th percentile: 2.7180562257766723
80th percentile: 2.7597063064575194
90th percentile: 2.8083066225051883
95th percentile: 2.9284338235855096
99th percentile: 3.0228628873825074
mean time: 2.6261873642603555
Pipeline stage StressChecker completed in 82.05s
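Note on the StressChecker summary above: the percentile and mean figures can be recomputed from the 30 logged response times. The sketch below assumes percentiles are taken by linear interpolation between the two nearest sorted samples; that is an assumption about how StressChecker computes them, not something the log states, and `percentile` is a hypothetical helper written for this illustration.

```python
import math

# Response times in seconds, copied from the 30
# "Received healthy response to inference request" lines above.
latencies = [
    2.4513349533081055, 2.6607322692871094, 2.805098056793213,
    2.5220186710357666, 3.030937910079956, 2.837183713912964,
    3.0030930042266846, 2.635551691055298, 2.249591827392578,
    2.7273032665252686, 2.5106356143951416, 2.503021478652954,
    2.622650384902954, 2.3845255374908447, 2.65041446685791,
    2.758033275604248, 2.7140932083129883, 2.7663984298706055,
    2.3924612998962402, 2.50882887840271, 2.678577184677124,
    2.751471757888794, 2.47756290435791, 2.54744291305542,
    2.4053542613983154, 2.6755850315093994, 2.4524381160736084,
    2.646585702896118, 2.6474709510803223, 2.769224166870117,
]


def percentile(values, q):
    """q-th percentile (0-100) by linear interpolation between sorted samples."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


mean_time = sum(latencies) / len(latencies)

print(f"{len(latencies)} requests")
for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

Under that interpolation assumption, the recomputed values agree with the logged summary up to floating-point rounding in the last digits.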
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
chaiml-2fe5-c13f-linear-w01_v43 status is now deployed due to DeploymentManager action
chaiml-2fe5-c13f-linear-w01_v43 status is now inactive due to auto deactivation removed underperforming models
chaiml-2fe5-c13f-linear-w01_v43 status is now torndown due to DeploymentManager action