Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-4d70-fd43-linear-w01-v35-uploader
Waiting for job on chaiml-4d70-fd43-linear-w01-v35-uploader to finish
chaiml-4d70-fd43-linear-w01-v35-uploader: Using quantization_mode: none
chaiml-4d70-fd43-linear-w01-v35-uploader: Downloading snapshot of ChaiML/4d70-fd43-linear-w01...
chaiml-4d70-fd43-linear-w01-v35-uploader:
Fetching 14 files: 0%| | 0/14 [00:00<?, ?it/s]
Fetching 14 files: 7%|▋ | 1/14 [00:00<00:03, 3.60it/s]
Fetching 14 files: 43%|████▎ | 6/14 [00:15<00:21, 2.69s/it]
Fetching 14 files: 100%|██████████| 14/14 [00:15<00:00, 1.10s/it]
chaiml-4d70-fd43-linear-w01-v35-uploader: Downloaded in 15.525s
chaiml-4d70-fd43-linear-w01-v35-uploader: Processed model ChaiML/4d70-fd43-linear-w01 in 24.771s
chaiml-4d70-fd43-linear-w01-v35-uploader: creating bucket guanaco-vllm-models
chaiml-4d70-fd43-linear-w01-v35-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-w01-v35-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-4d70-fd43-linear-w01-v35-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-4d70-fd43-linear-w01-v35-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-4d70-fd43-linear-w01-v35-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-w01-v35-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-w01-v35-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-w01-v35-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-w01-v35-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-w01-v35-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-w01-v35-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-w01-v35-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-w01-v35-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-4d70-fd43-linear-w01-v35-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-4d70-fd43-linear-w01-v35-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-4d70-fd43-linear-w01-v35-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-4d70-fd43-linear-w01-v35-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-4d70-fd43-linear-w01-v35-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/config.json
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/.gitattributes
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/special_tokens_map.json
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/mergekit_config.yaml s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/mergekit_config.yaml
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/README.md
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/mergekit_config.yml s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/mergekit_config.yml
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/tokenizer_config.json
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/model.safetensors.index.json
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/tokenizer.json
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/model-00003-of-00005.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/model-00003-of-00005.safetensors
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/model-00001-of-00005.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/model-00001-of-00005.safetensors
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/model-00005-of-00005.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/model-00005-of-00005.safetensors
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/model-00002-of-00005.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/model-00002-of-00005.safetensors
chaiml-4d70-fd43-linear-w01-v35-uploader: cp /dev/shm/model_output/model-00004-of-00005.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-w01-v35/model-00004-of-00005.safetensors
Job chaiml-4d70-fd43-linear-w01-v35-uploader completed after 390.51s with status: succeeded
Stopping job with name chaiml-4d70-fd43-linear-w01-v35-uploader
Pipeline stage VLLMUploader completed in 391.04s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-4d70-fd43-linear-w01-v35
Waiting for inference service chaiml-4d70-fd43-linear-w01-v35 to be ready
Inference service chaiml-4d70-fd43-linear-w01-v35 ready after 171.08913040161133s
Pipeline stage VLLMDeployer completed in 171.67s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.6523756980895996s
Received healthy response to inference request in 1.7982535362243652s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 1.6500601768493652s
Received healthy response to inference request in 2.4993133544921875s
Received healthy response to inference request in 1.8302149772644043s
Received healthy response to inference request in 1.660543441772461s
Received healthy response to inference request in 2.3544349670410156s
Received healthy response to inference request in 1.6354069709777832s
Received healthy response to inference request in 1.6388261318206787s
Received healthy response to inference request in 1.5823094844818115s
Received healthy response to inference request in 1.547067642211914s
Received healthy response to inference request in 1.585737705230713s
Received healthy response to inference request in 1.4603352546691895s
Received healthy response to inference request in 1.5544486045837402s
Received healthy response to inference request in 1.9322185516357422s
Received healthy response to inference request in 1.841141939163208s
Received healthy response to inference request in 1.7245934009552002s
Received healthy response to inference request in 1.8099942207336426s
Received healthy response to inference request in 1.9275426864624023s
Received healthy response to inference request in 1.676483392715454s
Received healthy response to inference request in 2.2893950939178467s
Received healthy response to inference request in 1.9305109977722168s
Received healthy response to inference request in 1.7307474613189697s
Received healthy response to inference request in 1.6324830055236816s
Received healthy response to inference request in 1.8307268619537354s
Received healthy response to inference request in 1.4636609554290771s
Received healthy response to inference request in 2.140592098236084s
Received healthy response to inference request in 1.5794589519500732s
Received healthy response to inference request in 1.617673397064209s
Received healthy response to inference request in 1.8951079845428467s
30 requests
0 failed requests
5th percentile: 1.5011939644813537
10th percentile: 1.5537105083465577
20th percentile: 1.5850520610809327
30th percentile: 1.6345297813415527
40th percentile: 1.651449489593506
50th percentile: 1.7005383968353271
60th percentile: 1.802949810028076
70th percentile: 1.8338513851165772
80th percentile: 1.9281363487243652
90th percentile: 2.1554723978042603
95th percentile: 2.3251670241355895
99th percentile: 2.457298622131348
mean time: 1.7823886315027873
Pipeline stage StressChecker completed in 56.89s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.63s
Shutdown handler de-registered
chaiml-4d70-fd43-linear-w01_v35 status is now deployed due to DeploymentManager action
chaiml-4d70-fd43-linear-w01_v35 status is now inactive due to auto deactivation removed underperforming models
chaiml-4d70-fd43-linear-w01_v35 status is now torndown due to DeploymentManager action
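
Note on the StressChecker summary above: the reported percentiles are consistent with linear interpolation between the closest ranks of the 30 sorted latencies. The log does not say which library or method StressChecker uses, so the sketch below is an assumption that happens to reproduce the logged figures; the helper name `percentile` is hypothetical. The latencies are copied from the "Received healthy response" lines in log order.

```python
import math

# Latencies in seconds, copied from the "Received healthy response" lines.
latencies = [
    1.6523756980895996, 1.7982535362243652, 1.6500601768493652,
    2.4993133544921875, 1.8302149772644043, 1.660543441772461,
    2.3544349670410156, 1.6354069709777832, 1.6388261318206787,
    1.5823094844818115, 1.547067642211914, 1.585737705230713,
    1.4603352546691895, 1.5544486045837402, 1.9322185516357422,
    1.841141939163208, 1.7245934009552002, 1.8099942207336426,
    1.9275426864624023, 1.676483392715454, 2.2893950939178467,
    1.9305109977722168, 1.7307474613189697, 1.6324830055236816,
    1.8307268619537354, 1.4636609554290771, 2.140592098236084,
    1.5794589519500732, 1.617673397064209, 1.8951079845428467,
]


def percentile(values, q):
    """Hypothetical helper: q-th percentile (0-100) by linear interpolation.

    The fractional rank is (n - 1) * q / 100; the result is interpolated
    between the two nearest sorted values. This is an assumed method, not
    one confirmed by the log.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

if __name__ == "__main__":
    print(f"{len(latencies)} requests")
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean_time}")
```

For example, the 50th percentile falls at rank 14.5, halfway between the 15th and 16th sorted values (1.6764833927 and 1.7245934010), which gives the logged 1.7005383968.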