Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-02f4-69d4-linear-w01-v59-uploader
Waiting for job on chaiml-02f4-69d4-linear-w01-v59-uploader to finish
chaiml-02f4-69d4-linear-w01-v59-uploader: Using quantization_mode: none
chaiml-02f4-69d4-linear-w01-v59-uploader: Downloading snapshot of ChaiML/02f4-69d4-linear-w01...
chaiml-02f4-69d4-linear-w01-v59-uploader:
Fetching 19 files: 0%| | 0/19 [00:00<?, ?it/s]
Fetching 19 files: 5%|▌ | 1/19 [00:00<00:06, 2.82it/s]
Fetching 19 files: 32%|███▏ | 6/19 [00:16<00:37, 2.92s/it]
Fetching 19 files: 37%|███▋ | 7/19 [00:16<00:28, 2.37s/it]
Fetching 19 files: 47%|████▋ | 9/19 [00:20<00:20, 2.07s/it]
Fetching 19 files: 53%|█████▎ | 10/19 [00:20<00:15, 1.77s/it]
Fetching 19 files: 74%|███████▎ | 14/19 [00:23<00:06, 1.24s/it]
Fetching 19 files: 79%|███████▉ | 15/19 [00:24<00:04, 1.18s/it]
Fetching 19 files: 100%|██████████| 19/19 [00:24<00:00, 1.30s/it]
chaiml-02f4-69d4-linear-w01-v59-uploader: Downloaded in 24.797s
chaiml-02f4-69d4-linear-w01-v59-uploader: Processed model ChaiML/02f4-69d4-linear-w01 in 42.520s
chaiml-02f4-69d4-linear-w01-v59-uploader: creating bucket guanaco-vllm-models
chaiml-02f4-69d4-linear-w01-v59-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-w01-v59-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-02f4-69d4-linear-w01-v59-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-02f4-69d4-linear-w01-v59-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-02f4-69d4-linear-w01-v59-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-w01-v59-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-w01-v59-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-w01-v59-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-w01-v59-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-w01-v59-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-w01-v59-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-w01-v59-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-w01-v59-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-02f4-69d4-linear-w01-v59-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-02f4-69d4-linear-w01-v59-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-02f4-69d4-linear-w01-v59-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-02f4-69d4-linear-w01-v59-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-02f4-69d4-linear-w01-v59-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/README.md
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/config.json
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/.gitattributes
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/mergekit_config.yml s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/mergekit_config.yml
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/mergekit_config.yaml s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/mergekit_config.yaml
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/special_tokens_map.json
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model.safetensors.index.json
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/tokenizer_config.json
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/tokenizer.json
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00010-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00010-of-00010.safetensors
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00005-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00005-of-00010.safetensors
HTTP Request: %s %s "%s %d %s"
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00007-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00007-of-00010.safetensors
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00003-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00003-of-00010.safetensors
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00008-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00008-of-00010.safetensors
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00002-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00002-of-00010.safetensors
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00004-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00004-of-00010.safetensors
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00001-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00001-of-00010.safetensors
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00006-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00006-of-00010.safetensors
chaiml-02f4-69d4-linear-w01-v59-uploader: cp /dev/shm/model_output/model-00009-of-00010.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v59/model-00009-of-00010.safetensors
Job chaiml-02f4-69d4-linear-w01-v59-uploader completed after 461.78s with status: succeeded
Stopping job with name chaiml-02f4-69d4-linear-w01-v59-uploader
Pipeline stage VLLMUploader completed in 462.35s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-02f4-69d4-linear-w01-v59
Waiting for inference service chaiml-02f4-69d4-linear-w01-v59 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-02f4-69d4-linear-w01-v59 ready after 231.35801148414612s
Pipeline stage VLLMDeployer completed in 232.34s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.724193811416626s
Received healthy response to inference request in 2.2781357765197754s
Received healthy response to inference request in 2.653033971786499s
Received healthy response to inference request in 2.361124277114868s
Received healthy response to inference request in 2.358997106552124s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.4652695655822754s
Received healthy response to inference request in 2.393549680709839s
Received healthy response to inference request in 2.568701982498169s
Received healthy response to inference request in 2.6071367263793945s
Received healthy response to inference request in 2.4450292587280273s
Received healthy response to inference request in 2.48641300201416s
Received healthy response to inference request in 2.3122828006744385s
Received healthy response to inference request in 2.3030741214752197s
Received healthy response to inference request in 2.3625640869140625s
Received healthy response to inference request in 2.225961208343506s
Received healthy response to inference request in 2.3558597564697266s
Received healthy response to inference request in 2.3660428524017334s
Received healthy response to inference request in 2.4307022094726562s
Received healthy response to inference request in 2.521045684814453s
Received healthy response to inference request in 2.317425489425659s
Received healthy response to inference request in 2.3123080730438232s
Received healthy response to inference request in 2.3818840980529785s
Received healthy response to inference request in 2.5595757961273193s
Received healthy response to inference request in 2.3740904331207275s
Received healthy response to inference request in 2.617979049682617s
Received healthy response to inference request in 2.386936664581299s
Received healthy response to inference request in 2.2607944011688232s
Received healthy response to inference request in 2.197282314300537s
Received healthy response to inference request in 2.2605552673339844s
Received healthy response to inference request in 2.478706121444702s
30 requests
0 failed requests
5th percentile: 2.241528534889221
10th percentile: 2.260770487785339
20th percentile: 2.3104410648345945
30th percentile: 2.3443294763565063
40th percentile: 2.361988162994385
50th percentile: 2.377987265586853
60th percentile: 2.4084106922149657
70th percentile: 2.4693005323410033
80th percentile: 2.5287517070770265
90th percentile: 2.6082209587097167
95th percentile: 2.637259256839752
99th percentile: 2.703557457923889
mean time: 2.4122218529383344
Pipeline stage StressChecker completed in 75.12s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.78s
Shutdown handler de-registered
chaiml-02f4-69d4-linear-w01_v59 status is now deployed due to DeploymentManager action
chaiml-02f4-69d4-linear-w01_v59 status is now inactive due to auto deactivation removed underperforming models
chaiml-02f4-69d4-linear-w01_v59 status is now torndown due to DeploymentManager action