Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-02f4-69d4-linear-30131-v1-uploader
Waiting for job on chaiml-02f4-69d4-linear-30131-v1-uploader to finish
chaiml-02f4-69d4-linear-30131-v1-uploader: Using quantization_mode: none
chaiml-02f4-69d4-linear-30131-v1-uploader: Downloading snapshot of ChaiML/02f4-69d4-linear-w01-FP8...
chaiml-02f4-69d4-linear-30131-v1-uploader:
Fetching 14 files: 0%| | 0/14 [00:00<?, ?it/s]
Fetching 14 files: 7%|▋ | 1/14 [00:00<00:03, 3.33it/s]
Fetching 14 files: 29%|██▊ | 4/14 [00:09<00:26, 2.64s/it]
Fetching 14 files: 43%|████▎ | 6/14 [00:10<00:12, 1.52s/it]
Fetching 14 files: 100%|██████████| 14/14 [00:10<00:00, 1.40it/s]
chaiml-02f4-69d4-linear-30131-v1-uploader: Downloaded in 10.184s
chaiml-02f4-69d4-linear-30131-v1-uploader: Processed model ChaiML/02f4-69d4-linear-w01-FP8 in 19.201s
chaiml-02f4-69d4-linear-30131-v1-uploader: creating bucket guanaco-vllm-models
chaiml-02f4-69d4-linear-30131-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-02f4-69d4-linear-30131-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-02f4-69d4-linear-30131-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-02f4-69d4-linear-30131-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-30131-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-30131-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-30131-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-30131-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-02f4-69d4-linear-30131-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-02f4-69d4-linear-30131-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-02f4-69d4-linear-30131-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-02f4-69d4-linear-30131-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-02f4-69d4-linear-30131-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/.gitattributes
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/generation_config.json
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/config.json
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/recipe.yaml
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/model.safetensors.index.json
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/special_tokens_map.json
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/tokenizer_config.json
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/tokenizer.json
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/model-00006-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/model-00005-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/model-00004-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/model-00001-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/model-00003-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v1-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v1/model-00002-of-00006.safetensors
Job chaiml-02f4-69d4-linear-30131-v1-uploader completed after 185.3s with status: succeeded
Stopping job with name chaiml-02f4-69d4-linear-30131-v1-uploader
Pipeline stage VLLMUploader completed in 185.75s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-02f4-69d4-linear-30131-v1
Waiting for inference service chaiml-02f4-69d4-linear-30131-v1 to be ready
Inference service chaiml-02f4-69d4-linear-30131-v1 ready after 160.6968731880188s
Pipeline stage VLLMDeployer completed in 161.26s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.6450586318969727s
Received healthy response to inference request in 1.630673885345459s
Received healthy response to inference request in 1.4105002880096436s
Received healthy response to inference request in 1.4793212413787842s
Received healthy response to inference request in 1.8205738067626953s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 1.578401803970337s
Received healthy response to inference request in 1.3559472560882568s
Received healthy response to inference request in 1.8939173221588135s
Received healthy response to inference request in 1.5601170063018799s
Received healthy response to inference request in 1.387953281402588s
Received healthy response to inference request in 1.4302892684936523s
Received healthy response to inference request in 1.3657896518707275s
Received healthy response to inference request in 1.5863001346588135s
Received healthy response to inference request in 1.6118991374969482s
Received healthy response to inference request in 1.5230696201324463s
Received healthy response to inference request in 1.5222032070159912s
Received healthy response to inference request in 1.6522126197814941s
Received healthy response to inference request in 1.4202296733856201s
Received healthy response to inference request in 1.3554785251617432s
Received healthy response to inference request in 1.4439022541046143s
Received healthy response to inference request in 1.6222479343414307s
Received healthy response to inference request in 1.6782252788543701s
Received healthy response to inference request in 1.7974939346313477s
Received healthy response to inference request in 1.4114484786987305s
Received healthy response to inference request in 1.4066739082336426s
Received healthy response to inference request in 1.3954510688781738s
Received healthy response to inference request in 1.5672602653503418s
Received healthy response to inference request in 1.476170539855957s
Received healthy response to inference request in 1.7923696041107178s
Received healthy response to inference request in 1.3772974014282227s
30 requests
0 failed requests
5th percentile: 1.3603763341903687
10th percentile: 1.376146626472473
20th percentile: 1.4044293403625487
30th percentile: 1.4175953149795533
40th percentile: 1.4632632255554199
50th percentile: 1.5226364135742188
60th percentile: 1.57171688079834
70th percentile: 1.615003776550293
80th percentile: 1.646489429473877
90th percentile: 1.7928820371627807
95th percentile: 1.810187864303589
99th percentile: 1.8726477026939392
mean time: 1.5399492343266805
Pipeline stage StressChecker completed in 49.75s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
chaiml-02f4-69d4-linear_30131_v1 status is now deployed due to DeploymentManager action
chaiml-02f4-69d4-linear_30131_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-02f4-69d4-linear_30131_v1 status is now torndown due to DeploymentManager action