Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-4d70-fd43-linear-51732-v5-uploader
Waiting for job on chaiml-4d70-fd43-linear-51732-v5-uploader to finish
chaiml-4d70-fd43-linear-51732-v5-uploader: Using quantization_mode: none
chaiml-4d70-fd43-linear-51732-v5-uploader: Downloading snapshot of ChaiML/4d70-fd43-linear-w01-FP8...
chaiml-4d70-fd43-linear-51732-v5-uploader:
Fetching 12 files: 100%|██████████| 12/12 [00:09<00:00, 1.28it/s]
chaiml-4d70-fd43-linear-51732-v5-uploader: Downloaded in 9.548s
chaiml-4d70-fd43-linear-51732-v5-uploader: Processed model ChaiML/4d70-fd43-linear-w01-FP8 in 14.671s
chaiml-4d70-fd43-linear-51732-v5-uploader: creating bucket guanaco-vllm-models
chaiml-4d70-fd43-linear-51732-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-4d70-fd43-linear-51732-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-4d70-fd43-linear-51732-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-4d70-fd43-linear-51732-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-51732-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-51732-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-51732-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-51732-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-4d70-fd43-linear-51732-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-4d70-fd43-linear-51732-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-4d70-fd43-linear-51732-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-4d70-fd43-linear-51732-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-4d70-fd43-linear-51732-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/.gitattributes
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/chat_template.jinja
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/generation_config.json
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/recipe.yaml
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/config.json
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/special_tokens_map.json
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/model.safetensors.index.json
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/tokenizer_config.json
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/tokenizer.json
HTTP Request: %s %s "%s %d %s"
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/model-00003-of-00003.safetensors
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/model-00001-of-00003.safetensors
chaiml-4d70-fd43-linear-51732-v5-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v5/model-00002-of-00003.safetensors
Job chaiml-4d70-fd43-linear-51732-v5-uploader completed after 246.4s with status: succeeded
Stopping job with name chaiml-4d70-fd43-linear-51732-v5-uploader
Pipeline stage VLLMUploader completed in 247.26s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-4d70-fd43-linear-51732-v5
Waiting for inference service chaiml-4d70-fd43-linear-51732-v5 to be ready
Inference service chaiml-4d70-fd43-linear-51732-v5 ready after 342.79850673675537s
Pipeline stage VLLMDeployer completed in 343.37s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.262068748474121s
Received healthy response to inference request in 1.921759843826294s
Received healthy response to inference request in 1.8139748573303223s
Received healthy response to inference request in 2.2993693351745605s
Received healthy response to inference request in 1.8656291961669922s
Received healthy response to inference request in 2.096179485321045s
Received healthy response to inference request in 2.6981022357940674s
Received healthy response to inference request in 2.070909023284912s
Received healthy response to inference request in 2.0163750648498535s
Received healthy response to inference request in 1.930345058441162s
Received healthy response to inference request in 2.3440752029418945s
Received healthy response to inference request in 2.066230297088623s
Received healthy response to inference request in 1.9074761867523193s
Received healthy response to inference request in 2.1675913333892822s
Received healthy response to inference request in 2.0092220306396484s
Received healthy response to inference request in 2.15114426612854s
Received healthy response to inference request in 2.245950937271118s
Received healthy response to inference request in 1.997901439666748s
Received healthy response to inference request in 1.955291986465454s
Received healthy response to inference request in 2.4449551105499268s
Received healthy response to inference request in 1.8711223602294922s
Received healthy response to inference request in 2.3657591342926025s
Received healthy response to inference request in 1.8234012126922607s
Received healthy response to inference request in 2.397702217102051s
Received healthy response to inference request in 1.9607837200164795s
Received healthy response to inference request in 2.211822032928467s
Received healthy response to inference request in 1.9810717105865479s
Received healthy response to inference request in 1.9695615768432617s
Received healthy response to inference request in 2.1259994506835938s
Received healthy response to inference request in 1.8053443431854248s
30 requests
0 failed requests
5th percentile: 1.8182167172431947
10th percentile: 1.861406397819519
20th percentile: 1.918903112411499
30th percentile: 1.9591361999511718
40th percentile: 1.991169548034668
50th percentile: 2.0413026809692383
60th percentile: 2.1081074714660644
70th percentile: 2.1808605432510375
80th percentile: 2.269528865814209
90th percentile: 2.3689534425735475
95th percentile: 2.423691308498382
99th percentile: 2.624689569473267
mean time: 2.0925706466039022
Pipeline stage StressChecker completed in 66.82s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
chaiml-4d70-fd43-linear_51732_v5 status is now deployed due to DeploymentManager action
chaiml-4d70-fd43-linear_51732_v5 status is now inactive due to system request
chaiml-4d70-fd43-linear_51732_v5 status is now inactive due to auto deactivation removed underperforming models