Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-2fe5-c13f-linear-57126-v4-uploader
Waiting for job on chaiml-2fe5-c13f-linear-57126-v4-uploader to finish
HTTP Request: %s %s "%s %d %s"
chaiml-2fe5-c13f-linear-57126-v4-uploader: Using quantization_mode: none
chaiml-2fe5-c13f-linear-57126-v4-uploader: Downloading snapshot of ChaiML/2fe5-c13f-linear-w01-FP8...
chaiml-2fe5-c13f-linear-57126-v4-uploader:
Fetching 12 files: 100%|██████████| 12/12 [00:08<00:00, 1.37it/s]
chaiml-2fe5-c13f-linear-57126-v4-uploader: Downloaded in 8.918s
chaiml-2fe5-c13f-linear-57126-v4-uploader: Processed model ChaiML/2fe5-c13f-linear-w01-FP8 in 14.022s
chaiml-2fe5-c13f-linear-57126-v4-uploader: creating bucket guanaco-vllm-models
chaiml-2fe5-c13f-linear-57126-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v4-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-2fe5-c13f-linear-57126-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-2fe5-c13f-linear-57126-v4-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-2fe5-c13f-linear-57126-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v4-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v4-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v4-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-57126-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v4-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-57126-v4-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-2fe5-c13f-linear-57126-v4-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v4-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-2fe5-c13f-linear-57126-v4-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-2fe5-c13f-linear-57126-v4-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-2fe5-c13f-linear-57126-v4-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/.gitattributes
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/config.json
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/chat_template.jinja
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/generation_config.json
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/recipe.yaml
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/special_tokens_map.json
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/model.safetensors.index.json
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/tokenizer_config.json
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/tokenizer.json
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/model-00001-of-00003.safetensors
chaiml-2fe5-c13f-linear-57126-v4-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v4/model-00002-of-00003.safetensors
Job chaiml-2fe5-c13f-linear-57126-v4-uploader completed after 236.98s with status: succeeded
Stopping job with name chaiml-2fe5-c13f-linear-57126-v4-uploader
Pipeline stage VLLMUploader completed in 238.05s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.18s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2fe5-c13f-linear-57126-v4
Waiting for inference service chaiml-2fe5-c13f-linear-57126-v4 to be ready
Inference service chaiml-2fe5-c13f-linear-57126-v4 ready after 322.2115161418915s
Pipeline stage VLLMDeployer completed in 322.77s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.087371349334717s
Received healthy response to inference request in 1.8539512157440186s
Received healthy response to inference request in 1.7013652324676514s
Received healthy response to inference request in 1.6185319423675537s
Received healthy response to inference request in 2.081845760345459s
Received healthy response to inference request in 2.073981523513794s
Received healthy response to inference request in 1.6716835498809814s
Received healthy response to inference request in 1.9759888648986816s
Received healthy response to inference request in 1.9800631999969482s
Received healthy response to inference request in 1.6176843643188477s
Received healthy response to inference request in 1.7112421989440918s
Received healthy response to inference request in 1.7919666767120361s
Received healthy response to inference request in 1.867624044418335s
Received healthy response to inference request in 2.1996283531188965s
Received healthy response to inference request in 1.8580262660980225s
Received healthy response to inference request in 1.8534865379333496s
Received healthy response to inference request in 2.2567942142486572s
Received healthy response to inference request in 1.8307459354400635s
Received healthy response to inference request in 1.839869737625122s
Received healthy response to inference request in 1.7926523685455322s
Received healthy response to inference request in 1.7188427448272705s
Received healthy response to inference request in 1.8886141777038574s
Received healthy response to inference request in 1.7383840084075928s
Received healthy response to inference request in 2.19899320602417s
Received healthy response to inference request in 2.1965715885162354s
Received healthy response to inference request in 1.6792051792144775s
Received healthy response to inference request in 2.0358729362487793s
Received healthy response to inference request in 1.6354374885559082s
Received healthy response to inference request in 1.6671898365020752s
Received healthy response to inference request in 1.8605124950408936s
30 requests
0 failed requests
5th percentile: 1.6261394381523133
10th percentile: 1.6640146017074584
20th percentile: 1.6969332218170166
30th percentile: 1.7325216293334962
40th percentile: 1.815508508682251
50th percentile: 1.853718876838684
60th percentile: 1.8633571147918702
70th percentile: 1.9772111654281617
80th percentile: 2.075554370880127
90th percentile: 2.1968137502670286
95th percentile: 2.1993425369262694
99th percentile: 2.2402161145210266
mean time: 1.8761375665664672
Pipeline stage StressChecker completed in 61.17s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.08s
Shutdown handler de-registered
chaiml-2fe5-c13f-linear_57126_v4 status is now deployed due to DeploymentManager action
chaiml-2fe5-c13f-linear_57126_v4 status is now inactive due to system request
chaiml-2fe5-c13f-linear_57126_v4 status is now inactive due to auto deactivation removed underperforming models