Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-02f4-69d4-linear-76375-v8-uploader
Waiting for job on chaiml-02f4-69d4-linear-76375-v8-uploader to finish
chaiml-02f4-69d4-linear-76375-v8-uploader: Using quantization_mode: none
chaiml-02f4-69d4-linear-76375-v8-uploader: Downloading snapshot of ChaiML/02f4-69d4-linear-w01-W4A16-G128-AutoRound...
chaiml-02f4-69d4-linear-76375-v8-uploader: Fetching 12 files: 100%|██████████| 12/12 [00:07<00:00, 1.62it/s]
chaiml-02f4-69d4-linear-76375-v8-uploader: Downloaded in 7.534s
chaiml-02f4-69d4-linear-76375-v8-uploader: Processed model ChaiML/02f4-69d4-linear-w01-W4A16-G128-AutoRound in 12.741s
chaiml-02f4-69d4-linear-76375-v8-uploader: creating bucket guanaco-vllm-models
chaiml-02f4-69d4-linear-76375-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v8-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-02f4-69d4-linear-76375-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-02f4-69d4-linear-76375-v8-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-02f4-69d4-linear-76375-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v8-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-76375-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v8-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-76375-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v8-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-76375-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-76375-v8-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-76375-v8-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-02f4-69d4-linear-76375-v8-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-02f4-69d4-linear-76375-v8-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-02f4-69d4-linear-76375-v8-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-02f4-69d4-linear-76375-v8-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-02f4-69d4-linear-76375-v8-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/.gitattributes
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/README.md
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/recipe.yaml
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/generation_config.json
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/config.json
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/special_tokens_map.json
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/model.safetensors.index.json
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/tokenizer_config.json
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/tokenizer.json
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/model-00003-of-00003.safetensors
chaiml-02f4-69d4-linear-76375-v8-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-76375-v8/model-00002-of-00003.safetensors
Job chaiml-02f4-69d4-linear-76375-v8-uploader completed after 121.65s with status: succeeded
Stopping job with name chaiml-02f4-69d4-linear-76375-v8-uploader
Pipeline stage VLLMUploader completed in 122.33s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-02f4-69d4-linear-76375-v8
Waiting for inference service chaiml-02f4-69d4-linear-76375-v8 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-02f4-69d4-linear-76375-v8 ready after 160.8827075958252s
Pipeline stage VLLMDeployer completed in 161.74s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.103761911392212s
Received healthy response to inference request in 1.2143001556396484s
Received healthy response to inference request in 1.4753379821777344s
Received healthy response to inference request in 0.9521987438201904s
Received healthy response to inference request in 1.9204020500183105s
Received healthy response to inference request in 1.1528615951538086s
Received healthy response to inference request in 1.3599917888641357s
Received healthy response to inference request in 1.1594793796539307s
Received healthy response to inference request in 0.9882094860076904s
Received healthy response to inference request in 0.9454214572906494s
Received healthy response to inference request in 1.317251205444336s
Received healthy response to inference request in 1.2420353889465332s
Received healthy response to inference request in 0.9635715484619141s
Received healthy response to inference request in 1.0367398262023926s
Received healthy response to inference request in 0.9772744178771973s
Received healthy response to inference request in 1.2245519161224365s
Received healthy response to inference request in 1.353315830230713s
Received healthy response to inference request in 0.9659638404846191s
Received healthy response to inference request in 1.072394609451294s
Received healthy response to inference request in 1.4235341548919678s
Received healthy response to inference request in 1.537687063217163s
Received healthy response to inference request in 1.7540433406829834s
Received healthy response to inference request in 1.006382703781128s
Received healthy response to inference request in 1.04013991355896s
Received healthy response to inference request in 1.3382389545440674s
Received healthy response to inference request in 0.9949185848236084s
Received healthy response to inference request in 1.0139191150665283s
Received healthy response to inference request in 1.1486711502075195s
Received healthy response to inference request in 1.3170528411865234s
Received healthy response to inference request in 1.0131142139434814s
30 requests
0 failed requests
5th percentile: 0.9573165059089661
10th percentile: 0.9657246112823487
20th percentile: 0.9935767650604248
30th percentile: 1.0136776447296143
40th percentile: 1.0594927310943605
50th percentile: 1.150766372680664
60th percentile: 1.2184008598327636
70th percentile: 1.317112350463867
80th percentile: 1.3546510219573975
90th percentile: 1.4815728902816774
95th percentile: 1.6566830158233636
99th percentile: 1.8721580243110658
mean time: 1.2004255056381226
Pipeline stage StressChecker completed in 39.02s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.65s
Shutdown handler de-registered
chaiml-02f4-69d4-linear_76375_v8 status is now deployed due to DeploymentManager action
chaiml-02f4-69d4-linear_76375_v8 status is now inactive due to auto deactivation removed underperforming models
chaiml-02f4-69d4-linear_76375_v8 status is now torndown due to DeploymentManager action