Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-2a6f-69d4-linear-82448-v4-uploader
Waiting for job on chaiml-2a6f-69d4-linear-82448-v4-uploader to finish
chaiml-2a6f-69d4-linear-82448-v4-uploader: Using quantization_mode: none
chaiml-2a6f-69d4-linear-82448-v4-uploader: Downloading snapshot of ChaiML/2a6f-69d4-linear-w01-W4A16-G128-AutoRound...
chaiml-2a6f-69d4-linear-82448-v4-uploader: Fetching 13 files: 100%|██████████| 13/13 [00:10<00:00, 1.29it/s]
chaiml-2a6f-69d4-linear-82448-v4-uploader: Downloaded in 10.191s
chaiml-2a6f-69d4-linear-82448-v4-uploader: Processed model ChaiML/2a6f-69d4-linear-w01-W4A16-G128-AutoRound in 15.323s
chaiml-2a6f-69d4-linear-82448-v4-uploader: creating bucket guanaco-vllm-models
chaiml-2a6f-69d4-linear-82448-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-82448-v4-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-2a6f-69d4-linear-82448-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-2a6f-69d4-linear-82448-v4-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-2a6f-69d4-linear-82448-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-82448-v4-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-2a6f-69d4-linear-82448-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-82448-v4-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-2a6f-69d4-linear-82448-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-82448-v4-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-2a6f-69d4-linear-82448-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-82448-v4-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-2a6f-69d4-linear-82448-v4-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-2a6f-69d4-linear-82448-v4-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-2a6f-69d4-linear-82448-v4-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-2a6f-69d4-linear-82448-v4-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-2a6f-69d4-linear-82448-v4-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-2a6f-69d4-linear-82448-v4-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/.gitattributes
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/README.md
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/recipe.yaml
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/chat_template.jinja
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/config.json
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/generation_config.json
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/special_tokens_map.json
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/model.safetensors.index.json
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/tokenizer_config.json
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/tokenizer.json
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/model-00003-of-00003.safetensors
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/model-00002-of-00003.safetensors
chaiml-2a6f-69d4-linear-82448-v4-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-82448-v4/model-00001-of-00003.safetensors
Job chaiml-2a6f-69d4-linear-82448-v4-uploader completed after 114.69s with status: succeeded
Stopping job with name chaiml-2a6f-69d4-linear-82448-v4-uploader
Pipeline stage VLLMUploader completed in 118.45s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.99s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2a6f-69d4-linear-82448-v4
Waiting for inference service chaiml-2a6f-69d4-linear-82448-v4 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-2a6f-69d4-linear-82448-v4 ready after 170.78555250167847s
Pipeline stage VLLMDeployer completed in 174.84s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.4183413982391357s
Received healthy response to inference request in 1.2427663803100586s
Received healthy response to inference request in 1.2333691120147705s
Received healthy response to inference request in 1.8170530796051025s
Received healthy response to inference request in 1.3368663787841797s
Received healthy response to inference request in 1.1918668746948242s
Received healthy response to inference request in 1.4786200523376465s
Received healthy response to inference request in 1.6635198593139648s
Received healthy response to inference request in 1.0374314785003662s
Received healthy response to inference request in 1.1172888278961182s
Received healthy response to inference request in 1.4135103225708008s
Received healthy response to inference request in 1.0667076110839844s
Received healthy response to inference request in 1.0238101482391357s
Received healthy response to inference request in 0.9348094463348389s
Received healthy response to inference request in 1.1456429958343506s
Received healthy response to inference request in 1.2903499603271484s
Received healthy response to inference request in 1.66733980178833s
Received healthy response to inference request in 1.1439671516418457s
Received healthy response to inference request in 1.3212130069732666s
Received healthy response to inference request in 1.409825086593628s
Received healthy response to inference request in 1.634291648864746s
Received healthy response to inference request in 0.9836392402648926s
Received healthy response to inference request in 0.9879086017608643s
Received healthy response to inference request in 1.266228199005127s
Received healthy response to inference request in 1.3538849353790283s
Received healthy response to inference request in 1.0868806838989258s
Received healthy response to inference request in 1.6631219387054443s
Received healthy response to inference request in 0.9463725090026855s
Received healthy response to inference request in 1.1977100372314453s
Received healthy response to inference request in 1.2243010997772217s
30 requests
0 failed requests
5th percentile: 0.9631425380706787
10th percentile: 0.987481665611267
20th percentile: 1.0608523845672608
30th percentile: 1.1359636545181275
40th percentile: 1.1953727722167968
50th percentile: 1.2380677461624146
60th percentile: 1.3026951789855956
70th percentile: 1.370666980743408
80th percentile: 1.430397129058838
90th percentile: 1.6631617307662965
95th percentile: 1.6656208276748656
99th percentile: 1.7736362290382386
mean time: 1.2766212622324626
Pipeline stage StressChecker completed in 50.86s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.66s
Shutdown handler de-registered
chaiml-2a6f-69d4-linear_82448_v4 status is now deployed due to DeploymentManager action
chaiml-2a6f-69d4-linear_82448_v4 status is now inactive due to auto deactivation removed underperforming models
chaiml-2a6f-69d4-linear_82448_v4 status is now torndown due to DeploymentManager action