Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-muster-v0b-lr5e6-21126-v3-uploader
Waiting for job on chaiml-muster-v0b-lr5e6-21126-v3-uploader to finish
HTTP Request: %s %s "%s %d %s"
chaiml-muster-v0b-lr5e6-21126-v3-uploader: Using quantization_mode: w4a16
chaiml-muster-v0b-lr5e6-21126-v3-uploader: Checking if ChaiML/muster-v0b-lr5e6ep2r64g4b01-W4A16 already exists in ChaiML
chaiml-muster-v0b-lr5e6-21126-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-muster-v0b-lr5e6-21126-v3-uploader: Downloading snapshot of ChaiML/muster-v0b-lr5e6ep2r64g4b01-W4A16...
chaiml-muster-v0b-lr5e6-21126-v3-uploader:
Fetching 39 files: 0%| | 0/39 [00:00<?, ?it/s]
Fetching 39 files: 3%|▎ | 1/39 [00:00<00:12, 3.09it/s]
Fetching 39 files: 18%|█▊ | 7/39 [00:15<01:15, 2.37s/it]
Fetching 39 files: 21%|██ | 8/39 [00:21<01:31, 2.96s/it]
Fetching 39 files: 38%|███▊ | 15/39 [00:27<00:37, 1.58s/it]
Fetching 39 files: 41%|████ | 16/39 [00:29<00:37, 1.64s/it]
Fetching 39 files: 46%|████▌ | 18/39 [00:34<00:40, 1.90s/it]
Fetching 39 files: 56%|█████▋ | 22/39 [00:35<00:20, 1.21s/it]
Fetching 39 files: 59%|█████▉ | 23/39 [00:41<00:28, 1.77s/it]
Fetching 39 files: 62%|██████▏ | 24/39 [00:47<00:36, 2.40s/it]
Fetching 39 files: 69%|██████▉ | 27/39 [00:48<00:19, 1.60s/it]
Fetching 39 files: 74%|███████▍ | 29/39 [00:50<00:14, 1.44s/it]
Fetching 39 files: 77%|███████▋ | 30/39 [00:52<00:13, 1.51s/it]
Fetching 39 files: 79%|███████▉ | 31/39 [00:52<00:10, 1.31s/it]
Fetching 39 files: 82%|████████▏ | 32/39 [00:52<00:07, 1.10s/it]
Fetching 39 files: 100%|██████████| 39/39 [00:52<00:00, 1.36s/it]
chaiml-muster-v0b-lr5e6-21126-v3-uploader: Downloaded in 53.035s
chaiml-muster-v0b-lr5e6-21126-v3-uploader: Processed model ChaiML/muster-v0b-lr5e6ep2r64g4b01 in 53.568s
chaiml-muster-v0b-lr5e6-21126-v3-uploader: creating bucket guanaco-vllm-models
chaiml-muster-v0b-lr5e6-21126-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0b-lr5e6-21126-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-muster-v0b-lr5e6-21126-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-muster-v0b-lr5e6-21126-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-muster-v0b-lr5e6-21126-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0b-lr5e6-21126-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-muster-v0b-lr5e6-21126-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0b-lr5e6-21126-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-muster-v0b-lr5e6-21126-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0b-lr5e6-21126-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-muster-v0b-lr5e6-21126-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0b-lr5e6-21126-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-muster-v0b-lr5e6-21126-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-muster-v0b-lr5e6-21126-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-muster-v0b-lr5e6-21126-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-muster-v0b-lr5e6-21126-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
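Note on the SyntaxWarning lines above: they are emitted while Python compiles the S3 helper modules under /usr/lib/python3/dist-packages/S3/ and did not stop the upload, which goes on to create the bucket and copy the files. The usual remedy for "invalid escape sequence" is to write regex patterns as raw strings. The sketch below applies that to two of the patterns quoted in the warnings; the function names are hypothetical and are not part of the S3 package.

```python
import re

# Raw-string forms of two patterns quoted in the warnings above.
# With the r"" prefix, the backslash reaches the regex engine unchanged
# and Python no longer has to guess at an unknown string escape.
DOUBLE_DOT = re.compile(r"\.\.", re.UNICODE)
INVALID_CHAR = re.compile(r"([^a-z0-9\.-])", re.UNICODE)


def has_double_dot(bucket: str) -> bool:
    """Return True if the bucket name contains two consecutive dots."""
    return DOUBLE_DOT.search(bucket) is not None


def first_invalid_char(bucket: str):
    """Return the first character outside a-z, 0-9, '.', '-', or None."""
    match = INVALID_CHAR.search(bucket)
    return match.group(1) if match else None


if __name__ == "__main__":
    print(has_double_dot("guanaco-vllm-models"))
    print(first_invalid_char("guanaco-vllm-models"))
```

The raw-string patterns match the same text as the originals; only the way the source literal is written changes.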
chaiml-muster-v0b-lr5e6-21126-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-muster-v0b-lr5e6-21126-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/.gitattributes
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/added_tokens.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/generation_config.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/tokenizer_config.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/quantization_config.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/config.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/merges.txt
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/chat_template.jinja
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/special_tokens_map.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/vocab.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/tokenizer.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model.safetensors.index.json
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00027-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00011-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00012-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00017-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00026-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00004-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00010-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00006-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00001-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00002-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00005-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00007-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00020-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00022-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00014-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00008-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00016-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00025-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00018-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00003-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00024-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00015-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00019-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00013-of-00027.safetensors
chaiml-muster-v0b-lr5e6-21126-v3-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0b-lr5e6-21126-v3/model-00023-of-00027.safetensors
Job chaiml-muster-v0b-lr5e6-21126-v3-uploader completed after 510.79s with status: succeeded
Stopping job with name chaiml-muster-v0b-lr5e6-21126-v3-uploader
Pipeline stage VLLMUploader completed in 511.16s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-muster-v0b-lr5e6-21126-v3
Waiting for inference service chaiml-muster-v0b-lr5e6-21126-v3 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-muster-v0b-lr5e6-21126-v3 ready after 630.6182100772858s
Pipeline stage VLLMDeployer completed in 630.97s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.180022716522217s
Received healthy response to inference request in 1.8862738609313965s
Received healthy response to inference request in 2.0096380710601807s
Received healthy response to inference request in 1.8895885944366455s
Received healthy response to inference request in 1.9450383186340332s
Received healthy response to inference request in 2.016106367111206s
Received healthy response to inference request in 2.289552688598633s
Received healthy response to inference request in 2.263131618499756s
Received healthy response to inference request in 2.0056076049804688s
Received healthy response to inference request in 2.067863702774048s
Received healthy response to inference request in 1.9829490184783936s
Received healthy response to inference request in 1.9918980598449707s
Received healthy response to inference request in 1.8960201740264893s
Received healthy response to inference request in 2.023494243621826s
Received healthy response to inference request in 1.8874461650848389s
Received healthy response to inference request in 2.326613187789917s
Received healthy response to inference request in 1.8877949714660645s
Received healthy response to inference request in 2.052690029144287s
Received healthy response to inference request in 1.9201626777648926s
Received healthy response to inference request in 2.1834867000579834s
Received healthy response to inference request in 1.925039529800415s
Received healthy response to inference request in 2.0971717834472656s
Received healthy response to inference request in 2.26011061668396s
Received healthy response to inference request in 1.9594008922576904s
Received healthy response to inference request in 1.99161958694458s
Received healthy response to inference request in 2.0685689449310303s
Received healthy response to inference request in 2.2210233211517334s
Received healthy response to inference request in 2.078432559967041s
Received healthy response to inference request in 1.9775645732879639s
Received healthy response to inference request in 1.920426368713379s
30 requests
0 failed requests
5th percentile: 1.8876031279563903
10th percentile: 1.8894092321395874
20th percentile: 1.9203736305236816
30th percentile: 1.9550921201705933
40th percentile: 1.9881513595581055
50th percentile: 2.0076228380203247
60th percentile: 2.0351725578308106
70th percentile: 2.0715280294418337
80th percentile: 2.1807155132293703
90th percentile: 2.2604127168655395
95th percentile: 2.2776632070541383
99th percentile: 2.3158656430244444
mean time: 2.0401578982671102
Pipeline stage StressChecker completed in 63.83s
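Note on the StressChecker summary above: the reported percentiles are consistent with linear interpolation between the sorted response times, and the mean with a plain arithmetic average. The sketch below recomputes them from the 30 logged latencies. It is an assumption about how the numbers could be derived, not the StressChecker's actual code, and the helper name percentile is hypothetical.

```python
import math

# The 30 response times (seconds) logged by the StressChecker stage, in log order.
latencies = [
    2.180022716522217, 1.8862738609313965, 2.0096380710601807,
    1.8895885944366455, 1.9450383186340332, 2.016106367111206,
    2.289552688598633, 2.263131618499756, 2.0056076049804688,
    2.067863702774048, 1.9829490184783936, 1.9918980598449707,
    1.8960201740264893, 2.023494243621826, 1.8874461650848389,
    2.326613187789917, 1.8877949714660645, 2.052690029144287,
    1.9201626777648926, 2.1834867000579834, 1.925039529800415,
    2.0971717834472656, 2.26011061668396, 1.9594008922576904,
    1.99161958694458, 2.0685689449310303, 2.2210233211517334,
    2.078432559967041, 1.9775645732879639, 1.920426368713379,
]


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

if __name__ == "__main__":
    print(f"{len(latencies)} requests")
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean_time}")
```

With 30 samples the 99th percentile sits between the two slowest requests, so it mostly reflects the single slowest response (about 2.33s) rather than a stable tail estimate.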
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.61s
Shutdown handler de-registered
chaiml-muster-v0b-lr5e6_21126_v3 status is now deployed due to DeploymentManager action
chaiml-muster-v0b-lr5e6_21126_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-muster-v0b-lr5e6_21126_v3 status is now torndown due to DeploymentManager action