Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-muster-v0d-lr1e5-93728-v2-uploader
Waiting for job on chaiml-muster-v0d-lr1e5-93728-v2-uploader to finish
chaiml-muster-v0d-lr1e5-93728-v2-uploader: Using quantization_mode: w4a16
chaiml-muster-v0d-lr1e5-93728-v2-uploader: Checking if ChaiML/muster-v0d-lr1e5ep2r64g4b01-W4A16 already exists in ChaiML
chaiml-muster-v0d-lr1e5-93728-v2-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-muster-v0d-lr1e5-93728-v2-uploader: Downloading snapshot of ChaiML/muster-v0d-lr1e5ep2r64g4b01-W4A16...
Fetching 39 files: 100%|██████████| 39/39 [00:45<00:00, 1.16s/it]
chaiml-muster-v0d-lr1e5-93728-v2-uploader: Downloaded in 45.342s
chaiml-muster-v0d-lr1e5-93728-v2-uploader: Processed model ChaiML/muster-v0d-lr1e5ep2r64g4b01 in 45.978s
chaiml-muster-v0d-lr1e5-93728-v2-uploader: creating bucket guanaco-vllm-models
chaiml-muster-v0d-lr1e5-93728-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-muster-v0d-lr1e5-93728-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-muster-v0d-lr1e5-93728-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-muster-v0d-lr1e5-93728-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-muster-v0d-lr1e5-93728-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-muster-v0d-lr1e5-93728-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-muster-v0d-lr1e5-93728-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-muster-v0d-lr1e5-93728-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-muster-v0d-lr1e5-93728-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
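Note on the SyntaxWarning lines above: they are raised while importing the installed S3 package (the paths under /usr/lib/python3/dist-packages/S3/), not by the pipeline itself, and the bucket was still created. Python 3.12 and later emit SyntaxWarning when an ordinary string literal contains a backslash sequence that is not a recognized escape, such as '\.' or '\w'. The usual remedy is a raw string. The sketch below reuses two of the patterns quoted in the warnings; the variable names and sample inputs are illustrative and are not taken from the S3 package.

```python
import re

# Raw-string versions of two patterns quoted in the warnings above.
# The r"" prefix passes the backslash through to the regex engine unchanged,
# so the literal no longer contains an unrecognized string escape.
uri_re = re.compile(r"^(\w+://)?(.*)", re.UNICODE)
double_dot_re = re.compile(r"\.\.", re.UNICODE)

# Illustrative inputs (the bucket name is the one that appears in the log).
match = uri_re.match("s3://guanaco-vllm-models/")
scheme, rest = match.group(1), match.group(2)
has_double_dot = double_dot_re.search("guanaco-vllm-models") is not None
```

The compiled patterns behave the same as the originals; only the warning goes away.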
chaiml-muster-v0d-lr1e5-93728-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/chat_template.jinja
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/generation_config.json
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/.gitattributes
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/tokenizer_config.json
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/added_tokens.json
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/special_tokens_map.json
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/config.json
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/merges.txt
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/quantization_config.json
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/vocab.json
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/tokenizer.json
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model.safetensors.index.json
HTTP Request: %s %s "%s %d %s"
Connection pool is full, discarding connection: %s. Connection pool size: %s
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00027-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00006-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00007-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00014-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00025-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00004-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00012-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00005-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00001-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00015-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00017-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00022-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00003-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00010-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00023-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00008-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00009-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00002-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00019-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00026-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00020-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00018-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00021-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00016-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00013-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v2-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v2/model-00024-of-00027.safetensors
Job chaiml-muster-v0d-lr1e5-93728-v2-uploader completed after 788.26s with status: succeeded
Stopping job with name chaiml-muster-v0d-lr1e5-93728-v2-uploader
Pipeline stage VLLMUploader completed in 788.69s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-muster-v0d-lr1e5-93728-v2
Waiting for inference service chaiml-muster-v0d-lr1e5-93728-v2 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-muster-v0d-lr1e5-93728-v2 ready after 862.5399181842804s
Pipeline stage VLLMDeployer completed in 863.09s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9338586330413818s
Received healthy response to inference request in 1.7884235382080078s
Received healthy response to inference request in 1.774756908416748s
Received healthy response to inference request in 1.7757701873779297s
Received healthy response to inference request in 2.0494725704193115s
Received healthy response to inference request in 1.8372349739074707s
Received healthy response to inference request in 1.987264633178711s
Received healthy response to inference request in 2.010521173477173s
Received healthy response to inference request in 1.843583106994629s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.782477378845215s
Received healthy response to inference request in 1.7150063514709473s
Received healthy response to inference request in 1.9158666133880615s
Received healthy response to inference request in 2.037799119949341s
Received healthy response to inference request in 1.9492902755737305s
Received healthy response to inference request in 2.249088764190674s
Received healthy response to inference request in 1.8532969951629639s
Received healthy response to inference request in 1.7791917324066162s
Received healthy response to inference request in 1.896125078201294s
Received healthy response to inference request in 2.0851433277130127s
Received healthy response to inference request in 1.8598518371582031s
Received healthy response to inference request in 1.953859567642212s
Received healthy response to inference request in 2.6329398155212402s
Received healthy response to inference request in 2.045351982116699s
Received healthy response to inference request in 1.8080859184265137s
Received healthy response to inference request in 1.7410502433776855s
Received healthy response to inference request in 2.0252528190612793s
Received healthy response to inference request in 2.1554150581359863s
Received healthy response to inference request in 1.8821206092834473s
Received healthy response to inference request in 1.7894043922424316s
Received healthy response to inference request in 2.4305615425109863s
30 requests
0 failed requests
5th percentile: 1.7562182426452637
10th percentile: 1.7756688594818115
20th percentile: 1.789208221435547
30th percentile: 1.8416786670684815
40th percentile: 1.8732131004333497
50th percentile: 1.9248626232147217
60th percentile: 1.9672215938568114
70th percentile: 2.029016709327698
80th percentile: 2.056606721878052
90th percentile: 2.2672360420227053
95th percentile: 2.5418695926666253
99th percentile: 2.739111485481262
mean time: 1.9862688382466633
Pipeline stage StressChecker completed in 63.03s
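Note on the StressChecker summary above: the percentile and mean values are in seconds and can be recomputed from the 30 logged response times. The log does not show how StressChecker computes them, so the sketch below is an assumption: it uses linear interpolation between ranks, which agrees with the logged 5th, 50th, and 95th percentiles and the mean to within floating-point rounding. The function name percentile is hypothetical.

```python
import math

# Latencies in seconds, copied from the "Received healthy response" lines.
latencies = [
    1.9338586330413818, 1.7884235382080078, 1.774756908416748,
    1.7757701873779297, 2.0494725704193115, 1.8372349739074707,
    1.987264633178711, 2.010521173477173, 1.843583106994629,
    2.782477378845215, 1.7150063514709473, 1.9158666133880615,
    2.037799119949341, 1.9492902755737305, 2.249088764190674,
    1.8532969951629639, 1.7791917324066162, 1.896125078201294,
    2.0851433277130127, 1.8598518371582031, 1.953859567642212,
    2.6329398155212402, 2.045351982116699, 1.8080859184265137,
    1.7410502433776855, 2.0252528190612793, 2.1554150581359863,
    1.8821206092834473, 1.7894043922424316, 2.4305615425109863,
]


def percentile(values, q):
    """Return the q-th percentile (0-100), interpolating linearly between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p95 = percentile(latencies, 95)
mean = sum(latencies) / len(latencies)
```

With only 30 samples, the upper percentiles are driven by the two or three slowest requests (2.43s, 2.63s, 2.78s), so they are a rough indication rather than a stable estimate.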
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.67s
Shutdown handler de-registered
chaiml-muster-v0d-lr1e5_93728_v2 status is now deployed due to DeploymentManager action
chaiml-muster-v0d-lr1e5_93728_v2 status is now inactive due to auto deactivation removed underperforming models
chaiml-muster-v0d-lr1e5_93728_v2 status is now torndown due to DeploymentManager action