Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-muster-v0-q235b-52842-v6-uploader
Waiting for job on chaiml-muster-v0-q235b-52842-v6-uploader to finish
chaiml-muster-v0-q235b-52842-v6-uploader: Using quantization_mode: w4a16
chaiml-muster-v0-q235b-52842-v6-uploader: Checking if ChaiML/muster-v0-q235b-lr1e4ep2r64g4-W4A16 already exists in ChaiML
chaiml-muster-v0-q235b-52842-v6-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-muster-v0-q235b-52842-v6-uploader: Downloading snapshot of ChaiML/muster-v0-q235b-lr1e4ep2r64g4-W4A16...
chaiml-muster-v0-q235b-52842-v6-uploader:
Fetching 39 files: 100%|██████████| 39/39 [00:49<00:00, 1.27s/it]
chaiml-muster-v0-q235b-52842-v6-uploader: Downloaded in 49.710s
chaiml-muster-v0-q235b-52842-v6-uploader: Processed model ChaiML/muster-v0-q235b-lr1e4ep2r64g4 in 50.431s
chaiml-muster-v0-q235b-52842-v6-uploader: creating bucket guanaco-vllm-models
chaiml-muster-v0-q235b-52842-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0-q235b-52842-v6-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-muster-v0-q235b-52842-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-muster-v0-q235b-52842-v6-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-muster-v0-q235b-52842-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0-q235b-52842-v6-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-muster-v0-q235b-52842-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0-q235b-52842-v6-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-muster-v0-q235b-52842-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0-q235b-52842-v6-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-muster-v0-q235b-52842-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0-q235b-52842-v6-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-muster-v0-q235b-52842-v6-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-muster-v0-q235b-52842-v6-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-muster-v0-q235b-52842-v6-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-muster-v0-q235b-52842-v6-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
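Note on the SyntaxWarning lines above: they are harmless to the upload, and arise because the regex patterns are written as ordinary string literals containing backslash sequences such as '\.' and '\w' that Python does not recognise as escapes. The sketch below reproduces the warning on a tiny snippet and shows raw-string spellings of three of the quoted patterns. It is only an illustration, not a patch to the S3 package; the helper name split_on_wildcard and the sample URIs are hypothetical.

```python
import re
import warnings

# Compile a tiny source snippet containing the non-raw literal '\.'
# and record whatever warning the compiler emits for it.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    compile("pattern = '\\.'", "<example>", "exec")

# The category depends on the Python version (DeprecationWarning in older
# releases, SyntaxWarning in newer ones), so it is printed, not assumed.
warning_categories = [w.category.__name__ for w in caught]
print(warning_categories)

# Raw-string spellings of patterns quoted in the warnings. The regex
# engine sees the same pattern text, but the compiler no longer warns.
RE_S3_DATESTRING = re.compile(r'\.[0-9]*(?:[Z\-\+]*?)')
RE_URI = re.compile(r"^(\w+://)?(.*)", re.UNICODE)


def split_on_wildcard(uri_str):
    """Hypothetical helper: split a URI at its first '*' or '?'."""
    return re.split(r"\*|\?", uri_str, maxsplit=1)


print(RE_URI.match("s3://guanaco-vllm-models/").group(1))
print(split_on_wildcard("s3://bucket/a*b?c"))
```

A raw string keeps every backslash as written, so the pattern handed to the re module is unchanged and only the warning goes away.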
chaiml-muster-v0-q235b-52842-v6-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-muster-v0-q235b-52842-v6-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/.gitattributes
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/tokenizer_config.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/generation_config.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/added_tokens.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/chat_template.jinja
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/config.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/quantization_config.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/special_tokens_map.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/merges.txt
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model.safetensors.index.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/vocab.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/tokenizer.json
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00027-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00017-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00007-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00026-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00005-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00002-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00016-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00003-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00015-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00010-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00009-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00013-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00020-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00021-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00008-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00023-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00011-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00018-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00025-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00014-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00022-of-00027.safetensors
chaiml-muster-v0-q235b-52842-v6-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0-q235b-52842-v6/model-00006-of-00027.safetensors
Job chaiml-muster-v0-q235b-52842-v6-uploader completed after 629.83s with status: succeeded
Stopping job with name chaiml-muster-v0-q235b-52842-v6-uploader
Pipeline stage VLLMUploader completed in 631.83s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.23s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-muster-v0-q235b-52842-v6
Waiting for inference service chaiml-muster-v0-q235b-52842-v6 to be ready
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-muster-v0-q235b-52842-v6 ready after 681.1703681945801s
Pipeline stage VLLMDeployer completed in 684.42s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.040964365005493s
Received healthy response to inference request in 2.138357639312744s
Received healthy response to inference request in 2.1285195350646973s
Received healthy response to inference request in 2.4840595722198486s
Received healthy response to inference request in 3.35284686088562s
Received healthy response to inference request in 2.350490093231201s
Received healthy response to inference request in 2.1899333000183105s
Received healthy response to inference request in 2.2009236812591553s
Received healthy response to inference request in 2.225278854370117s
Received healthy response to inference request in 2.0802488327026367s
Received healthy response to inference request in 2.1349170207977295s
Received healthy response to inference request in 2.2719240188598633s
Received healthy response to inference request in 2.213517904281616s
Received healthy response to inference request in 2.682164430618286s
Received healthy response to inference request in 2.3154847621917725s
Received healthy response to inference request in 2.293148994445801s
Received healthy response to inference request in 2.5054774284362793s
Received healthy response to inference request in 2.1872472763061523s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.4327330589294434s
Received healthy response to inference request in 2.281363010406494s
Received healthy response to inference request in 2.2755777835845947s
Received healthy response to inference request in 2.0567359924316406s
Received healthy response to inference request in 1.9863755702972412s
Received healthy response to inference request in 2.0307648181915283s
Received healthy response to inference request in 2.039573907852173s
Received healthy response to inference request in 2.46384334564209s
Received healthy response to inference request in 1.8822503089904785s
Received healthy response to inference request in 2.1974494457244873s
Received healthy response to inference request in 2.0089073181152344s
Received healthy response to inference request in 1.9537112712860107s
30 requests
0 failed requests
5th percentile: 1.9684102058410644
10th percentile: 2.006654143333435
20th percentile: 2.040686273574829
30th percentile: 2.114038324356079
40th percentile: 2.1676914215087892
50th percentile: 2.1991865634918213
60th percentile: 2.2439369201660155
70th percentile: 2.284898805618286
80th percentile: 2.3669386863708497
90th percentile: 2.486201357841492
95th percentile: 2.6026552796363824
99th percentile: 3.1583489561080937
mean time: 2.246826346715291
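Note on the summary above: the percentile figures are consistent with linear interpolation between sorted samples, which is also the default behaviour of numpy.percentile. The StressChecker implementation itself is not shown in this log, so the sketch below is an assumption about the method, not its actual code. It recomputes the statistics from the 30 latencies logged above using only the standard library.

```python
import math

# Latencies in seconds, copied from the "Received healthy response" lines.
latencies = [
    2.040964365005493, 2.138357639312744, 2.1285195350646973,
    2.4840595722198486, 3.35284686088562, 2.350490093231201,
    2.1899333000183105, 2.2009236812591553, 2.225278854370117,
    2.0802488327026367, 2.1349170207977295, 2.2719240188598633,
    2.213517904281616, 2.682164430618286, 2.3154847621917725,
    2.293148994445801, 2.5054774284362793, 2.1872472763061523,
    2.4327330589294434, 2.281363010406494, 2.2755777835845947,
    2.0567359924316406, 1.9863755702972412, 2.0307648181915283,
    2.039573907852173, 2.46384334564209, 1.8822503089904785,
    2.1974494457244873, 2.0089073181152344, 1.9537112712860107,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) by linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With 30 samples, the 99th percentile sits between the two slowest requests, so the single 3.35s outlier dominates that figure while barely moving the median.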
Pipeline stage StressChecker completed in 78.85s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.67s
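Note on stage timings: the "Pipeline stage ... completed in ...s" lines can be collected to see where the run spent its time. The sketch below is a hypothetical helper, not part of the pipeline. The regular expression and the function name stage_durations are assumptions based only on the line format visible in this log.

```python
import re

# Matches lines such as "Pipeline stage VLLMUploader completed in 631.83s".
STAGE_DONE = re.compile(r"Pipeline stage (\w+) completed in ([0-9.]+)s")

# Sample lines copied from this log, plus one non-matching line.
log_lines = [
    "Pipeline stage VLLMUploader completed in 631.83s",
    "Running pipeline stage VLLMTemplater",
    "Pipeline stage VLLMTemplater completed in 2.23s",
    "Pipeline stage VLLMDeployer completed in 684.42s",
    "Pipeline stage StressChecker completed in 78.85s",
    "Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.67s",
]


def stage_durations(lines):
    """Map each completed stage name to its duration in seconds."""
    durations = {}
    for line in lines:
        match = STAGE_DONE.search(line)
        if match:
            durations[match.group(1)] = float(match.group(2))
    return durations


durations = stage_durations(log_lines)
total = sum(durations.values())
slowest = max(durations, key=durations.get)

print(f"total: {total:.2f}s, slowest stage: {slowest}")
```

For this run, the upload and deployment stages account for nearly all of the elapsed time, with the remaining stages taking under 82 seconds combined.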
Shutdown handler de-registered
chaiml-muster-v0-q235b-_52842_v6 status is now deployed due to DeploymentManager action
chaiml-muster-v0-q235b-_52842_v6 status is now inactive due to auto deactivation removed underperforming models
chaiml-muster-v0-q235b-_52842_v6 status is now torndown due to DeploymentManager action