Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-muster-v0d-lr1e5-93728-v6-uploader
Waiting for job on chaiml-muster-v0d-lr1e5-93728-v6-uploader to finish
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
chaiml-muster-v0d-lr1e5-93728-v6-uploader: Using quantization_mode: w4a16
chaiml-muster-v0d-lr1e5-93728-v6-uploader: Checking if ChaiML/muster-v0d-lr1e5ep2r64g4b01-W4A16 already exists in ChaiML
chaiml-muster-v0d-lr1e5-93728-v6-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-muster-v0d-lr1e5-93728-v6-uploader: Downloading snapshot of ChaiML/muster-v0d-lr1e5ep2r64g4b01-W4A16...
chaiml-muster-v0d-lr1e5-93728-v6-uploader:
Fetching 39 files: 100%|██████████| 39/39 [00:44<00:00, 1.14s/it]
chaiml-muster-v0d-lr1e5-93728-v6-uploader: Downloaded in 44.483s
chaiml-muster-v0d-lr1e5-93728-v6-uploader: Processed model ChaiML/muster-v0d-lr1e5ep2r64g4b01 in 45.025s
chaiml-muster-v0d-lr1e5-93728-v6-uploader: creating bucket guanaco-vllm-models
chaiml-muster-v0d-lr1e5-93728-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v6-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-muster-v0d-lr1e5-93728-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-muster-v0d-lr1e5-93728-v6-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-muster-v0d-lr1e5-93728-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v6-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v6-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v6-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-muster-v0d-lr1e5-93728-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v6-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-muster-v0d-lr1e5-93728-v6-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-muster-v0d-lr1e5-93728-v6-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v6-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-muster-v0d-lr1e5-93728-v6-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-muster-v0d-lr1e5-93728-v6-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-muster-v0d-lr1e5-93728-v6-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/.gitattributes
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/chat_template.jinja
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/generation_config.json
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/added_tokens.json
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/special_tokens_map.json
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/quantization_config.json
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/config.json
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/tokenizer_config.json
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model.safetensors.index.json
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/vocab.json
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/tokenizer.json
Retrying (%r) after connection broken by '%r': %s
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00027-of-00027.safetensors
HTTP Request: %s %s "%s %d %s"
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00006-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00023-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00013-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00019-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00008-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00003-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00015-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00009-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00020-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00016-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00025-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00021-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00002-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00024-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00026-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00004-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00001-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00022-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00005-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00012-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00010-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00017-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00014-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00011-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00007-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v6-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v6/model-00018-of-00027.safetensors
Job chaiml-muster-v0d-lr1e5-93728-v6-uploader completed after 950.88s with status: succeeded
Stopping job with name chaiml-muster-v0d-lr1e5-93728-v6-uploader
Pipeline stage VLLMUploader completed in 951.35s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-muster-v0d-lr1e5-93728-v6
Waiting for inference service chaiml-muster-v0d-lr1e5-93728-v6 to be ready
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-muster-v0d-lr1e5-93728-v6 ready after 1030.6475791931152s
Pipeline stage VLLMDeployer completed in 1031.04s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.281466007232666s
Received healthy response to inference request in 1.974853277206421s
Received healthy response to inference request in 2.116328716278076s
Received healthy response to inference request in 2.0602593421936035s
Received healthy response to inference request in 2.230879783630371s
Received healthy response to inference request in 1.9994394779205322s
Received healthy response to inference request in 1.9238426685333252s
Received healthy response to inference request in 2.1131362915039062s
Received healthy response to inference request in 1.8844518661499023s
Received healthy response to inference request in 2.000431776046753s
Received healthy response to inference request in 1.999312162399292s
Received healthy response to inference request in 2.049391269683838s
Received healthy response to inference request in 2.15158748626709s
Received healthy response to inference request in 1.985304832458496s
Received healthy response to inference request in 2.5766305923461914s
Received healthy response to inference request in 2.0968639850616455s
Received healthy response to inference request in 1.9225409030914307s
Received healthy response to inference request in 2.2826600074768066s
Received healthy response to inference request in 1.9627830982208252s
Received healthy response to inference request in 1.905799388885498s
Received healthy response to inference request in 2.058389663696289s
Received healthy response to inference request in 1.9573733806610107s
Received healthy response to inference request in 1.9567816257476807s
Received healthy response to inference request in 2.1219398975372314s
Received healthy response to inference request in 2.33064603805542s
Received healthy response to inference request in 2.1086158752441406s
Received healthy response to inference request in 1.894547700881958s
Received healthy response to inference request in 1.9782721996307373s
Received healthy response to inference request in 2.1227633953094482s
Received healthy response to inference request in 2.068185329437256s
30 requests
0 failed requests
5th percentile: 1.899610960483551
10th percentile: 1.9208667516708373
20th percentile: 1.9572550296783446
30th percentile: 1.9772465229034424
40th percentile: 1.9993885517120362
50th percentile: 2.0538904666900635
60th percentile: 2.079656791687012
70th percentile: 2.1140940189361572
80th percentile: 2.1285282135009767
90th percentile: 2.2815854072570803
95th percentile: 2.3090523242950436
99th percentile: 2.505295071601868
mean time: 2.0705159346262616
Pipeline stage StressChecker completed in 65.54s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.66s
Shutdown handler de-registered
chaiml-muster-v0d-lr1e5_93728_v6 status is now deployed due to DeploymentManager action
chaiml-muster-v0d-lr1e5_93728_v6 status is now inactive due to auto deactivation removed underperforming models