Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-muster-v0d-lr1e5-93728-v3-uploader
Waiting for job on chaiml-muster-v0d-lr1e5-93728-v3-uploader to finish
chaiml-muster-v0d-lr1e5-93728-v3-uploader: Using quantization_mode: w4a16
chaiml-muster-v0d-lr1e5-93728-v3-uploader: Checking if ChaiML/muster-v0d-lr1e5ep2r64g4b01-W4A16 already exists in ChaiML
chaiml-muster-v0d-lr1e5-93728-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-muster-v0d-lr1e5-93728-v3-uploader: Downloading snapshot of ChaiML/muster-v0d-lr1e5ep2r64g4b01-W4A16...
chaiml-muster-v0d-lr1e5-93728-v3-uploader: Fetching 39 files: 100%|██████████| 39/39 [01:03<00:00, 1.64s/it]
chaiml-muster-v0d-lr1e5-93728-v3-uploader: Downloaded in 64.052s
chaiml-muster-v0d-lr1e5-93728-v3-uploader: Processed model ChaiML/muster-v0d-lr1e5ep2r64g4b01 in 64.581s
chaiml-muster-v0d-lr1e5-93728-v3-uploader: creating bucket guanaco-vllm-models
chaiml-muster-v0d-lr1e5-93728-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-muster-v0d-lr1e5-93728-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-muster-v0d-lr1e5-93728-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-muster-v0d-lr1e5-93728-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-muster-v0d-lr1e5-93728-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0d-lr1e5-93728-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-muster-v0d-lr1e5-93728-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-muster-v0d-lr1e5-93728-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-muster-v0d-lr1e5-93728-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-muster-v0d-lr1e5-93728-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-muster-v0d-lr1e5-93728-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-muster-v0d-lr1e5-93728-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/added_tokens.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/chat_template.jinja
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/tokenizer_config.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/config.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/quantization_config.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/merges.txt
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/vocab.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/special_tokens_map.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/generation_config.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/.gitattributes
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model.safetensors.index.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/tokenizer.json
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00027-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00025-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00007-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00016-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00023-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00005-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00009-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00024-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00013-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00021-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00001-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00004-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00022-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00014-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00002-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00003-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00019-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00011-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00008-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00020-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00015-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00010-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00026-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00012-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00018-of-00027.safetensors
chaiml-muster-v0d-lr1e5-93728-v3-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0d-lr1e5-93728-v3/model-00006-of-00027.safetensors
Job chaiml-muster-v0d-lr1e5-93728-v3-uploader completed after 637.27s with status: succeeded
Stopping job with name chaiml-muster-v0d-lr1e5-93728-v3-uploader
Pipeline stage VLLMUploader completed in 637.72s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-muster-v0d-lr1e5-93728-v3
Waiting for inference service chaiml-muster-v0d-lr1e5-93728-v3 to be ready
Inference service chaiml-muster-v0d-lr1e5-93728-v3 ready after 692.0221788883209s
Pipeline stage VLLMDeployer completed in 692.38s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.084733009338379s
Received healthy response to inference request in 2.1152169704437256s
Received healthy response to inference request in 2.6183090209960938s
Received healthy response to inference request in 2.1744182109832764s
Received healthy response to inference request in 2.3255412578582764s
Received healthy response to inference request in 2.2467076778411865s
Received healthy response to inference request in 2.1954874992370605s
Received healthy response to inference request in 2.996521472930908s
Received healthy response to inference request in 1.9919583797454834s
Received healthy response to inference request in 1.931269884109497s
Received healthy response to inference request in 2.130032539367676s
Received healthy response to inference request in 2.5248160362243652s
Received healthy response to inference request in 2.079612970352173s
Received healthy response to inference request in 2.054462432861328s
Received healthy response to inference request in 2.093092918395996s
Received healthy response to inference request in 2.0711495876312256s
Received healthy response to inference request in 2.283942222595215s
Received healthy response to inference request in 2.1905901432037354s
Received healthy response to inference request in 2.1319351196289062s
Received healthy response to inference request in 2.2715706825256348s
Received healthy response to inference request in 2.008235454559326s
Received healthy response to inference request in 2.071763753890991s
Received healthy response to inference request in 2.235868215560913s
Received healthy response to inference request in 2.0375277996063232s
Received healthy response to inference request in 1.9653558731079102s
Received healthy response to inference request in 1.988879919052124s
Received healthy response to inference request in 2.0509142875671387s
Received healthy response to inference request in 2.2696006298065186s
Received healthy response to inference request in 2.4161410331726074s
Received healthy response to inference request in 2.88078236579895s
30 requests
0 failed requests
5th percentile: 1.9759416937828065
10th percentile: 1.9916505336761474
20th percentile: 2.0482369899749755
30th percentile: 2.0715795040130613
40th percentile: 2.0897489547729493
50th percentile: 2.130983829498291
60th percentile: 2.1925490856170655
70th percentile: 2.253575563430786
80th percentile: 2.2922620296478273
90th percentile: 2.5341653347015383
95th percentile: 2.762669360637664
99th percentile: 2.9629571318626406
mean time: 2.2145479122797647
Pipeline stage StressChecker completed in 70.39s
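Note on the StressChecker summary: the percentile figures logged above are consistent with linear interpolation between sorted samples, which is also the default behaviour of numpy.percentile. The sketch below recomputes them from the 30 logged latencies using only the standard library. The helper name `percentile` and the variable `latencies` are illustrative assumptions, not names taken from the pipeline's code, and the actual StressChecker implementation may differ.

```python
import math

# Latencies in seconds, copied from the 30 "Received healthy response" lines.
latencies = [
    2.084733009338379, 2.1152169704437256, 2.6183090209960938,
    2.1744182109832764, 2.3255412578582764, 2.2467076778411865,
    2.1954874992370605, 2.996521472930908, 1.9919583797454834,
    1.931269884109497, 2.130032539367676, 2.5248160362243652,
    2.079612970352173, 2.054462432861328, 2.093092918395996,
    2.0711495876312256, 2.283942222595215, 2.1905901432037354,
    2.1319351196289062, 2.2715706825256348, 2.008235454559326,
    2.071763753890991, 2.235868215560913, 2.0375277996063232,
    1.9653558731079102, 1.988879919052124, 2.0509142875671387,
    2.2696006298065186, 2.4161410331726074, 2.88078236579895,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    # Fractional position of the percentile within the sorted samples.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    # Interpolate between the two neighbouring samples.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```

The recomputed values should agree with the logged ones up to floating-point rounding.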
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.74s
Shutdown handler de-registered
chaiml-muster-v0d-lr1e5_93728_v3 status is now deployed due to DeploymentManager action
chaiml-muster-v0d-lr1e5_93728_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-muster-v0d-lr1e5_93728_v3 status is now torndown due to DeploymentManager action