Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-muster-v0e-lr1e5-71769-v3-uploader
Waiting for job on chaiml-muster-v0e-lr1e5-71769-v3-uploader to finish
chaiml-muster-v0e-lr1e5-71769-v3-uploader: Using quantization_mode: w4a16
chaiml-muster-v0e-lr1e5-71769-v3-uploader: Checking if ChaiML/muster-v0e-lr1e5ep2r64g4b01-W4A16 already exists in ChaiML
chaiml-muster-v0e-lr1e5-71769-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-muster-v0e-lr1e5-71769-v3-uploader: Downloading snapshot of ChaiML/muster-v0e-lr1e5ep2r64g4b01-W4A16...
chaiml-muster-v0e-lr1e5-71769-v3-uploader:
Fetching 39 files: 0%| | 0/39 [00:00<?, ?it/s]
Fetching 39 files: 3%|▎ | 1/39 [00:00<00:08, 4.72it/s]
Fetching 39 files: 18%|█▊ | 7/39 [00:16<01:18, 2.46s/it]
Fetching 39 files: 21%|██ | 8/39 [00:17<01:06, 2.16s/it]
Fetching 39 files: 28%|██▊ | 11/39 [00:17<00:35, 1.28s/it]
Fetching 39 files: 38%|███▊ | 15/39 [00:25<00:37, 1.55s/it]
Fetching 39 files: 41%|████ | 16/39 [00:27<00:39, 1.72s/it]
Fetching 39 files: 44%|████▎ | 17/39 [00:33<00:53, 2.42s/it]
Fetching 39 files: 59%|█████▉ | 23/39 [00:40<00:26, 1.66s/it]
Fetching 39 files: 62%|██████▏ | 24/39 [00:42<00:25, 1.69s/it]
Fetching 39 files: 64%|██████▍ | 25/39 [00:45<00:26, 1.88s/it]
Fetching 39 files: 67%|██████▋ | 26/39 [00:47<00:23, 1.82s/it]
Fetching 39 files: 69%|██████▉ | 27/39 [00:47<00:18, 1.57s/it]
Fetching 39 files: 72%|███████▏ | 28/39 [00:49<00:17, 1.59s/it]
Fetching 39 files: 74%|███████▍ | 29/39 [00:49<00:12, 1.26s/it]
Fetching 39 files: 77%|███████▋ | 30/39 [00:49<00:09, 1.04s/it]
Fetching 39 files: 79%|███████▉ | 31/39 [00:51<00:08, 1.12s/it]
Fetching 39 files: 82%|████████▏ | 32/39 [00:51<00:07, 1.01s/it]
Fetching 39 files: 100%|██████████| 39/39 [00:51<00:00, 1.33s/it]
chaiml-muster-v0e-lr1e5-71769-v3-uploader: Downloaded in 52.000s
chaiml-muster-v0e-lr1e5-71769-v3-uploader: Processed model ChaiML/muster-v0e-lr1e5ep2r64g4b01 in 52.668s
chaiml-muster-v0e-lr1e5-71769-v3-uploader: creating bucket guanaco-vllm-models
chaiml-muster-v0e-lr1e5-71769-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0e-lr1e5-71769-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-muster-v0e-lr1e5-71769-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-muster-v0e-lr1e5-71769-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-muster-v0e-lr1e5-71769-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0e-lr1e5-71769-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-muster-v0e-lr1e5-71769-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0e-lr1e5-71769-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-muster-v0e-lr1e5-71769-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0e-lr1e5-71769-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-muster-v0e-lr1e5-71769-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0e-lr1e5-71769-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-muster-v0e-lr1e5-71769-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-muster-v0e-lr1e5-71769-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-muster-v0e-lr1e5-71769-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-muster-v0e-lr1e5-71769-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-muster-v0e-lr1e5-71769-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-muster-v0e-lr1e5-71769-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/added_tokens.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/.gitattributes
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/generation_config.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/tokenizer_config.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/config.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/quantization_config.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/chat_template.jinja
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/special_tokens_map.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/merges.txt
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/vocab.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model.safetensors.index.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/tokenizer.json
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00027-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00006-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00019-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00010-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00004-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00021-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00025-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00005-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00013-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00026-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00008-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00018-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00024-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00007-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00022-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00014-of-00027.safetensors
chaiml-muster-v0e-lr1e5-71769-v3-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0e-lr1e5-71769-v3/model-00015-of-00027.safetensors
Job chaiml-muster-v0e-lr1e5-71769-v3-uploader completed after 690.48s with status: succeeded
Stopping job with name chaiml-muster-v0e-lr1e5-71769-v3-uploader
Pipeline stage VLLMUploader completed in 690.86s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-muster-v0e-lr1e5-71769-v3
Waiting for inference service chaiml-muster-v0e-lr1e5-71769-v3 to be ready
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-muster-v0e-lr1e5-71769-v3 ready after 672.8282823562622s
Pipeline stage VLLMDeployer completed in 673.17s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.1852753162384033s
Received healthy response to inference request in 2.0621819496154785s
Received healthy response to inference request in 1.953869104385376s
Failed to get response for submission chaiml-csfs-v3-3-dpo-lr_86358_v2: ('http://guanaco-model-mesh-load-balancer.model-mesh.k2.chaiverse.com/models/chaiml-csfs-v3-3-dpo-lr_86358_v2/predict', '{"detail":"503, message=\'Attempt to decode JSON with unexpected mimetype: text/plain\', url=\'http://10.1.124.176:8080/models/chaiml-csfs-v3-3-dpo-lr_86358_v2/predict\'"}')
Received healthy response to inference request in 1.983518123626709s
Received healthy response to inference request in 2.3776838779449463s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 1.9094805717468262s
Received healthy response to inference request in 2.2486226558685303s
Received healthy response to inference request in 2.482966899871826s
Received healthy response to inference request in 2.0116114616394043s
Received healthy response to inference request in 2.092593193054199s
Received healthy response to inference request in 1.97731351852417s
Received healthy response to inference request in 2.2279114723205566s
Received healthy response to inference request in 1.8969495296478271s
Received healthy response to inference request in 2.0974249839782715s
Received healthy response to inference request in 2.0060269832611084s
Received healthy response to inference request in 2.148690700531006s
Received healthy response to inference request in 2.4950828552246094s
Received healthy response to inference request in 1.9056000709533691s
Received healthy response to inference request in 1.887833833694458s
Received healthy response to inference request in 2.11030912399292s
Received healthy response to inference request in 1.9227879047393799s
Received healthy response to inference request in 1.8933145999908447s
Received healthy response to inference request in 1.8956167697906494s
Received healthy response to inference request in 2.0516610145568848s
Received healthy response to inference request in 2.5824077129364014s
Received healthy response to inference request in 2.263392925262451s
Received healthy response to inference request in 2.325875759124756s
Received healthy response to inference request in 1.9624559879302979s
Received healthy response to inference request in 1.9066660404205322s
Received healthy response to inference request in 2.110189437866211s
30 requests
0 failed requests
5th percentile: 1.8943505764007569
10th percentile: 1.8968162536621094
20th percentile: 1.9089176654815674
30th percentile: 1.9598799228668213
40th percentile: 1.9970234394073487
50th percentile: 2.0569214820861816
60th percentile: 2.102530765533447
70th percentile: 2.159666085243225
80th percentile: 2.2515767097473143
90th percentile: 2.3882121801376344
95th percentile: 2.489630675315857
99th percentile: 2.5570835041999818
mean time: 2.0991771459579467
Pipeline stage StressChecker completed in 66.25s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.71s
Shutdown handler de-registered
chaiml-muster-v0e-lr1e5_71769_v3 status is now deployed due to DeploymentManager action
chaiml-muster-v0e-lr1e5_71769_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-muster-v0e-lr1e5_71769_v3 status is now torndown due to DeploymentManager action