Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-muster-v0a-lr1e5-44160-v2-uploader
Waiting for job on chaiml-muster-v0a-lr1e5-44160-v2-uploader to finish
chaiml-muster-v0a-lr1e5-44160-v2-uploader: Using quantization_mode: w4a16
chaiml-muster-v0a-lr1e5-44160-v2-uploader: Checking if ChaiML/muster-v0a-lr1e5ep2r64g4b01-W4A16 already exists in ChaiML
chaiml-muster-v0a-lr1e5-44160-v2-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-muster-v0a-lr1e5-44160-v2-uploader: Downloading snapshot of ChaiML/muster-v0a-lr1e5ep2r64g4b01-W4A16...
Unable to record family friendly update due to error: Invalid JSON input: Expecting value: line 1 column 1 (char 0)
chaiml-muster-v0a-lr1e5-44160-v2-uploader:
Fetching 39 files: 0%| | 0/39 [00:00<?, ?it/s]
Fetching 39 files: 3%|▎ | 1/39 [00:00<00:12, 2.99it/s]
Fetching 39 files: 18%|█▊ | 7/39 [00:16<01:17, 2.43s/it]
Fetching 39 files: 21%|██ | 8/39 [00:17<01:06, 2.15s/it]
Fetching 39 files: 38%|███▊ | 15/39 [00:27<00:42, 1.76s/it]
Fetching 39 files: 44%|████▎ | 17/39 [00:28<00:32, 1.46s/it]
Fetching 39 files: 49%|████▊ | 19/39 [00:30<00:26, 1.35s/it]
Fetching 39 files: 51%|█████▏ | 20/39 [00:31<00:24, 1.27s/it]
Fetching 39 files: 54%|█████▍ | 21/39 [00:31<00:19, 1.11s/it]
Fetching 39 files: 56%|█████▋ | 22/39 [00:32<00:17, 1.02s/it]
Fetching 39 files: 59%|█████▉ | 23/39 [00:41<00:43, 2.73s/it]
Fetching 39 files: 62%|██████▏ | 24/39 [00:42<00:35, 2.38s/it]
Fetching 39 files: 64%|██████▍ | 25/39 [00:43<00:27, 1.99s/it]
Fetching 39 files: 69%|██████▉ | 27/39 [00:44<00:16, 1.41s/it]
Fetching 39 files: 72%|███████▏ | 28/39 [00:45<00:13, 1.25s/it]
Fetching 39 files: 74%|███████▍ | 29/39 [00:46<00:11, 1.14s/it]
Fetching 39 files: 77%|███████▋ | 30/39 [00:46<00:08, 1.04it/s]
Fetching 39 files: 79%|███████▉ | 31/39 [00:48<00:10, 1.28s/it]
Fetching 39 files: 82%|████████▏ | 32/39 [00:49<00:08, 1.19s/it]
Fetching 39 files: 100%|██████████| 39/39 [00:49<00:00, 1.27s/it]
chaiml-muster-v0a-lr1e5-44160-v2-uploader: Downloaded in 49.645s
chaiml-muster-v0a-lr1e5-44160-v2-uploader: Processed model ChaiML/muster-v0a-lr1e5ep2r64g4b01 in 50.194s
chaiml-muster-v0a-lr1e5-44160-v2-uploader: creating bucket guanaco-vllm-models
chaiml-muster-v0a-lr1e5-44160-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0a-lr1e5-44160-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-muster-v0a-lr1e5-44160-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-muster-v0a-lr1e5-44160-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-muster-v0a-lr1e5-44160-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0a-lr1e5-44160-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-muster-v0a-lr1e5-44160-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0a-lr1e5-44160-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-muster-v0a-lr1e5-44160-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0a-lr1e5-44160-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-muster-v0a-lr1e5-44160-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-muster-v0a-lr1e5-44160-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-muster-v0a-lr1e5-44160-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-muster-v0a-lr1e5-44160-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-muster-v0a-lr1e5-44160-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-muster-v0a-lr1e5-44160-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-muster-v0a-lr1e5-44160-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-muster-v0a-lr1e5-44160-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/added_tokens.json
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/.gitattributes
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/generation_config.json
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/chat_template.jinja
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/special_tokens_map.json
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/tokenizer_config.json
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/quantization_config.json
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/config.json
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/merges.txt
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/vocab.json
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/tokenizer.json
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model.safetensors.index.json
HTTP Request: %s %s "%s %d %s"
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00027-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00023-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00003-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00012-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00002-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00007-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00016-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00015-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00022-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00014-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00009-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00024-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00017-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00004-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00001-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00011-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00020-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00021-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00006-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00013-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00010-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00019-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00026-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00018-of-00027.safetensors
chaiml-muster-v0a-lr1e5-44160-v2-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-muster-v0a-lr1e5-44160-v2/model-00008-of-00027.safetensors
Job chaiml-muster-v0a-lr1e5-44160-v2-uploader completed after 622.28s with status: succeeded
Stopping job with name chaiml-muster-v0a-lr1e5-44160-v2-uploader
Pipeline stage VLLMUploader completed in 625.70s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-muster-v0a-lr1e5-44160-v2
Waiting for inference service chaiml-muster-v0a-lr1e5-44160-v2 to be ready
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission blend_honas_2026-02-03: ('http://guanaco-model-mesh-load-balancer.model-mesh.k2.chaiverse.com/models/chaiml-7b07-69d4-linear-w01_v7/predict', '{"detail":"1 validation error for RuntimeResponse\\npredictions\\n Field required [type=missing, input_value={\'detail\': \\"503, message=...linear-w01_v7/predict\'\\"}, input_type=dict]\\n For further information visit https://errors.pydantic.dev/2.12/v/missing"}')
Inference service chaiml-muster-v0a-lr1e5-44160-v2 ready after 690.815685749054s
Pipeline stage VLLMDeployer completed in 694.24s
run pipeline stage %s
Running pipeline stage StressChecker
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.2819857597351074s
Received healthy response to inference request in 2.8554329872131348s
Received healthy response to inference request in 2.109201192855835s
Received healthy response to inference request in 1.9721753597259521s
Received healthy response to inference request in 2.1390273571014404s
Received healthy response to inference request in 2.126926898956299s
Received healthy response to inference request in 1.9143495559692383s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.1306815147399902s
Received healthy response to inference request in 2.0726358890533447s
Received healthy response to inference request in 2.3750996589660645s
Received healthy response to inference request in 3.003962993621826s
Received healthy response to inference request in 2.6548147201538086s
Received healthy response to inference request in 2.347714424133301s
Received healthy response to inference request in 2.331400156021118s
Received healthy response to inference request in 2.6753530502319336s
Received healthy response to inference request in 2.2639312744140625s
Received healthy response to inference request in 2.2136127948760986s
Received healthy response to inference request in 2.2584927082061768s
Received healthy response to inference request in 2.2486798763275146s
Received healthy response to inference request in 2.041882038116455s
Received healthy response to inference request in 2.1297483444213867s
Received healthy response to inference request in 2.183638334274292s
Received healthy response to inference request in 2.4662630558013916s
Received healthy response to inference request in 2.0006508827209473s
Received healthy response to inference request in 2.0770859718322754s
Received healthy response to inference request in 2.47784161567688s
Received healthy response to inference request in 2.356351375579834s
Received healthy response to inference request in 1.9457659721374512s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 1.9159200191497803s
Received healthy response to inference request in 2.686100482940674s
30 requests
0 failed requests
5th percentile: 1.929350697994232
10th percentile: 1.969534420967102
20th percentile: 2.066485118865967
30th percentile: 2.1216091871261598
40th percentile: 2.1356890201568604
50th percentile: 2.2311463356018066
60th percentile: 2.2711530685424806
70th percentile: 2.350305509567261
80th percentile: 2.4685787677764894
90th percentile: 2.676427793502808
95th percentile: 2.779233360290527
99th percentile: 2.960889291763306
mean time: 2.275224208831787
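Note on the summary above: the StressChecker implementation is not shown in this log, so the following is only a minimal sketch of how the figures can be reproduced from the 30 logged latencies. It assumes linear interpolation between order statistics (the default rule in NumPy's percentile function); the helper name `percentile` is hypothetical and not taken from the pipeline code.

```python
import math

# Latencies in seconds, copied from the "Received healthy response" lines above.
latencies = [
    2.2819857597351074, 2.8554329872131348, 2.109201192855835,
    1.9721753597259521, 2.1390273571014404, 2.126926898956299,
    1.9143495559692383, 2.1306815147399902, 2.0726358890533447,
    2.3750996589660645, 3.003962993621826, 2.6548147201538086,
    2.347714424133301, 2.331400156021118, 2.6753530502319336,
    2.2639312744140625, 2.2136127948760986, 2.2584927082061768,
    2.2486798763275146, 2.041882038116455, 2.1297483444213867,
    2.183638334274292, 2.4662630558013916, 2.0006508827209473,
    2.0770859718322754, 2.47784161567688, 2.356351375579834,
    1.9457659721374512, 1.9159200191497803, 2.686100482940674,
]


def percentile(values, q):
    """Return the q-th percentile (q in 0..100) using linear interpolation.

    Hypothetical helper: the real StressChecker code is not visible in the log.
    """
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted list.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    # Interpolate between the two neighbouring order statistics.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

Under that assumption the computed values agree with the logged summary to within floating-point rounding, for example the 5th, 50th and 99th percentiles and the mean time.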
Pipeline stage StressChecker completed in 79.80s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.65s
Shutdown handler de-registered
chaiml-muster-v0a-lr1e5_44160_v2 status is now deployed due to DeploymentManager action
chaiml-muster-v0a-lr1e5_44160_v2 status is now inactive due to auto deactivation removed underperforming models
chaiml-muster-v0a-lr1e5_44160_v2 status is now torndown due to DeploymentManager action