Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-02f4-69d4-linear-w01-v56-uploader
Waiting for job on chaiml-02f4-69d4-linear-w01-v56-uploader to finish
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-02f4-69d4-linear-30131-v2-uploader
Waiting for job on chaiml-02f4-69d4-linear-30131-v2-uploader to finish
chaiml-02f4-69d4-linear-30131-v2-uploader: Using quantization_mode: none
chaiml-02f4-69d4-linear-30131-v2-uploader: Downloading snapshot of ChaiML/02f4-69d4-linear-w01-FP8...
chaiml-02f4-69d4-linear-30131-v2-uploader: Fetching 14 files: 100%|██████████| 14/14 [00:11<00:00, 1.25it/s]
chaiml-02f4-69d4-linear-30131-v2-uploader: Downloaded in 37.665s
chaiml-02f4-69d4-linear-30131-v2-uploader: creating bucket guanaco-vllm-models
chaiml-02f4-69d4-linear-30131-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-02f4-69d4-linear-30131-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-02f4-69d4-linear-30131-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-02f4-69d4-linear-30131-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-30131-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-30131-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-30131-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-30131-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-02f4-69d4-linear-30131-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-02f4-69d4-linear-30131-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-02f4-69d4-linear-30131-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-02f4-69d4-linear-w01-v56-uploader: Using quantization_mode: none
chaiml-02f4-69d4-linear-30131-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-02f4-69d4-linear-w01-v56-uploader: Downloading snapshot of ChaiML/02f4-69d4-linear-w01...
chaiml-02f4-69d4-linear-30131-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2
chaiml-02f4-69d4-linear-30131-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2/config.json
chaiml-02f4-69d4-linear-30131-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2/generation_config.json
chaiml-02f4-69d4-linear-30131-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2/.gitattributes
chaiml-02f4-69d4-linear-30131-v2-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2/special_tokens_map.json
chaiml-02f4-69d4-linear-30131-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2/model.safetensors.index.json
chaiml-02f4-69d4-linear-30131-v2-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2/recipe.yaml
chaiml-02f4-69d4-linear-w01-v56-uploader: Fetching 19 files: 100%|██████████| 19/19 [00:19<00:00, 1.01s/it]
chaiml-02f4-69d4-linear-30131-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2/tokenizer_config.json
chaiml-02f4-69d4-linear-w01-v56-uploader: Downloaded in 19.226s
chaiml-02f4-69d4-linear-30131-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v2/tokenizer.json
chaiml-02f4-69d4-linear-w01-v56-uploader: Processed model ChaiML/02f4-69d4-linear-w01 in 36.086s
chaiml-02f4-69d4-linear-w01-v56-uploader: creating bucket guanaco-vllm-models
chaiml-02f4-69d4-linear-w01-v56-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-w01-v56-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-02f4-69d4-linear-w01-v56-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-02f4-69d4-linear-w01-v56-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-02f4-69d4-linear-w01-v56-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
Job chaiml-02f4-69d4-linear-30131-v2-uploader completed after 170.16s with status: succeeded
chaiml-02f4-69d4-linear-w01-v56-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
Stopping job with name chaiml-02f4-69d4-linear-30131-v2-uploader
chaiml-02f4-69d4-linear-w01-v56-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
Pipeline stage VLLMUploader completed in 181.74s
chaiml-02f4-69d4-linear-w01-v56-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
run pipeline stage %s
chaiml-02f4-69d4-linear-w01-v56-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
Running pipeline stage VLLMTemplater
chaiml-02f4-69d4-linear-w01-v56-uploader: if re.search("-\.", bucket, re.UNICODE):
Pipeline stage VLLMTemplater completed in 2.44s
chaiml-02f4-69d4-linear-w01-v56-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
run pipeline stage %s
chaiml-02f4-69d4-linear-w01-v56-uploader: if re.search("\.\.", bucket, re.UNICODE):
Running pipeline stage VLLMDeployer
chaiml-02f4-69d4-linear-w01-v56-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-02f4-69d4-linear-w01-v56-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
Creating inference service chaiml-02f4-69d4-linear-30131-v2
chaiml-02f4-69d4-linear-w01-v56-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
Waiting for inference service chaiml-02f4-69d4-linear-30131-v2 to be ready
chaiml-02f4-69d4-linear-w01-v56-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-02f4-69d4-linear-w01-v56-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-02f4-69d4-linear-w01-v56-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/.gitattributes
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/mergekit_config.yaml s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/mergekit_config.yaml
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/README.md
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/config.json
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/mergekit_config.yml s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/mergekit_config.yml
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/special_tokens_map.json
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/model.safetensors.index.json
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/tokenizer_config.json
chaiml-02f4-69d4-linear-w01-v56-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-w01-v56/tokenizer.json
Job chaiml-02f4-69d4-linear-w01-v56-uploader completed after 257.82s with status: succeeded
Stopping job with name chaiml-02f4-69d4-linear-w01-v56-uploader
Pipeline stage VLLMUploader completed in 266.41s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.49s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-02f4-69d4-linear-w01-v56
Waiting for inference service chaiml-02f4-69d4-linear-w01-v56 to be ready
Inference service chaiml-02f4-69d4-linear-30131-v2 ready after 163.73802399635315s
Pipeline stage VLLMDeployer completed in 180.11s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.6503000259399414s
Received healthy response to inference request in 2.5741207599639893s
Received healthy response to inference request in 2.091458797454834s
Received healthy response to inference request in 1.6614148616790771s
Received healthy response to inference request in 1.5194005966186523s
Received healthy response to inference request in 1.8755812644958496s
Received healthy response to inference request in 1.6247835159301758s
Received healthy response to inference request in 2.039361000061035s
Received healthy response to inference request in 1.7076213359832764s
Received healthy response to inference request in 1.591637134552002s
Received healthy response to inference request in 2.7502384185791016s
Received healthy response to inference request in 1.6831371784210205s
Received healthy response to inference request in 2.161167860031128s
Received healthy response to inference request in 1.905245304107666s
Received healthy response to inference request in 1.7441961765289307s
Received healthy response to inference request in 1.8364107608795166s
Received healthy response to inference request in 3.012455463409424s
Received healthy response to inference request in 2.07863450050354s
Received healthy response to inference request in 2.01979660987854s
Received healthy response to inference request in 2.690178394317627s
Inference service chaiml-02f4-69d4-linear-w01-v56 ready after 224.0956678390503s
Received healthy response to inference request in 3.6747512817382812s
Pipeline stage VLLMDeployer completed in 239.39s
run pipeline stage %s
Received healthy response to inference request in 3.5098979473114014s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.8626322746276855s
Received healthy response to inference request in 3.2310097217559814s
Received healthy response to inference request in 1.6706583499908447s
Received healthy response to inference request in 2.1348652839660645s
Received healthy response to inference request in 3.8365604877471924s
Received healthy response to inference request in 1.823758602142334s
Received healthy response to inference request in 3.141829252243042s
Received healthy response to inference request in 1.5872788429260254s
Received healthy response to inference request in 3.1963045597076416s
Received healthy response to inference request in 5.048647165298462s
Received healthy response to inference request in 2.6236460208892822s
Received healthy response to inference request in 1.9655511379241943s
Received healthy response to inference request in 4.159865856170654s
Received healthy response to inference request in 1.8958725929260254s
30 requests
Received healthy response to inference request in 2.7796483039855957s
0 failed requests
5th percentile: 1.5892400741577148
Received healthy response to inference request in 3.643183708190918s
10th percentile: 1.6214688777923585
20th percentile: 1.6806414127349854
Received healthy response to inference request in 5.800509691238403s
30th percentile: 1.799889874458313
40th percentile: 1.870401668548584
Received healthy response to inference request in 4.864508390426636s
50th percentile: 1.9353982210159302
Received healthy response to inference request in 3.417607307434082s
60th percentile: 2.055070400238037
Received healthy response to inference request in 6.909717798233032s
70th percentile: 2.1427560567855832
80th percentile: 2.6582756996154786
Received healthy response to inference request in 4.150631427764893s
90th percentile: 3.0621997117996225
95th percentile: 3.600567281246185
Received healthy response to inference request in 3.4494199752807617s
99th percentile: 4.65021735906601
mean time: 2.2130351146062215
Received healthy response to inference request in 3.448068618774414s
Pipeline stage StressChecker completed in 323.66s
run pipeline stage %s
Received healthy response to inference request in 3.248379707336426s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
Received healthy response to inference request in 2.7336905002593994s
run_pipeline:run_in_cloud %s
Received healthy response to inference request in 2.5012333393096924s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Received healthy response to inference request in 3.6839301586151123s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 26.98s
Shutdown handler de-registered
Received healthy response to inference request in 2.5767123699188232s
chaiml-02f4-69d4-linear_30131_v2 status is now deployed due to DeploymentManager action
Received healthy response to inference request in 2.5806689262390137s
Received healthy response to inference request in 2.8686485290527344s
Received healthy response to inference request in 2.620502233505249s
Received healthy response to inference request in 2.40229868888855s
Received healthy response to inference request in 3.620204448699951s
Received healthy response to inference request in 3.9787709712982178s
Received healthy response to inference request in 2.746119737625122s
Received healthy response to inference request in 2.881425142288208s
Received healthy response to inference request in 2.7042763233184814s
Received healthy response to inference request in 3.058840751647949s
30 requests
0 failed requests
5th percentile: 2.5351989030838014
10th percentile: 2.5802732706069946
20th percentile: 2.6881502628326417
30th percentile: 2.769589734077454
40th percentile: 2.987874507904053
50th percentile: 3.2136571407318115
60th percentile: 3.4297918319702148
70th percentile: 3.627098226547241
80th percentile: 3.865002584457398
90th percentile: 4.230330109596253
95th percentile: 5.379309105873105
99th percentile: 6.588047447204591
mean time: 3.4286070982615153
Pipeline stage StressChecker completed in 324.91s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 6.64s
Shutdown handler de-registered
chaiml-02f4-69d4-linear-w01_v56 status is now deployed due to DeploymentManager action
chaiml-02f4-69d4-linear-w01_v56 status is now inactive due to auto deactivation removed underperforming models
chaiml-02f4-69d4-linear-w01_v56 status is now torndown due to DeploymentManager action