Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-4d70-fd43-linear-51732-v1-uploader
Waiting for job on chaiml-4d70-fd43-linear-51732-v1-uploader to finish
chaiml-4d70-fd43-linear-51732-v1-uploader: Using quantization_mode: none
chaiml-4d70-fd43-linear-51732-v1-uploader: Downloading snapshot of ChaiML/4d70-fd43-linear-w01-FP8...
chaiml-4d70-fd43-linear-51732-v1-uploader: Fetching 12 files: 100%|██████████| 12/12 [00:07<00:00, 1.52it/s]
chaiml-4d70-fd43-linear-51732-v1-uploader: Downloaded in 8.041s
chaiml-4d70-fd43-linear-51732-v1-uploader: Processed model ChaiML/4d70-fd43-linear-w01-FP8 in 13.057s
chaiml-4d70-fd43-linear-51732-v1-uploader: creating bucket guanaco-vllm-models
chaiml-4d70-fd43-linear-51732-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-4d70-fd43-linear-51732-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-4d70-fd43-linear-51732-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-4d70-fd43-linear-51732-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-51732-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-51732-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-51732-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-51732-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-4d70-fd43-linear-51732-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-4d70-fd43-linear-51732-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-4d70-fd43-linear-51732-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-4d70-fd43-linear-51732-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-4d70-fd43-linear-51732-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v1
chaiml-4d70-fd43-linear-51732-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v1/tokenizer.json
chaiml-4d70-fd43-linear-51732-v1-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v1/model-00003-of-00003.safetensors
chaiml-4d70-fd43-linear-51732-v1-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v1/model-00001-of-00003.safetensors
chaiml-4d70-fd43-linear-51732-v1-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v1/model-00002-of-00003.safetensors
Job chaiml-4d70-fd43-linear-51732-v1-uploader completed after 82.65s with status: succeeded
Stopping job with name chaiml-4d70-fd43-linear-51732-v1-uploader
Pipeline stage VLLMUploader completed in 83.32s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-4d70-fd43-linear-51732-v1
Waiting for inference service chaiml-4d70-fd43-linear-51732-v1 to be ready
Inference service chaiml-4d70-fd43-linear-51732-v1 ready after 160.7756848335266s
Pipeline stage VLLMDeployer completed in 161.33s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.1417567729949951s
Received healthy response to inference request in 1.2081024646759033s
Received healthy response to inference request in 1.1576480865478516s
Received healthy response to inference request in 1.3659217357635498s
Received healthy response to inference request in 1.4003627300262451s
Received healthy response to inference request in 1.3192813396453857s
Received healthy response to inference request in 1.0893921852111816s
Received healthy response to inference request in 1.0621933937072754s
Received healthy response to inference request in 1.040743112564087s
Received healthy response to inference request in 3.202197313308716s
Received healthy response to inference request in 1.0104446411132812s
Received healthy response to inference request in 1.4794745445251465s
Received healthy response to inference request in 1.4877915382385254s
Received healthy response to inference request in 1.089794635772705s
Received healthy response to inference request in 1.1085052490234375s
Received healthy response to inference request in 1.4326810836791992s
Received healthy response to inference request in 1.1742806434631348s
Received healthy response to inference request in 1.1915173530578613s
Received healthy response to inference request in 1.1350178718566895s
Received healthy response to inference request in 1.1308791637420654s
Received healthy response to inference request in 1.1584489345550537s
Received healthy response to inference request in 1.2346599102020264s
Received healthy response to inference request in 1.294858694076538s
Received healthy response to inference request in 1.2654073238372803s
Received healthy response to inference request in 1.2140309810638428s
Received healthy response to inference request in 1.0496163368225098s
Received healthy response to inference request in 1.0612480640411377s
Received healthy response to inference request in 1.3321964740753174s
Received healthy response to inference request in 1.1374094486236572s
Received healthy response to inference request in 1.2751822471618652s
30 requests
0 failed requests
5th percentile: 1.0447360634803773
10th percentile: 1.0600848913192749
20th percentile: 1.0897141456604005
30th percentile: 1.1337762594223022
40th percentile: 1.151291561126709
50th percentile: 1.182898998260498
60th percentile: 1.2222825527191161
70th percentile: 1.2810851812362671
80th percentile: 1.338941526412964
90th percentile: 1.437360429763794
95th percentile: 1.484048891067505
99th percentile: 2.7050196385383622
mean time: 1.2750348091125487
Pipeline stage StressChecker completed in 41.42s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.68s
Shutdown handler de-registered
chaiml-4d70-fd43-linear_51732_v1 status is now deployed due to DeploymentManager action
chaiml-4d70-fd43-linear_51732_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-4d70-fd43-linear_51732_v1 status is now torndown due to DeploymentManager action