Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-4d70-fd43-linear-51732-v9-uploader
Waiting for job on chaiml-4d70-fd43-linear-51732-v9-uploader to finish
chaiml-4d70-fd43-linear-51732-v9-uploader: Using quantization_mode: fp8
chaiml-4d70-fd43-linear-51732-v9-uploader: Repo ChaiML/4d70-fd43-linear-w01-FP8 already ends in FP8. Skipping...
chaiml-4d70-fd43-linear-51732-v9-uploader: Checking if ChaiML/4d70-fd43-linear-w01-FP8 already exists in ChaiML
chaiml-4d70-fd43-linear-51732-v9-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-4d70-fd43-linear-51732-v9-uploader: Downloading snapshot of ChaiML/4d70-fd43-linear-w01-FP8...
chaiml-4d70-fd43-linear-51732-v9-uploader: Downloaded in 8.073s
chaiml-4d70-fd43-linear-51732-v9-uploader: Processed model ChaiML/4d70-fd43-linear-w01-FP8 in 11.546s
chaiml-4d70-fd43-linear-51732-v9-uploader: creating bucket guanaco-vllm-models
chaiml-4d70-fd43-linear-51732-v9-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v9-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-4d70-fd43-linear-51732-v9-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-4d70-fd43-linear-51732-v9-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-4d70-fd43-linear-51732-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v9-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-51732-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v9-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-4d70-fd43-linear-51732-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v9-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-51732-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-4d70-fd43-linear-51732-v9-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-4d70-fd43-linear-51732-v9-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-4d70-fd43-linear-51732-v9-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-4d70-fd43-linear-51732-v9-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-4d70-fd43-linear-51732-v9-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-4d70-fd43-linear-51732-v9-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-4d70-fd43-linear-51732-v9-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/chat_template.jinja
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/recipe.yaml
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/tokenizer_config.json
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/special_tokens_map.json
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/config.json
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/.gitattributes
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/model.safetensors.index.json
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/generation_config.json
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/tokenizer.json
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/model-00003-of-00003.safetensors
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/model-00001-of-00003.safetensors
chaiml-4d70-fd43-linear-51732-v9-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-4d70-fd43-linear-51732-v9/default/model-00002-of-00003.safetensors
Job chaiml-4d70-fd43-linear-51732-v9-uploader completed after 73.64s with status: succeeded
Stopping job with name chaiml-4d70-fd43-linear-51732-v9-uploader
Pipeline stage VLLMUploader completed in 74.29s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.85s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-4d70-fd43-linear-51732-v9
Waiting for inference service chaiml-4d70-fd43-linear-51732-v9 to be ready
Inference service chaiml-4d70-fd43-linear-51732-v9 ready after 160.49285411834717s
Pipeline stage VLLMDeployer completed in 161.11s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9230780601501465s
Received healthy response to inference request in 2.1582512855529785s
Received healthy response to inference request in 2.416457414627075s
Received healthy response to inference request in 2.2327799797058105s
Received healthy response to inference request in 2.022096872329712s
Received healthy response to inference request in 2.208059787750244s
Received healthy response to inference request in 3.7377562522888184s
Received healthy response to inference request in 2.3040406703948975s
Received healthy response to inference request in 1.8237736225128174s
Received healthy response to inference request in 2.706845283508301s
Received healthy response to inference request in 2.4505176544189453s
Received healthy response to inference request in 1.8250417709350586s
Received healthy response to inference request in 1.9687385559082031s
Received healthy response to inference request in 2.2880475521087646s
Received healthy response to inference request in 2.265293598175049s
Received healthy response to inference request in 1.8794939517974854s
Received healthy response to inference request in 2.0315964221954346s
Received healthy response to inference request in 2.1608800888061523s
Received healthy response to inference request in 1.9999232292175293s
Received healthy response to inference request in 2.0990703105926514s
Received healthy response to inference request in 2.130281925201416s
Received healthy response to inference request in 1.9881811141967773s
Received healthy response to inference request in 1.8210937976837158s
Received healthy response to inference request in 2.4780375957489014s
Received healthy response to inference request in 1.9182779788970947s
Received healthy response to inference request in 1.8886349201202393s
Received healthy response to inference request in 1.9477005004882812s
Received healthy response to inference request in 1.9172396659851074s
Received healthy response to inference request in 2.304407835006714s
Received healthy response to inference request in 2.1748807430267334s
30 requests
0 failed requests
5th percentile: 1.8243442893028259
10th percentile: 1.8740487337112426
20th percentile: 1.9180703163146973
30th percentile: 1.9624271392822266
40th percentile: 2.013227415084839
50th percentile: 2.1146761178970337
60th percentile: 2.1664803504943846
70th percentile: 2.242534065246582
80th percentile: 2.3041141033172607
90th percentile: 2.4532696485519407
95th percentile: 2.6038818240165704
99th percentile: 3.438792071342469
mean time: 2.169015947977702
Pipeline stage StressChecker completed in 72.30s
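Note on the StressChecker summary above: the reported percentiles are consistent with linear interpolation between the closest ranks of the 30 sorted latencies. The log does not show how the stage actually computes them, so the sketch below is an assumption. The helper name `percentile` is hypothetical, and the latencies are copied from the 30 "healthy response" lines.

```python
# Latencies in seconds, copied from the 30 "Received healthy response" lines.
latencies = [
    1.9230780601501465, 2.1582512855529785, 2.416457414627075,
    2.2327799797058105, 2.022096872329712, 2.208059787750244,
    3.7377562522888184, 2.3040406703948975, 1.8237736225128174,
    2.706845283508301, 2.4505176544189453, 1.8250417709350586,
    1.9687385559082031, 2.2880475521087646, 2.265293598175049,
    1.8794939517974854, 2.0315964221954346, 2.1608800888061523,
    1.9999232292175293, 2.0990703105926514, 2.130281925201416,
    1.9881811141967773, 1.8210937976837158, 2.4780375957489014,
    1.9182779788970947, 1.8886349201202393, 1.9477005004882812,
    1.9172396659851074, 2.304407835006714, 2.1748807430267334,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Hypothetical helper: assumes the same interpolation rule that
    reproduces the figures in the log, not the stage's real code.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0    # fractional index into ordered
    lower = int(rank)                        # closest rank at or below
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


summary = {q: percentile(latencies, q)
           for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)}
mean_time = sum(latencies) / len(latencies)

for q, value in summary.items():
    print(f"{q}th percentile: {value}")
print(f"mean time: {mean_time}")
```

With only 30 samples, the 99th percentile falls between the two slowest requests (2.71s and 3.74s), so it is dominated by the single 3.74s outlier.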
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.71s
Shutdown handler de-registered
chaiml-4d70-fd43-linear_51732_v9 status is now deployed due to DeploymentManager action
chaiml-4d70-fd43-linear_51732_v9 status is now inactive due to admin request