Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-ca18-c13f-linear-w01-v47-uploader
Waiting for job on chaiml-ca18-c13f-linear-w01-v47-uploader to finish
chaiml-ca18-c13f-linear-w01-v47-uploader: Using quantization_mode: fp8
chaiml-ca18-c13f-linear-w01-v47-uploader: Checking if ChaiML/ca18-c13f-linear-w01-FP8 already exists in ChaiML
chaiml-ca18-c13f-linear-w01-v47-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-ca18-c13f-linear-w01-v47-uploader: Downloading snapshot of ChaiML/ca18-c13f-linear-w01-FP8...
chaiml-ca18-c13f-linear-w01-v47-uploader: Downloaded in 7.718s
chaiml-ca18-c13f-linear-w01-v47-uploader: Processed model ChaiML/ca18-c13f-linear-w01 in 11.269s
chaiml-ca18-c13f-linear-w01-v47-uploader: creating bucket guanaco-vllm-models
chaiml-ca18-c13f-linear-w01-v47-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v47-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-ca18-c13f-linear-w01-v47-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-ca18-c13f-linear-w01-v47-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-ca18-c13f-linear-w01-v47-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v47-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-ca18-c13f-linear-w01-v47-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v47-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-ca18-c13f-linear-w01-v47-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v47-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-ca18-c13f-linear-w01-v47-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v47-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-ca18-c13f-linear-w01-v47-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-ca18-c13f-linear-w01-v47-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-ca18-c13f-linear-w01-v47-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-ca18-c13f-linear-w01-v47-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-ca18-c13f-linear-w01-v47-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-ca18-c13f-linear-w01-v47-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/special_tokens_map.json
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/chat_template.jinja
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/config.json
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/model.safetensors.index.json
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/recipe.yaml
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/tokenizer_config.json
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/generation_config.json
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/.gitattributes
chaiml-ca18-c13f-linear-w01-v47-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v47/default/tokenizer.json
Job chaiml-ca18-c13f-linear-w01-v47-uploader completed after 116.71s with status: succeeded
Stopping job with name chaiml-ca18-c13f-linear-w01-v47-uploader
Pipeline stage VLLMUploader completed in 123.31s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.38s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-ca18-c13f-linear-w01-v47
Waiting for inference service chaiml-ca18-c13f-linear-w01-v47 to be ready
Inference service chaiml-ca18-c13f-linear-w01-v47 ready after 160.3968243598938s
Pipeline stage VLLMDeployer completed in 162.20s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.1475179195404053s
Received healthy response to inference request in 1.527254343032837s
Received healthy response to inference request in 2.154909133911133s
Received healthy response to inference request in 1.884526252746582s
Received healthy response to inference request in 1.8864977359771729s
Received healthy response to inference request in 2.0627496242523193s
Received healthy response to inference request in 2.5067074298858643s
Received healthy response to inference request in 2.0304431915283203s
Received healthy response to inference request in 1.8052992820739746s
Received healthy response to inference request in 1.7013800144195557s
Received healthy response to inference request in 1.918950080871582s
Received healthy response to inference request in 1.64363694190979s
Received healthy response to inference request in 1.719468116760254s
Received healthy response to inference request in 1.6071765422821045s
Received healthy response to inference request in 2.0728585720062256s
Received healthy response to inference request in 2.1811258792877197s
Received healthy response to inference request in 2.373910903930664s
Received healthy response to inference request in 1.798529863357544s
Received healthy response to inference request in 1.5175490379333496s
Received healthy response to inference request in 1.5187902450561523s
Received healthy response to inference request in 1.7509946823120117s
Received healthy response to inference request in 1.7238922119140625s
Received healthy response to inference request in 1.9610810279846191s
Received healthy response to inference request in 1.8375225067138672s
Received healthy response to inference request in 1.8162274360656738s
Received healthy response to inference request in 1.6227619647979736s
Received healthy response to inference request in 1.5870702266693115s
Received healthy response to inference request in 1.629265546798706s
Received healthy response to inference request in 1.9718637466430664s
Received healthy response to inference request in 1.9944217205047607s
30 requests
0 failed requests
5th percentile: 1.5225990891456604
10th percentile: 1.5810886383056642
20th percentile: 1.6279648303985597
30th percentile: 1.7140416860580445
40th percentile: 1.779515790939331
50th percentile: 1.8268749713897705
60th percentile: 1.8994786739349365
70th percentile: 1.9786311388015747
80th percentile: 2.0647714138031006
90th percentile: 2.1575308084487914
95th percentile: 2.2871576428413385
99th percentile: 2.4681964373588565
mean time: 1.8651460727055869
Pipeline stage StressChecker completed in 83.53s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 4.30s
Shutdown handler de-registered
chaiml-ca18-c13f-linear-w01_v47 status is now deployed due to DeploymentManager action
chaiml-ca18-c13f-linear-w01_v47 status is now inactive due to auto deactivation: removed underperforming models