Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-2fe5-c13f-linear-57126-v5-uploader
Waiting for job on chaiml-2fe5-c13f-linear-57126-v5-uploader to finish
chaiml-2fe5-c13f-linear-57126-v5-uploader: Using quantization_mode: fp8
chaiml-2fe5-c13f-linear-57126-v5-uploader: Repo ChaiML/2fe5-c13f-linear-w01-FP8 already ends in FP8. Skipping...
chaiml-2fe5-c13f-linear-57126-v5-uploader: Checking if ChaiML/2fe5-c13f-linear-w01-FP8 already exists in ChaiML
chaiml-2fe5-c13f-linear-57126-v5-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-2fe5-c13f-linear-57126-v5-uploader: Downloading snapshot of ChaiML/2fe5-c13f-linear-w01-FP8...
chaiml-2fe5-c13f-linear-57126-v5-uploader: Downloaded in 8.999s
chaiml-2fe5-c13f-linear-57126-v5-uploader: Processed model ChaiML/2fe5-c13f-linear-w01-FP8 in 12.472s
chaiml-2fe5-c13f-linear-57126-v5-uploader: creating bucket guanaco-vllm-models
chaiml-2fe5-c13f-linear-57126-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-2fe5-c13f-linear-57126-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-2fe5-c13f-linear-57126-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-2fe5-c13f-linear-57126-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-57126-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-57126-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-2fe5-c13f-linear-57126-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-2fe5-c13f-linear-57126-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-2fe5-c13f-linear-57126-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-2fe5-c13f-linear-57126-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/.gitattributes
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/config.json
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/generation_config.json
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/chat_template.jinja
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/special_tokens_map.json
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/recipe.yaml
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/tokenizer_config.json
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/model.safetensors.index.json
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/tokenizer.json
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/model-00003-of-00003.safetensors
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/model-00002-of-00003.safetensors
chaiml-2fe5-c13f-linear-57126-v5-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v5/default/model-00001-of-00003.safetensors
Job chaiml-2fe5-c13f-linear-57126-v5-uploader completed after 93.85s with status: succeeded
Stopping job with name chaiml-2fe5-c13f-linear-57126-v5-uploader
Pipeline stage VLLMUploader completed in 94.43s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2fe5-c13f-linear-57126-v5
Waiting for inference service chaiml-2fe5-c13f-linear-57126-v5 to be ready
Inference service chaiml-2fe5-c13f-linear-57126-v5 ready after 151.349223613739s
Pipeline stage VLLMDeployer completed in 152.22s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.5351431369781494s
Received healthy response to inference request in 1.9806017875671387s
Received healthy response to inference request in 1.6564040184020996s
Received healthy response to inference request in 1.6696341037750244s
Received healthy response to inference request in 1.5796425342559814s
Received healthy response to inference request in 1.8296246528625488s
Received healthy response to inference request in 2.3236968517303467s
Received healthy response to inference request in 2.0959150791168213s
Received healthy response to inference request in 1.6271746158599854s
Received healthy response to inference request in 1.9610247611999512s
Received healthy response to inference request in 1.820206880569458s
Received healthy response to inference request in 1.6458027362823486s
Received healthy response to inference request in 1.730614423751831s
Received healthy response to inference request in 1.697455644607544s
Received healthy response to inference request in 1.6999592781066895s
Received healthy response to inference request in 1.8072190284729004s
Received healthy response to inference request in 2.2713911533355713s
Received healthy response to inference request in 1.8011856079101562s
Received healthy response to inference request in 1.9237298965454102s
Received healthy response to inference request in 1.767742395401001s
Received healthy response to inference request in 1.624032974243164s
Received healthy response to inference request in 1.8114407062530518s
Received healthy response to inference request in 1.742370367050171s
Received healthy response to inference request in 1.7615275382995605s
Received healthy response to inference request in 1.9002463817596436s
Received healthy response to inference request in 1.6789982318878174s
Received healthy response to inference request in 1.8104004859924316s
Received healthy response to inference request in 1.6908271312713623s
Received healthy response to inference request in 1.7385351657867432s
Received healthy response to inference request in 2.108783483505249s
30 requests
0 failed requests
5th percentile: 1.5996182322502137
10th percentile: 1.6268604516983032
20th percentile: 1.6669880867004394
30th percentile: 1.6954670906066895
40th percentile: 1.7353668689727784
50th percentile: 1.7646349668502808
60th percentile: 1.8084916114807128
70th percentile: 1.8230322122573852
80th percentile: 1.9311888694763184
90th percentile: 2.097201919555664
95th percentile: 2.198217701911926
99th percentile: 2.308528199195862
mean time: 1.8097110350926717
Pipeline stage StressChecker completed in 56.90s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.65s
Shutdown handler de-registered
chaiml-2fe5-c13f-linear_57126_v5 status is now deployed due to DeploymentManager action
chaiml-2fe5-c13f-linear_57126_v5 status is now inactive due to system request