Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-2fe5-c13f-linear-w01-v44-uploader
Waiting for job on chaiml-2fe5-c13f-linear-w01-v44-uploader to finish
chaiml-2fe5-c13f-linear-w01-v44-uploader: Using quantization_mode: fp8
chaiml-2fe5-c13f-linear-w01-v44-uploader: Checking if ChaiML/2fe5-c13f-linear-w01-FP8 already exists in ChaiML
chaiml-2fe5-c13f-linear-w01-v44-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-2fe5-c13f-linear-w01-v44-uploader: Downloading snapshot of ChaiML/2fe5-c13f-linear-w01-FP8...
chaiml-2fe5-c13f-linear-w01-v44-uploader: Downloaded in 7.556s
chaiml-2fe5-c13f-linear-w01-v44-uploader: Processed model ChaiML/2fe5-c13f-linear-w01 in 11.022s
chaiml-2fe5-c13f-linear-w01-v44-uploader: creating bucket guanaco-vllm-models
chaiml-2fe5-c13f-linear-w01-v44-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v44-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-2fe5-c13f-linear-w01-v44-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-2fe5-c13f-linear-w01-v44-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-2fe5-c13f-linear-w01-v44-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v44-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-w01-v44-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v44-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-w01-v44-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v44-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-w01-v44-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-w01-v44-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-w01-v44-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-2fe5-c13f-linear-w01-v44-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-2fe5-c13f-linear-w01-v44-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-2fe5-c13f-linear-w01-v44-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-2fe5-c13f-linear-w01-v44-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-2fe5-c13f-linear-w01-v44-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/.gitattributes
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/config.json
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/chat_template.jinja
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/generation_config.json
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/recipe.yaml
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/special_tokens_map.json
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/model.safetensors.index.json
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/tokenizer_config.json
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/tokenizer.json
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/model-00003-of-00003.safetensors
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/model-00002-of-00003.safetensors
chaiml-2fe5-c13f-linear-w01-v44-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-w01-v44/default/model-00001-of-00003.safetensors
Job chaiml-2fe5-c13f-linear-w01-v44-uploader completed after 109.58s with status: succeeded
Stopping job with name chaiml-2fe5-c13f-linear-w01-v44-uploader
Pipeline stage VLLMUploader completed in 110.95s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.29s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2fe5-c13f-linear-w01-v44
Waiting for inference service chaiml-2fe5-c13f-linear-w01-v44 to be ready
Inference service chaiml-2fe5-c13f-linear-w01-v44 ready after 152.57308888435364s
Pipeline stage VLLMDeployer completed in 153.98s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.568195104598999s
Received healthy response to inference request in 1.8086671829223633s
Received healthy response to inference request in 1.8022739887237549s
Received healthy response to inference request in 1.8635454177856445s
Received healthy response to inference request in 1.834484338760376s
Received healthy response to inference request in 2.042574644088745s
Received healthy response to inference request in 1.765796184539795s
Received healthy response to inference request in 1.8384120464324951s
Received healthy response to inference request in 1.6840529441833496s
Received healthy response to inference request in 1.806227445602417s
Received healthy response to inference request in 1.9656319618225098s
Received healthy response to inference request in 1.9882698059082031s
Received healthy response to inference request in 2.249314546585083s
Received healthy response to inference request in 1.7716138362884521s
Received healthy response to inference request in 1.7584145069122314s
Received healthy response to inference request in 1.93540358543396s
Received healthy response to inference request in 1.7153916358947754s
Received healthy response to inference request in 1.8377630710601807s
Received healthy response to inference request in 1.8724148273468018s
Received healthy response to inference request in 1.8723115921020508s
Received healthy response to inference request in 1.8813073635101318s
Received healthy response to inference request in 1.845057487487793s
Received healthy response to inference request in 1.7109754085540771s
Received healthy response to inference request in 2.0193777084350586s
Received healthy response to inference request in 1.6920108795166016s
Received healthy response to inference request in 1.6943285465240479s
Received healthy response to inference request in 1.8820180892944336s
Received healthy response to inference request in 1.777608871459961s
Received healthy response to inference request in 1.6886723041534424s
Received healthy response to inference request in 1.7713611125946045s
30 requests
0 failed requests
5th percentile: 1.6861316561698914
10th percentile: 1.6916770219802857
20th percentile: 1.7145083904266358
30th percentile: 1.7696916341781617
40th percentile: 1.7924079418182373
50th percentile: 1.8215757608413696
60th percentile: 1.8410702228546143
70th percentile: 1.8723425626754762
80th percentile: 1.892695188522339
90th percentile: 1.9913805961608888
95th percentile: 2.032136023044586
99th percentile: 2.189359974861145
mean time: 1.8314492146174113
Pipeline stage StressChecker completed in 61.24s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.00s
Shutdown handler de-registered
chaiml-2fe5-c13f-linear-w01_v44 status is now deployed due to DeploymentManager action
chaiml-2fe5-c13f-linear-w01_v44 status is now inactive due to auto deactivation removed underperforming models
chaiml-2fe5-c13f-linear-w01_v44 status is now torndown due to DeploymentManager action