Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-2a6f-69d4-linear-43777-v7-uploader
Waiting for job on chaiml-2a6f-69d4-linear-43777-v7-uploader to finish
chaiml-2a6f-69d4-linear-43777-v7-uploader: Using quantization_mode: fp8
chaiml-2a6f-69d4-linear-43777-v7-uploader: Repo ChaiML/2a6f-69d4-linear-w01-FP8 already ends in FP8. Skipping...
chaiml-2a6f-69d4-linear-43777-v7-uploader: Checking if ChaiML/2a6f-69d4-linear-w01-FP8 already exists in ChaiML
chaiml-2a6f-69d4-linear-43777-v7-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-2a6f-69d4-linear-43777-v7-uploader: Downloading snapshot of ChaiML/2a6f-69d4-linear-w01-FP8...
chaiml-2a6f-69d4-linear-43777-v7-uploader: Downloaded in 12.645s
chaiml-2a6f-69d4-linear-43777-v7-uploader: Processed model ChaiML/2a6f-69d4-linear-w01-FP8 in 16.247s
chaiml-2a6f-69d4-linear-43777-v7-uploader: creating bucket guanaco-vllm-models
chaiml-2a6f-69d4-linear-43777-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v7-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-2a6f-69d4-linear-43777-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-2a6f-69d4-linear-43777-v7-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-2a6f-69d4-linear-43777-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v7-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-2a6f-69d4-linear-43777-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v7-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-2a6f-69d4-linear-43777-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v7-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-2a6f-69d4-linear-43777-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v7-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-2a6f-69d4-linear-43777-v7-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-2a6f-69d4-linear-43777-v7-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-2a6f-69d4-linear-43777-v7-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-2a6f-69d4-linear-43777-v7-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-2a6f-69d4-linear-43777-v7-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-2a6f-69d4-linear-43777-v7-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/config.json
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/generation_config.json
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/special_tokens_map.json
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/tokenizer_config.json
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/chat_template.jinja
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/.gitattributes
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/recipe.yaml
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/model.safetensors.index.json
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/tokenizer.json
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/model-00006-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/model-00005-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/model-00004-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/model-00001-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/model-00002-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v7-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v7/default/model-00003-of-00006.safetensors
Job chaiml-2a6f-69d4-linear-43777-v7-uploader completed after 72.7s with status: succeeded
Stopping job with name chaiml-2a6f-69d4-linear-43777-v7-uploader
Pipeline stage VLLMUploader completed in 73.34s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.49s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2a6f-69d4-linear-43777-v7
Waiting for inference service chaiml-2a6f-69d4-linear-43777-v7 to be ready
Inference service chaiml-2a6f-69d4-linear-43777-v7 ready after 160.79854679107666s
Pipeline stage VLLMDeployer completed in 161.58s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.2190170288085938s
Received healthy response to inference request in 2.748729944229126s
Received healthy response to inference request in 3.213837146759033s
Received healthy response to inference request in 2.854907274246216s
Received healthy response to inference request in 2.8175997734069824s
Received healthy response to inference request in 4.693814992904663s
Received healthy response to inference request in 2.8912363052368164s
Received healthy response to inference request in 2.7885169982910156s
Received healthy response to inference request in 3.227036952972412s
Received healthy response to inference request in 2.716493606567383s
Received healthy response to inference request in 2.7812671661376953s
Received healthy response to inference request in 2.949990749359131s
Received healthy response to inference request in 2.7022736072540283s
Received healthy response to inference request in 3.3085081577301025s
Received healthy response to inference request in 2.772217035293579s
Received healthy response to inference request in 3.0487186908721924s
Received healthy response to inference request in 2.7570531368255615s
Received healthy response to inference request in 3.1527223587036133s
Received healthy response to inference request in 3.4073545932769775s
Received healthy response to inference request in 2.7380173206329346s
Received healthy response to inference request in 2.997957706451416s
Received healthy response to inference request in 2.704132080078125s
Received healthy response to inference request in 2.77705979347229s
Received healthy response to inference request in 2.7930920124053955s
Received healthy response to inference request in 3.5110909938812256s
Received healthy response to inference request in 2.7719194889068604s
Received healthy response to inference request in 3.2638800144195557s
Received healthy response to inference request in 2.917030096054077s
Received healthy response to inference request in 2.705597162246704s
Received healthy response to inference request in 2.705296754837036s
30 requests
0 failed requests
5th percentile: 2.704656183719635
10th percentile: 2.7055671215057373
20th percentile: 2.7465874195098876
30th percentile: 2.7721277713775634
40th percentile: 2.7856170654296877
50th percentile: 2.836253523826599
60th percentile: 2.9302143573760984
70th percentile: 3.0799197912216183
80th percentile: 3.2206210136413573
90th percentile: 3.3183928012847903
95th percentile: 3.464409613609314
99th percentile: 4.350825033187867
mean time: 2.9978789647420245
Pipeline stage StressChecker completed in 93.06s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.66s
Shutdown handler de-registered
chaiml-2a6f-69d4-linear_43777_v7 status is now deployed due to DeploymentManager action
chaiml-2a6f-69d4-linear_43777_v7 status is now inactive due to system request