Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-2fe5-c13f-linear-57126-v1-uploader
Waiting for job on chaiml-2fe5-c13f-linear-57126-v1-uploader to finish
%s, retrying in %s seconds...
chaiml-2fe5-c13f-linear-57126-v1-uploader: Using quantization_mode: none
chaiml-2fe5-c13f-linear-57126-v1-uploader: Downloading snapshot of ChaiML/2fe5-c13f-linear-w01-FP8...
chaiml-2fe5-c13f-linear-57126-v1-uploader: Fetching 12 files: 100%|██████████| 12/12 [00:07<00:00, 1.62it/s]
chaiml-2fe5-c13f-linear-57126-v1-uploader: Downloaded in 7.509s
chaiml-2fe5-c13f-linear-57126-v1-uploader: Processed model ChaiML/2fe5-c13f-linear-w01-FP8 in 12.645s
chaiml-2fe5-c13f-linear-57126-v1-uploader: creating bucket guanaco-vllm-models
chaiml-2fe5-c13f-linear-57126-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-2fe5-c13f-linear-57126-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-2fe5-c13f-linear-57126-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-2fe5-c13f-linear-57126-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-57126-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-2fe5-c13f-linear-57126-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-2fe5-c13f-linear-57126-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-2fe5-c13f-linear-57126-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-2fe5-c13f-linear-57126-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-2fe5-c13f-linear-57126-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-2fe5-c13f-linear-57126-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-2fe5-c13f-linear-57126-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/chat_template.jinja
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/special_tokens_map.json
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/generation_config.json
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/recipe.yaml
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/.gitattributes
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/model.safetensors.index.json
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/tokenizer_config.json
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/config.json
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/tokenizer.json
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/model-00003-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/model-00003-of-00003.safetensors
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/model-00001-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/model-00001-of-00003.safetensors
chaiml-2fe5-c13f-linear-57126-v1-uploader: cp /dev/shm/model_output/model-00002-of-00003.safetensors s3://guanaco-vllm-models/chaiml-2fe5-c13f-linear-57126-v1/model-00002-of-00003.safetensors
Job chaiml-2fe5-c13f-linear-57126-v1-uploader completed after 111.66s with status: succeeded
Stopping job with name chaiml-2fe5-c13f-linear-57126-v1-uploader
Pipeline stage VLLMUploader completed in 112.15s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.18s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2fe5-c13f-linear-57126-v1
Waiting for inference service chaiml-2fe5-c13f-linear-57126-v1 to be ready
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-2fe5-c13f-linear-57126-v1 ready after 164.06228303909302s
Pipeline stage VLLMDeployer completed in 164.64s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 0.9179263114929199s
Received healthy response to inference request in 1.3097724914550781s
Received healthy response to inference request in 0.9751079082489014s
Received healthy response to inference request in 0.9045665264129639s
Received healthy response to inference request in 1.308544635772705s
Received healthy response to inference request in 0.9487736225128174s
Received healthy response to inference request in 1.104506492614746s
Received healthy response to inference request in 0.9037041664123535s
Received healthy response to inference request in 0.9895884990692139s
Received healthy response to inference request in 0.8999514579772949s
Received healthy response to inference request in 1.0653605461120605s
Received healthy response to inference request in 1.0224721431732178s
Received healthy response to inference request in 0.8944747447967529s
Received healthy response to inference request in 1.2866554260253906s
Received healthy response to inference request in 1.0589182376861572s
Received healthy response to inference request in 0.9069194793701172s
Received healthy response to inference request in 0.9023337364196777s
Received healthy response to inference request in 1.0738615989685059s
Received healthy response to inference request in 1.0018832683563232s
Received healthy response to inference request in 0.8668186664581299s
Received healthy response to inference request in 1.0264818668365479s
Received healthy response to inference request in 0.9050338268280029s
Received healthy response to inference request in 1.3702855110168457s
Received healthy response to inference request in 0.9835608005523682s
Received healthy response to inference request in 0.9453845024108887s
Received healthy response to inference request in 1.1936728954315186s
Received healthy response to inference request in 1.2557001113891602s
Received healthy response to inference request in 0.9664862155914307s
Received healthy response to inference request in 0.8931877613067627s
Received healthy response to inference request in 1.153641700744629s
30 requests
0 failed requests
5th percentile: 0.8937669038772583
10th percentile: 0.8994037866592407
20th percentile: 0.9043940544128418
30th percentile: 0.9146242618560791
40th percentile: 0.9594011783599854
50th percentile: 0.986574649810791
60th percentile: 1.0240760326385498
70th percentile: 1.067910861968994
80th percentile: 1.161647939682007
90th percentile: 1.288844347000122
95th percentile: 1.3092199563980103
99th percentile: 1.3527367353439332
mean time: 1.0345191717147828
Pipeline stage StressChecker completed in 33.73s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.86s
Shutdown handler de-registered
chaiml-2fe5-c13f-linear_57126_v1 status is now deployed due to DeploymentManager action
chaiml-2fe5-c13f-linear_57126_v1 status is now inactive due to auto deactivation (removed underperforming models)
chaiml-2fe5-c13f-linear_57126_v1 status is now torndown due to DeploymentManager action