Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-7b07-69d4-linear-w01-v40-uploader
Waiting for job on chaiml-7b07-69d4-linear-w01-v40-uploader to finish
chaiml-7b07-69d4-linear-w01-v40-uploader: Using quantization_mode: fp8
chaiml-7b07-69d4-linear-w01-v40-uploader: Checking if ChaiML/7b07-69d4-linear-w01-FP8 already exists in ChaiML
chaiml-7b07-69d4-linear-w01-v40-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-7b07-69d4-linear-w01-v40-uploader: Downloading snapshot of ChaiML/7b07-69d4-linear-w01-FP8...
chaiml-7b07-69d4-linear-w01-v40-uploader: Downloaded in 12.107s
chaiml-7b07-69d4-linear-w01-v40-uploader: Processed model ChaiML/7b07-69d4-linear-w01 in 15.563s
chaiml-7b07-69d4-linear-w01-v40-uploader: creating bucket guanaco-vllm-models
chaiml-7b07-69d4-linear-w01-v40-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-7b07-69d4-linear-w01-v40-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-7b07-69d4-linear-w01-v40-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-7b07-69d4-linear-w01-v40-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-7b07-69d4-linear-w01-v40-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-7b07-69d4-linear-w01-v40-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-7b07-69d4-linear-w01-v40-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-7b07-69d4-linear-w01-v40-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-7b07-69d4-linear-w01-v40-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-7b07-69d4-linear-w01-v40-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-7b07-69d4-linear-w01-v40-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-7b07-69d4-linear-w01-v40-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-7b07-69d4-linear-w01-v40-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-7b07-69d4-linear-w01-v40-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-7b07-69d4-linear-w01-v40-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-7b07-69d4-linear-w01-v40-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-7b07-69d4-linear-w01-v40-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-7b07-69d4-linear-w01-v40-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/.gitattributes
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/config.json
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/recipe.yaml
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/generation_config.json
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/model.safetensors.index.json
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/special_tokens_map.json
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/tokenizer_config.json
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/tokenizer.json
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/model-00006-of-00006.safetensors
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/model-00005-of-00006.safetensors
HTTP Request: %s %s "%s %d %s"
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/model-00001-of-00006.safetensors
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/model-00002-of-00006.safetensors
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/model-00004-of-00006.safetensors
chaiml-7b07-69d4-linear-w01-v40-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-7b07-69d4-linear-w01-v40/default/model-00003-of-00006.safetensors
Job chaiml-7b07-69d4-linear-w01-v40-uploader completed after 94.07s with status: succeeded
Stopping job with name chaiml-7b07-69d4-linear-w01-v40-uploader
Pipeline stage VLLMUploader completed in 94.56s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-7b07-69d4-linear-w01-v40
Waiting for inference service chaiml-7b07-69d4-linear-w01-v40 to be ready
Unable to record family friendly update due to error: Invalid JSON input: Expecting value: line 1 column 1 (char 0)
Inference service chaiml-7b07-69d4-linear-w01-v40 ready after 684.8230772018433s
Pipeline stage VLLMDeployer completed in 685.46s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.7579374313354492s
Received healthy response to inference request in 1.6803860664367676s
Received healthy response to inference request in 1.305621862411499s
Received healthy response to inference request in 1.7213995456695557s
Received healthy response to inference request in 1.283454418182373s
Received healthy response to inference request in 1.3071873188018799s
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Received healthy response to inference request in 1.3352270126342773s
Received healthy response to inference request in 1.329200029373169s
Received healthy response to inference request in 1.5768089294433594s
Received healthy response to inference request in 1.2770774364471436s
Received healthy response to inference request in 1.409416675567627s
Received healthy response to inference request in 1.2682502269744873s
Received healthy response to inference request in 1.3347482681274414s
Received healthy response to inference request in 1.3426897525787354s
Received healthy response to inference request in 1.2642648220062256s
Received healthy response to inference request in 1.3235993385314941s
Received healthy response to inference request in 1.3056588172912598s
Received healthy response to inference request in 1.2563748359680176s
Received healthy response to inference request in 1.3481669425964355s
Received healthy response to inference request in 1.2987442016601562s
Received healthy response to inference request in 1.2647757530212402s
Received healthy response to inference request in 1.2578420639038086s
Received healthy response to inference request in 1.3388378620147705s
Received healthy response to inference request in 1.2873337268829346s
Received healthy response to inference request in 1.2804691791534424s
Received healthy response to inference request in 1.2820565700531006s
Received healthy response to inference request in 1.265355110168457s
Received healthy response to inference request in 1.2693815231323242s
Received healthy response to inference request in 1.2944962978363037s
Received healthy response to inference request in 1.3993721008300781s
30 requests
0 failed requests
5th percentile: 1.2607323050498962
10th percentile: 1.2647246599197388
20th percentile: 1.2691552639007568
30th percentile: 1.281580352783203
40th percentile: 1.291631269454956
50th percentile: 1.3056403398513794
60th percentile: 1.3258396148681642
70th percentile: 1.3363102674484253
80th percentile: 1.3584079742431643
90th percentile: 1.5871666431427003
95th percentile: 1.702943480014801
99th percentile: 1.74734144449234
mean time: 1.3555378039677939
Pipeline stage StressChecker completed in 43.59s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.62s
Shutdown handler de-registered
chaiml-7b07-69d4-linear-w01_v40 status is now deployed due to DeploymentManager action
chaiml-7b07-69d4-linear-w01_v40 status is now inactive due to system request