Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-ca18-c13f-linear-w01-v39-uploader
Waiting for job on chaiml-ca18-c13f-linear-w01-v39-uploader to finish
chaiml-ca18-c13f-linear-w01-v39-uploader: Using quantization_mode: none
chaiml-ca18-c13f-linear-w01-v39-uploader: Downloading snapshot of ChaiML/ca18-c13f-linear-w01...
chaiml-ca18-c13f-linear-w01-v39-uploader: Fetching 14 files: 100%|██████████| 14/14 [00:12<00:00, 1.09it/s]
chaiml-ca18-c13f-linear-w01-v39-uploader: Downloaded in 12.917s
chaiml-ca18-c13f-linear-w01-v39-uploader: Processed model ChaiML/ca18-c13f-linear-w01 in 22.053s
chaiml-ca18-c13f-linear-w01-v39-uploader: creating bucket guanaco-vllm-models
chaiml-ca18-c13f-linear-w01-v39-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v39-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-ca18-c13f-linear-w01-v39-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-ca18-c13f-linear-w01-v39-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-ca18-c13f-linear-w01-v39-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v39-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-ca18-c13f-linear-w01-v39-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v39-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-ca18-c13f-linear-w01-v39-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v39-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-ca18-c13f-linear-w01-v39-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-ca18-c13f-linear-w01-v39-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-ca18-c13f-linear-w01-v39-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-ca18-c13f-linear-w01-v39-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-ca18-c13f-linear-w01-v39-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-ca18-c13f-linear-w01-v39-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-ca18-c13f-linear-w01-v39-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-ca18-c13f-linear-w01-v39-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/.gitattributes
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/README.md
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/mergekit_config.yml s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/mergekit_config.yml
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/config.json
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/model.safetensors.index.json
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/special_tokens_map.json
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/tokenizer_config.json
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/mergekit_config.yaml s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/mergekit_config.yaml
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/tokenizer.json
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/model-00005-of-00005.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/model-00005-of-00005.safetensors
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/model-00001-of-00005.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/model-00001-of-00005.safetensors
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/model-00002-of-00005.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/model-00002-of-00005.safetensors
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/model-00004-of-00005.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/model-00004-of-00005.safetensors
chaiml-ca18-c13f-linear-w01-v39-uploader: cp /dev/shm/model_output/model-00003-of-00005.safetensors s3://guanaco-vllm-models/chaiml-ca18-c13f-linear-w01-v39/model-00003-of-00005.safetensors
Job chaiml-ca18-c13f-linear-w01-v39-uploader completed after 380.42s with status: succeeded
Stopping job with name chaiml-ca18-c13f-linear-w01-v39-uploader
Pipeline stage VLLMUploader completed in 381.69s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-ca18-c13f-linear-w01-v39
Waiting for inference service chaiml-ca18-c13f-linear-w01-v39 to be ready
HTTP Request: %s %s "%s %d %s"
Unable to record family friendly update due to error: Invalid JSON input: Expecting value: line 1 column 1 (char 0)
Inference service chaiml-ca18-c13f-linear-w01-v39 ready after 804.93514585495s
Pipeline stage VLLMDeployer completed in 806.21s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9447178840637207s
Received healthy response to inference request in 1.7034823894500732s
Received healthy response to inference request in 1.3998777866363525s
Received healthy response to inference request in 1.4017183780670166s
Received healthy response to inference request in 1.283501386642456s
Received healthy response to inference request in 1.3175137042999268s
Received healthy response to inference request in 1.8152976036071777s
Received healthy response to inference request in 1.5142333507537842s
Received healthy response to inference request in 1.5571837425231934s
Received healthy response to inference request in 2.0205800533294678s
Received healthy response to inference request in 1.6079199314117432s
Received healthy response to inference request in 1.4123921394348145s
Received healthy response to inference request in 1.213613748550415s
Received healthy response to inference request in 1.4991199970245361s
Received healthy response to inference request in 1.4299561977386475s
Received healthy response to inference request in 1.485001802444458s
Received healthy response to inference request in 1.2518115043640137s
Received healthy response to inference request in 1.5980563163757324s
Received healthy response to inference request in 1.4573006629943848s
Received healthy response to inference request in 1.5870893001556396s
Received healthy response to inference request in 1.6115822792053223s
Received healthy response to inference request in 1.4957516193389893s
Received healthy response to inference request in 1.3745453357696533s
Received healthy response to inference request in 1.531660556793213s
Received healthy response to inference request in 1.5340752601623535s
Received healthy response to inference request in 1.3584315776824951s
Received healthy response to inference request in 1.5582995414733887s
Received healthy response to inference request in 1.6086413860321045s
Received healthy response to inference request in 1.7862484455108643s
Received healthy response to inference request in 1.6294820308685303s
30 requests
0 failed requests
5th percentile: 1.2660719513893128
10th percentile: 1.3141124725341797
20th percentile: 1.3948112964630126
30th percentile: 1.4246869802474975
40th percentile: 1.4914516925811767
50th percentile: 1.5229469537734985
60th percentile: 1.5576300621032715
70th percentile: 1.6010154008865356
80th percentile: 1.615162229537964
90th percentile: 1.7891533613204957
95th percentile: 1.886478757858276
99th percentile: 1.9985800242424012
mean time: 1.5329695304234823
Pipeline stage StressChecker completed in 51.45s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.98s
Shutdown handler de-registered
chaiml-ca18-c13f-linear-w01_v39 status is now deployed due to DeploymentManager action
chaiml-ca18-c13f-linear-w01_v39 status is now inactive due to auto deactivation (removed underperforming models)
chaiml-ca18-c13f-linear-w01_v39 status is now torndown due to DeploymentManager action