Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-02f4-69d4-linear-30131-v5-uploader
Waiting for job on chaiml-02f4-69d4-linear-30131-v5-uploader to finish
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTP Request: %s %s "%s %d %s"
chaiml-02f4-69d4-linear-30131-v5-uploader: Using quantization_mode: none
chaiml-02f4-69d4-linear-30131-v5-uploader: Downloading snapshot of ChaiML/02f4-69d4-linear-w01-FP8...
chaiml-02f4-69d4-linear-30131-v5-uploader: Fetching 14 files: 100%|██████████| 14/14 [00:13<00:00, 1.00it/s]
chaiml-02f4-69d4-linear-30131-v5-uploader: Downloaded in 14.080s
chaiml-02f4-69d4-linear-30131-v5-uploader: Processed model ChaiML/02f4-69d4-linear-w01-FP8 in 23.140s
chaiml-02f4-69d4-linear-30131-v5-uploader: creating bucket guanaco-vllm-models
chaiml-02f4-69d4-linear-30131-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v5-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-02f4-69d4-linear-30131-v5-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-02f4-69d4-linear-30131-v5-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-02f4-69d4-linear-30131-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v5-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-30131-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v5-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-02f4-69d4-linear-30131-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v5-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-30131-v5-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-02f4-69d4-linear-30131-v5-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-02f4-69d4-linear-30131-v5-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-02f4-69d4-linear-30131-v5-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-02f4-69d4-linear-30131-v5-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-02f4-69d4-linear-30131-v5-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-02f4-69d4-linear-30131-v5-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-02f4-69d4-linear-30131-v5-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v5
chaiml-02f4-69d4-linear-30131-v5-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v5/tokenizer.json
chaiml-02f4-69d4-linear-30131-v5-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v5/model-00006-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v5-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v5/model-00005-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v5-uploader: cp /dev/shm/model_output/model-00001-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v5/model-00001-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v5-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v5/model-00002-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v5-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v5/model-00004-of-00006.safetensors
chaiml-02f4-69d4-linear-30131-v5-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-02f4-69d4-linear-30131-v5/model-00003-of-00006.safetensors
Job chaiml-02f4-69d4-linear-30131-v5-uploader completed after 277.41s with status: succeeded
Stopping job with name chaiml-02f4-69d4-linear-30131-v5-uploader
Pipeline stage VLLMUploader completed in 277.93s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-02f4-69d4-linear-30131-v5
Waiting for inference service chaiml-02f4-69d4-linear-30131-v5 to be ready
Inference service chaiml-02f4-69d4-linear-30131-v5 ready after 282.61776852607727s
Pipeline stage VLLMDeployer completed in 283.35s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.0047404766082764s
Received healthy response to inference request in 3.3885765075683594s
Received healthy response to inference request in 2.6965410709381104s
Received healthy response to inference request in 2.9479243755340576s
Received healthy response to inference request in 2.8332231044769287s
Received healthy response to inference request in 2.690808057785034s
Received healthy response to inference request in 3.040398359298706s
Received healthy response to inference request in 2.7827305793762207s
Received healthy response to inference request in 3.005295753479004s
Received healthy response to inference request in 2.8021459579467773s
Received healthy response to inference request in 3.1260128021240234s
Received healthy response to inference request in 3.0568859577178955s
Received healthy response to inference request in 2.7669293880462646s
Received healthy response to inference request in 2.987229585647583s
Received healthy response to inference request in 2.955198049545288s
Received healthy response to inference request in 2.8532280921936035s
Received healthy response to inference request in 2.970982074737549s
Received healthy response to inference request in 2.786452531814575s
Received healthy response to inference request in 2.721888542175293s
Received healthy response to inference request in 2.9743666648864746s
Received healthy response to inference request in 2.7866151332855225s
Received healthy response to inference request in 2.778668165206909s
Received healthy response to inference request in 3.790447950363159s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.8152027130126953s
Received healthy response to inference request in 2.8191869258880615s
Received healthy response to inference request in 3.691176652908325s
Received healthy response to inference request in 2.799060344696045s
Received healthy response to inference request in 2.959798812866211s
Received healthy response to inference request in 2.872004270553589s
Received healthy response to inference request in 3.0406951904296875s
30 requests
0 failed requests
5th percentile: 2.7079474329948425
10th percentile: 2.7624253034591675
20th percentile: 2.7857081413269045
30th percentile: 2.8012202739715577
40th percentile: 2.8276086330413817
50th percentile: 2.9099643230438232
60th percentile: 2.964272117614746
70th percentile: 2.9924828529357907
80th percentile: 3.0404577255249023
90th percentile: 3.1522691726684573
95th percentile: 3.5550065875053396
99th percentile: 3.7616592741012576
mean time: 2.958147136370341
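The summary above can be reproduced from the 30 logged latencies. The log does not say how StressChecker computes its percentiles; the sketch below assumes linear interpolation between ranks, which happens to match the logged figures. The names `percentile` and `latencies` are hypothetical and are not taken from the pipeline's code.

```python
from statistics import mean


def percentile(samples, q):
    """Return the q-th percentile (0-100) of samples.

    Assumption: linear interpolation between the two nearest ranks.
    The log does not confirm which method the pipeline uses.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Latencies in seconds, copied from the StressChecker lines in this log.
latencies = [
    3.0047404766082764, 3.3885765075683594, 2.6965410709381104,
    2.9479243755340576, 2.8332231044769287, 2.690808057785034,
    3.040398359298706, 2.7827305793762207, 3.005295753479004,
    2.8021459579467773, 3.1260128021240234, 3.0568859577178955,
    2.7669293880462646, 2.987229585647583, 2.955198049545288,
    2.8532280921936035, 2.970982074737549, 2.786452531814575,
    2.721888542175293, 2.9743666648864746, 2.7866151332855225,
    2.778668165206909, 3.790447950363159, 2.8152027130126953,
    2.8191869258880615, 3.691176652908325, 2.799060344696045,
    2.959798812866211, 2.872004270553589, 3.0406951904296875,
]

if __name__ == "__main__":
    print(f"{len(latencies)} requests")
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean(latencies)}")
```

Because the sample holds only 30 requests, the 95th and 99th percentiles are driven almost entirely by the two slowest responses (about 3.69 s and 3.79 s), so they are best read as rough indicators.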
Pipeline stage StressChecker completed in 94.29s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.60s
Shutdown handler de-registered
chaiml-02f4-69d4-linear_30131_v5 status is now deployed due to DeploymentManager action
chaiml-02f4-69d4-linear_30131_v5 status is now inactive due to system request
chaiml-02f4-69d4-linear_30131_v5 status is now inactive due to auto deactivation removed underperforming models