Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v1-q235b-l-99625-v12-uploader
Waiting for job on chaiml-pony-v1-q235b-l-99625-v12-uploader to finish
chaiml-pony-v1-q235b-l-99625-v12-uploader: Using quantization_mode: w4a16
chaiml-pony-v1-q235b-l-99625-v12-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-l-99625-v12-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v1-q235b-l-99625-v12-uploader: Downloading snapshot of ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16...
chaiml-pony-v1-q235b-l-99625-v12-uploader: Downloaded in 49.081s
chaiml-pony-v1-q235b-l-99625-v12-uploader: Processed model ChaiML/pony-v1-q235b-lr1e4ep1r64g4 in 49.774s
chaiml-pony-v1-q235b-l-99625-v12-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v1-q235b-l-99625-v12-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v12-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v1-q235b-l-99625-v12-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v1-q235b-l-99625-v12-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v1-q235b-l-99625-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v12-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-l-99625-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v12-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-l-99625-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v12-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-l-99625-v12-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v12-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-l-99625-v12-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v1-q235b-l-99625-v12-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v1-q235b-l-99625-v12-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v1-q235b-l-99625-v12-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
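Note on the SyntaxWarning lines above: they come from regex patterns written as ordinary string literals, where sequences such as '\.' or '\w' are not valid Python string escapes. The usual remedy is a raw string. The sketch below rewrites three of the quoted patterns that way; the constant and function names are hypothetical and are not taken from the S3 package.

```python
import re

# Raw-string forms of patterns quoted in the warnings above.
# The names URI_RE, WILDCARD_RE, DOUBLE_DOT_RE and split_scheme are
# illustrative assumptions, not identifiers from the S3 package.
URI_RE = re.compile(r"^(\w+://)?(.*)", re.UNICODE)
WILDCARD_RE = re.compile(r"\*|\?")
DOUBLE_DOT_RE = re.compile(r"\.\.")


def split_scheme(uri):
    """Return (scheme, rest); scheme is None when the URI has no 'xxx://' prefix."""
    match = URI_RE.match(uri)
    return match.group(1), match.group(2)


if __name__ == "__main__":
    # A raw string holds the same characters as the doubled-backslash form,
    # so the compiled regex behaves the same and no warning is emitted.
    print(r"\." == "\\.")
    print(split_scheme("s3://guanaco-vllm-models/default"))
    print(WILDCARD_RE.split("prefix*suffix?x", maxsplit=1))
    print(bool(DOUBLE_DOT_RE.search("my..bucket")))
```

The warnings do not affect this run: the bucket is created and the upload proceeds on the following lines.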
chaiml-pony-v1-q235b-l-99625-v12-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v1-q235b-l-99625-v12-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/.gitattributes
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/special_tokens_map.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/tokenizer_config.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/generation_config.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/merges.txt
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/added_tokens.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/config.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/quantization_config.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/chat_template.jinja
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/vocab.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/tokenizer.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model.safetensors.index.json
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00027-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00003-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00004-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00011-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00018-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00015-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00010-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00006-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00016-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00002-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00017-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00020-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00013-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00022-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00009-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00007-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00001-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00026-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00014-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00012-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00025-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00023-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00021-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00005-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v12-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v12/default/model-00024-of-00027.safetensors
Job chaiml-pony-v1-q235b-l-99625-v12-uploader completed after 167.74s with status: succeeded
Stopping job with name chaiml-pony-v1-q235b-l-99625-v12-uploader
Pipeline stage VLLMUploader completed in 170.64s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.33s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v1-q235b-l-99625-v12
Waiting for inference service chaiml-pony-v1-q235b-l-99625-v12 to be ready
Inference service chaiml-pony-v1-q235b-l-99625-v12 ready after 473.6887176036835s
Pipeline stage VLLMDeployer completed in 474.17s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.0082156658172607s
Received healthy response to inference request in 2.046173572540283s
Received healthy response to inference request in 2.3827648162841797s
Received healthy response to inference request in 1.993332862854004s
Received healthy response to inference request in 1.9538660049438477s
Received healthy response to inference request in 2.0087673664093018s
Received healthy response to inference request in 1.9000298976898193s
Received healthy response to inference request in 2.2439615726470947s
Received healthy response to inference request in 2.407149314880371s
Received healthy response to inference request in 2.2558555603027344s
Received healthy response to inference request in 2.0180485248565674s
Received healthy response to inference request in 2.230998992919922s
Received healthy response to inference request in 2.041224479675293s
Received healthy response to inference request in 1.9715802669525146s
Received healthy response to inference request in 2.2532448768615723s
Received healthy response to inference request in 1.944329023361206s
Received healthy response to inference request in 1.963134765625s
Received healthy response to inference request in 2.1805055141448975s
Received healthy response to inference request in 1.8730461597442627s
Received healthy response to inference request in 2.1326260566711426s
Received healthy response to inference request in 1.9220504760742188s
Received healthy response to inference request in 2.232940673828125s
Received healthy response to inference request in 1.8773798942565918s
Received healthy response to inference request in 2.1617815494537354s
Received healthy response to inference request in 2.225290060043335s
Received healthy response to inference request in 2.1247975826263428s
Received healthy response to inference request in 2.0291881561279297s
Received healthy response to inference request in 2.3887948989868164s
Received healthy response to inference request in 2.0103282928466797s
Received healthy response to inference request in 2.2148854732513428s
30 requests
0 failed requests
5th percentile: 1.8875723958015442
10th percentile: 1.919848418235779
20th percentile: 1.9612810134887695
30th percentile: 2.0037508249282836
40th percentile: 2.014960432052612
50th percentile: 2.043699026107788
60th percentile: 2.14428825378418
70th percentile: 2.2180068492889404
80th percentile: 2.235144853591919
90th percentile: 2.2685464859008793
95th percentile: 2.38608136177063
99th percentile: 2.4018265342712404
mean time: 2.0998764117558797
Pipeline stage StressChecker completed in 72.49s
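Note on the StressChecker summary above: the logged percentiles are consistent with linear interpolation between sorted samples, which is also the default method of numpy.percentile. Whether StressChecker actually uses NumPy is an assumption; the log does not say. A minimal pure-Python sketch that recomputes the summary from the 30 logged latencies (the function name percentile is hypothetical):

```python
import math

# The 30 response times (seconds) logged by the StressChecker stage above.
latencies = [
    2.0082156658172607, 2.046173572540283, 2.3827648162841797,
    1.993332862854004, 1.9538660049438477, 2.0087673664093018,
    1.9000298976898193, 2.2439615726470947, 2.407149314880371,
    2.2558555603027344, 2.0180485248565674, 2.230998992919922,
    2.041224479675293, 1.9715802669525146, 2.2532448768615723,
    1.944329023361206, 1.963134765625, 2.1805055141448975,
    1.8730461597442627, 2.1326260566711426, 1.9220504760742188,
    2.232940673828125, 1.8773798942565918, 2.1617815494537354,
    2.225290060043335, 2.1247975826263428, 2.0291881561279297,
    2.3887948989868164, 2.0103282928466797, 2.2148854732513428,
]


def percentile(values, q):
    """q-th percentile (0..100) using linear interpolation between ranks."""
    if not values:
        raise ValueError("percentile() requires at least one value")
    ordered = sorted(values)
    # Fractional position of the requested percentile in the sorted sample.
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    # Interpolate between the two neighbouring samples.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


if __name__ == "__main__":
    print(len(latencies), "requests")
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print("mean time:", sum(latencies) / len(latencies))
```

The results should agree with the logged figures up to floating-point rounding in the last digits.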
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.69s
Shutdown handler de-registered
chaiml-pony-v1-q235b-l_99625_v12 status is now deployed due to DeploymentManager action
chaiml-pony-v1-q235b-l_99625_v12 status is now inactive due to auto deactivation removed underperforming models