Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v1-q235b-l-99625-v11-uploader
Waiting for job on chaiml-pony-v1-q235b-l-99625-v11-uploader to finish
chaiml-pony-v1-q235b-l-99625-v11-uploader: Using quantization_mode: w4a16
chaiml-pony-v1-q235b-l-99625-v11-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-l-99625-v11-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v1-q235b-l-99625-v11-uploader: Downloading snapshot of ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16...
chaiml-pony-v1-q235b-l-99625-v11-uploader: Downloaded in 54.293s
chaiml-pony-v1-q235b-l-99625-v11-uploader: Processed model ChaiML/pony-v1-q235b-lr1e4ep1r64g4 in 54.821s
chaiml-pony-v1-q235b-l-99625-v11-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v1-q235b-l-99625-v11-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v11-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v1-q235b-l-99625-v11-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v1-q235b-l-99625-v11-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v1-q235b-l-99625-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v11-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-l-99625-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v11-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-l-99625-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v11-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-l-99625-v11-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-l-99625-v11-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-l-99625-v11-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v1-q235b-l-99625-v11-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v1-q235b-l-99625-v11-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v1-q235b-l-99625-v11-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v1-q235b-l-99625-v11-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v1-q235b-l-99625-v11-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/.gitattributes
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/chat_template.jinja
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/special_tokens_map.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/tokenizer_config.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/quantization_config.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/generation_config.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/config.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/added_tokens.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/merges.txt
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model.safetensors.index.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/vocab.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/tokenizer.json
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00027-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00019-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00023-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00002-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00008-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00007-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00014-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00021-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00009-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00017-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00006-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00005-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00003-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00010-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00004-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00024-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00011-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00022-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00016-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00026-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00012-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00025-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00018-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00013-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00020-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00001-of-00027.safetensors
chaiml-pony-v1-q235b-l-99625-v11-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-l-99625-v11/default/model-00015-of-00027.safetensors
Job chaiml-pony-v1-q235b-l-99625-v11-uploader completed after 175.83s with status: succeeded
Stopping job with name chaiml-pony-v1-q235b-l-99625-v11-uploader
Pipeline stage VLLMUploader completed in 176.23s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.56s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v1-q235b-l-99625-v11
Waiting for inference service chaiml-pony-v1-q235b-l-99625-v11 to be ready
Inference service chaiml-pony-v1-q235b-l-99625-v11 ready after 402.96444153785706s
Pipeline stage VLLMDeployer completed in 403.32s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.0933849811553955s
Received healthy response to inference request in 2.105041980743408s
Received healthy response to inference request in 2.215055227279663s
Received healthy response to inference request in 2.011380910873413s
Received healthy response to inference request in 1.9924981594085693s
Received healthy response to inference request in 1.910923719406128s
Received healthy response to inference request in 2.1827569007873535s
Received healthy response to inference request in 2.1528544425964355s
Received healthy response to inference request in 1.9379825592041016s
Received healthy response to inference request in 1.9302988052368164s
Received healthy response to inference request in 1.873861312866211s
Received healthy response to inference request in 1.993499517440796s
Received healthy response to inference request in 1.9693703651428223s
Received healthy response to inference request in 1.9814128875732422s
Received healthy response to inference request in 1.9240076541900635s
Received healthy response to inference request in 1.9677739143371582s
Received healthy response to inference request in 1.882566213607788s
Received healthy response to inference request in 2.0184326171875s
Received healthy response to inference request in 2.003267765045166s
Received healthy response to inference request in 2.0167763233184814s
Received healthy response to inference request in 2.0851778984069824s
Received healthy response to inference request in 2.1107938289642334s
Received healthy response to inference request in 2.0946688652038574s
Received healthy response to inference request in 1.9735078811645508s
Received healthy response to inference request in 1.914736270904541s
Received healthy response to inference request in 1.9625334739685059s
Received healthy response to inference request in 1.9376184940338135s
Received healthy response to inference request in 2.0682425498962402s
Received healthy response to inference request in 1.9225971698760986s
Received healthy response to inference request in 1.9419357776641846s
30 requests
0 failed requests
5th percentile: 1.895327091217041
10th percentile: 1.9143550157546998
20th percentile: 1.9290405750274657
30th percentile: 1.9407498121261597
40th percentile: 1.9687317848205566
50th percentile: 1.9869555234909058
60th percentile: 2.0065130233764648
70th percentile: 2.0333755970001217
80th percentile: 2.093641757965088
90th percentile: 2.1149998903274536
95th percentile: 2.1693007946014404
99th percentile: 2.2056887125968934
mean time: 2.0058319489161174
Pipeline stage StressChecker completed in 63.04s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.74s
Shutdown handler de-registered
chaiml-pony-v1-q235b-l_99625_v11 status is now deployed due to DeploymentManager action
chaiml-pony-v1-q235b-l_99625_v11 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v1-q235b-l_99625_v11 status is now torndown due to DeploymentManager action