Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d1-q235b-pv-39858-v2-uploader
Waiting for job on chaiml-pony-d1-q235b-pv-39858-v2-uploader to finish
chaiml-pony-d1-q235b-pv-39858-v2-uploader: Using quantization_mode: w4a16
chaiml-pony-d1-q235b-pv-39858-v2-uploader: Checking if ChaiML/pony-d1-q235b-pv2-lr5e6ep2r64g4-W4A16 already exists in ChaiML
chaiml-pony-d1-q235b-pv-39858-v2-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d1-q235b-pv-39858-v2-uploader: Downloading snapshot of ChaiML/pony-d1-q235b-pv2-lr5e6ep2r64g4-W4A16...
chaiml-pony-d1-q235b-pv-39858-v2-uploader: Downloaded in 42.436s
chaiml-pony-d1-q235b-pv-39858-v2-uploader: Processed model ChaiML/pony-d1-q235b-pv2-lr5e6ep2r64g4 in 42.982s
chaiml-pony-d1-q235b-pv-39858-v2-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d1-q235b-pv-39858-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d1-q235b-pv-39858-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d1-q235b-pv-39858-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d1-q235b-pv-39858-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d1-q235b-pv-39858-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d1-q235b-pv-39858-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d1-q235b-pv-39858-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d1-q235b-pv-39858-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d1-q235b-pv-39858-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d1-q235b-pv-39858-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d1-q235b-pv-39858-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d1-q235b-pv-39858-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d1-q235b-pv-39858-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d1-q235b-pv-39858-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d1-q235b-pv-39858-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d1-q235b-pv-39858-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d1-q235b-pv-39858-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d1-q235b-pv-39858-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/.gitattributes
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/chat_template.jinja
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/tokenizer_config.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/quantization_config.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/generation_config.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/config.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/special_tokens_map.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model.safetensors.index.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/added_tokens.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/merges.txt
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/vocab.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/tokenizer.json
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00027-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00017-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00002-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00013-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00011-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00016-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00026-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00005-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00023-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00018-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00003-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00014-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00019-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00009-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00004-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00001-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00010-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00021-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00007-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00025-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00022-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00015-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00012-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00024-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00006-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00020-of-00027.safetensors
chaiml-pony-d1-q235b-pv-39858-v2-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d1-q235b-pv-39858-v2/default/model-00008-of-00027.safetensors
Job chaiml-pony-d1-q235b-pv-39858-v2-uploader completed after 177.56s with status: succeeded
Stopping job with name chaiml-pony-d1-q235b-pv-39858-v2-uploader
Pipeline stage VLLMUploader completed in 178.37s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.54s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d1-q235b-pv-39858-v2
Waiting for inference service chaiml-pony-d1-q235b-pv-39858-v2 to be ready
Inference service chaiml-pony-d1-q235b-pv-39858-v2 ready after 423.7805857658386s
Pipeline stage VLLMDeployer completed in 424.33s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.2188901901245117s
Received healthy response to inference request in 2.030675172805786s
Received healthy response to inference request in 2.0253849029541016s
Received healthy response to inference request in 1.955425500869751s
Received healthy response to inference request in 1.8827135562896729s
Received healthy response to inference request in 2.010509967803955s
Received healthy response to inference request in 1.9880824089050293s
Received healthy response to inference request in 1.9865472316741943s
Received healthy response to inference request in 2.0623631477355957s
Received healthy response to inference request in 2.0783655643463135s
Received healthy response to inference request in 2.0727524757385254s
Received healthy response to inference request in 1.9950380325317383s
Received healthy response to inference request in 2.15142822265625s
Received healthy response to inference request in 1.840580940246582s
Received healthy response to inference request in 2.370170831680298s
Received healthy response to inference request in 2.2817962169647217s
Received healthy response to inference request in 2.2036893367767334s
Received healthy response to inference request in 2.022946834564209s
Received healthy response to inference request in 1.935469150543213s
Received healthy response to inference request in 1.9498696327209473s
Received healthy response to inference request in 2.15969181060791s
Received healthy response to inference request in 1.8715100288391113s
Received healthy response to inference request in 2.056483268737793s
Received healthy response to inference request in 1.9072556495666504s
Received healthy response to inference request in 2.2198867797851562s
Received healthy response to inference request in 2.1654610633850098s
Received healthy response to inference request in 2.193885087966919s
Received healthy response to inference request in 2.0285279750823975s
Received healthy response to inference request in 2.3609020709991455s
Received healthy response to inference request in 1.9661834239959717s
30 requests
0 failed requests
5th percentile: 1.876551616191864
10th percentile: 1.9048014402389526
20th percentile: 1.9543143272399903
30th percentile: 1.9876218557357788
40th percentile: 2.0179720878601075
50th percentile: 2.029601573944092
60th percentile: 2.0665188789367677
70th percentile: 2.153907299041748
80th percentile: 2.195845937728882
90th percentile: 2.226077723503113
95th percentile: 2.3253044366836546
99th percentile: 2.367482891082764
mean time: 2.0664162158966066
Pipeline stage StressChecker completed in 66.21s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.31s
Shutdown handler de-registered
chaiml-pony-d1-q235b-pv_39858_v2 status is now deployed due to DeploymentManager action
chaiml-pony-d1-q235b-pv_39858_v2 status is now inactive due to auto deactivation removed underperforming models