Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d2-q235b-pv-37537-v3-uploader
Waiting for job on chaiml-pony-d2-q235b-pv-37537-v3-uploader to finish
chaiml-pony-d2-q235b-pv-37537-v3-uploader: Using quantization_mode: w4a16
chaiml-pony-d2-q235b-pv-37537-v3-uploader: Checking if ChaiML/pony-d2-q235b-pv2-lr5e6ep2r64g4-W4A16 already exists in ChaiML
chaiml-pony-d2-q235b-pv-37537-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d2-q235b-pv-37537-v3-uploader: Downloading snapshot of ChaiML/pony-d2-q235b-pv2-lr5e6ep2r64g4-W4A16...
chaiml-pony-d2-q235b-pv-37537-v3-uploader: Downloaded in 57.214s
chaiml-pony-d2-q235b-pv-37537-v3-uploader: Processed model ChaiML/pony-d2-q235b-pv2-lr5e6ep2r64g4 in 57.964s
chaiml-pony-d2-q235b-pv-37537-v3-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d2-q235b-pv-37537-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-37537-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d2-q235b-pv-37537-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d2-q235b-pv-37537-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d2-q235b-pv-37537-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-37537-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d2-q235b-pv-37537-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-37537-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d2-q235b-pv-37537-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-37537-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d2-q235b-pv-37537-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d2-q235b-pv-37537-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d2-q235b-pv-37537-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d2-q235b-pv-37537-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d2-q235b-pv-37537-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d2-q235b-pv-37537-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
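Note on the SyntaxWarning lines above: they come from regex patterns written as ordinary string literals, where sequences such as `\.` or `\w` are not valid Python string escapes. They are warnings only and the upload proceeds. A minimal sketch of the usual remedy, raw string literals, is shown below. The function names are hypothetical and are not part of the S3 package; only the patterns are taken from the warning output.

```python
import re

# Raw-string versions of two patterns quoted in the warnings above.
# The r"" prefix passes backslashes to the regex engine unchanged,
# so Python no longer sees an invalid string escape.

def has_invalid_bucket_chars(bucket: str) -> bool:
    """Hypothetical helper: True if the name has a character outside a-z, 0-9, '.', '-'."""
    return re.search(r"([^a-z0-9\.-])", bucket, re.UNICODE) is not None

def has_consecutive_dots(bucket: str) -> bool:
    """Hypothetical helper: True if the name contains '..'."""
    return re.search(r"\.\.", bucket, re.UNICODE) is not None

if __name__ == "__main__":
    print(has_invalid_bucket_chars("guanaco-vllm-models"))
    print(has_consecutive_dots("guanaco..models"))
```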
chaiml-pony-d2-q235b-pv-37537-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d2-q235b-pv-37537-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/.gitattributes
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/merges.txt
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/generation_config.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/config.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/special_tokens_map.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/added_tokens.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/chat_template.jinja
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/tokenizer_config.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/vocab.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model.safetensors.index.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/tokenizer.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/quantization_config.json
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00027-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00008-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00011-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00020-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00010-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00005-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00006-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00019-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00009-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00003-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00022-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00025-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00026-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00007-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00018-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00012-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00024-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00002-of-00027.safetensors
chaiml-pony-d2-q235b-pv-37537-v3-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-d2-q235b-pv-37537-v3/default/model-00013-of-00027.safetensors
Job chaiml-pony-d2-q235b-pv-37537-v3-uploader completed after 156.63s with status: succeeded
Stopping job with name chaiml-pony-d2-q235b-pv-37537-v3-uploader
Pipeline stage VLLMUploader completed in 157.13s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.27s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d2-q235b-pv-37537-v3
Waiting for inference service chaiml-pony-d2-q235b-pv-37537-v3 to be ready
Inference service chaiml-pony-d2-q235b-pv-37537-v3 ready after 470.38603258132935s
Pipeline stage VLLMDeployer completed in 470.88s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 7.396138668060303s
Received healthy response to inference request in 12.491618156433105s
Received healthy response to inference request in 2.171924114227295s
Received healthy response to inference request in 2.5040042400360107s
Received healthy response to inference request in 2.218446969985962s
Received healthy response to inference request in 2.22048020362854s
Received healthy response to inference request in 2.0684974193573s
Received healthy response to inference request in 2.1270155906677246s
Received healthy response to inference request in 2.1824331283569336s
Received healthy response to inference request in 2.4501967430114746s
Received healthy response to inference request in 2.1020538806915283s
Received healthy response to inference request in 2.250802516937256s
Received healthy response to inference request in 2.134626865386963s
Received healthy response to inference request in 2.2305970191955566s
Received healthy response to inference request in 2.4907801151275635s
Received healthy response to inference request in 2.5208616256713867s
Received healthy response to inference request in 2.1392805576324463s
Received healthy response to inference request in 2.1364879608154297s
Received healthy response to inference request in 17.326833486557007s
Received healthy response to inference request in 2.2934587001800537s
Received healthy response to inference request in 2.2292866706848145s
Received healthy response to inference request in 2.293201446533203s
Received healthy response to inference request in 2.2662312984466553s
Received healthy response to inference request in 2.2196552753448486s
Retrying (%r) after connection broken by '%r': %s
Received healthy response to inference request in 4.603009223937988s
Received healthy response to inference request in 8.04674506187439s
Received healthy response to inference request in 2.2085940837860107s
Received healthy response to inference request in 2.8519797325134277s
Received healthy response to inference request in 2.5956552028656006s
Received healthy response to inference request in 2.416835308074951s
30 requests
0 failed requests
5th percentile: 2.1132866501808167
10th percentile: 2.133865737915039
20th percentile: 2.165395402908325
30th percentile: 2.2154911041259764
40th percentile: 2.2257640838623045
50th percentile: 2.2585169076919556
60th percentile: 2.3428093433380126
70th percentile: 2.494747352600098
80th percentile: 2.646920108795167
90th percentile: 7.461199307441713
95th percentile: 10.49142526388167
99th percentile: 15.92462104082108
mean time: 3.5729243755340576
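Note on the summary above: the reported percentiles are consistent with linear interpolation between the sorted samples, which is the default behaviour of `numpy.percentile`. The log does not state how StressChecker computes them, so that is an assumption. A minimal standard-library sketch that reproduces the figures from the 30 response times logged above (the `percentile` helper is illustrative, not the pipeline's own code):

```python
import math

# The 30 response times (seconds) from the StressChecker lines above, in log order.
latencies = [
    7.396138668060303, 12.491618156433105, 2.171924114227295,
    2.5040042400360107, 2.218446969985962, 2.22048020362854,
    2.0684974193573, 2.1270155906677246, 2.1824331283569336,
    2.4501967430114746, 2.1020538806915283, 2.250802516937256,
    2.134626865386963, 2.2305970191955566, 2.4907801151275635,
    2.5208616256713867, 2.1392805576324463, 2.1364879608154297,
    17.326833486557007, 2.2934587001800537, 2.2292866706848145,
    2.293201446533203, 2.2662312984466553, 2.2196552753448486,
    4.603009223937988, 8.04674506187439, 2.2085940837860107,
    2.8519797325134277, 2.5956552028656006, 2.416835308074951,
]

def percentile(values, q):
    """q-th percentile using linear interpolation between closest ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

if __name__ == "__main__":
    print(f"{len(latencies)} requests")
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {sum(latencies) / len(latencies)}")
```

Five of the 30 requests took longer than 4 s, which is why the 90th percentile and above sit far from the median of about 2.26 s and pull the mean up to about 3.57 s.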
Pipeline stage StressChecker completed in 126.47s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.13s
Shutdown handler de-registered
chaiml-pony-d2-q235b-pv_37537_v3 status is now deployed due to DeploymentManager action
chaiml-pony-d2-q235b-pv_37537_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d2-q235b-pv_37537_v3 status is now inactive due to Froze recruitment for AB test 0305_zzai