Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v1-q235b-lr-99625-v9-uploader
Waiting for job on chaiml-pony-v1-q235b-lr-99625-v9-uploader to finish
chaiml-pony-v1-q235b-lr-99625-v9-uploader: Using quantization_mode: w4a16
chaiml-pony-v1-q235b-lr-99625-v9-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-lr-99625-v9-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v1-q235b-lr-99625-v9-uploader: Downloading snapshot of ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16...
chaiml-pony-v1-q235b-lr-99625-v9-uploader: Downloaded in 47.990s
chaiml-pony-v1-q235b-lr-99625-v9-uploader: Processed model ChaiML/pony-v1-q235b-lr1e4ep1r64g4 in 48.649s
chaiml-pony-v1-q235b-lr-99625-v9-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v1-q235b-lr-99625-v9-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v9-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v1-q235b-lr-99625-v9-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v1-q235b-lr-99625-v9-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v1-q235b-lr-99625-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v9-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v9-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v9-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v9-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v9-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v9-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v1-q235b-lr-99625-v9-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v9-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v1-q235b-lr-99625-v9-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
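Note on the SyntaxWarning lines above: they are harmless for this run (the bucket is created and the upload proceeds). They appear because the quoted regex patterns use backslash escapes such as `\.` and `\w` in ordinary string literals, which recent Python versions flag at compile time. A minimal sketch of the usual remedy, raw string literals, is shown below. The constant and function names are hypothetical and chosen for illustration; this is not the S3 package's actual code, only three of the quoted patterns rewritten as raw strings.

```python
import re

# Raw-string versions of three patterns quoted in the warnings above.
# The regex text is unchanged; only the literal prefix r"..." is added,
# so Python no longer has to interpret "\." or "\w" as string escapes.
BUCKET_INVALID_CHARS = re.compile(r"([^a-z0-9\.-])", re.UNICODE)  # hypothetical name
URI_SCHEME = re.compile(r"^(\w+://)?(.*)", re.UNICODE)            # hypothetical name
WILDCARD = re.compile(r"\*|\?")                                   # hypothetical name


def first_invalid_char(bucket):
    """Return the first character not allowed by the pattern, or None."""
    match = BUCKET_INVALID_CHARS.search(bucket)
    return match.group(1) if match else None


def split_uri(uri):
    """Split a URI into (scheme, remainder); scheme is None when absent."""
    return URI_SCHEME.match(uri).groups()


if __name__ == "__main__":
    print(first_invalid_char("guanaco-vllm-models"))
    print(split_uri("s3://guanaco-vllm-models/"))
    print(WILDCARD.split("model-*.safetensors", maxsplit=1))
```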
chaiml-pony-v1-q235b-lr-99625-v9-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v1-q235b-lr-99625-v9-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/config.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/.gitattributes
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/quantization_config.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/tokenizer_config.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/added_tokens.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/special_tokens_map.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/generation_config.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/vocab.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model.safetensors.index.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/merges.txt
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/chat_template.jinja
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/tokenizer.json
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00027-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00001-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00023-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00013-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00025-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00021-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00007-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00022-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00004-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00009-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00026-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00011-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00014-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00012-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00020-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00024-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00002-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00010-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00018-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00016-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00017-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00019-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00008-of-00027.safetensors
chaiml-pony-v1-q235b-lr-99625-v9-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v9/default/model-00005-of-00027.safetensors
Job chaiml-pony-v1-q235b-lr-99625-v9-uploader completed after 188.32s with status: succeeded
Stopping job with name chaiml-pony-v1-q235b-lr-99625-v9-uploader
Pipeline stage VLLMUploader completed in 188.89s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.31s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v1-q235b-lr-99625-v9
Waiting for inference service chaiml-pony-v1-q235b-lr-99625-v9 to be ready
Inference service chaiml-pony-v1-q235b-lr-99625-v9 ready after 610.411315202713s
Pipeline stage VLLMDeployer completed in 610.80s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.1237781047821045s
Received healthy response to inference request in 1.894866943359375s
Received healthy response to inference request in 2.0453803539276123s
Received healthy response to inference request in 1.9848647117614746s
Received healthy response to inference request in 2.0022923946380615s
Received healthy response to inference request in 1.8571197986602783s
Received healthy response to inference request in 2.2886621952056885s
Received healthy response to inference request in 2.0089058876037598s
Received healthy response to inference request in 1.9043428897857666s
Received healthy response to inference request in 2.118269920349121s
Received healthy response to inference request in 2.0023159980773926s
Received healthy response to inference request in 2.0040786266326904s
Received healthy response to inference request in 1.8783559799194336s
Received healthy response to inference request in 1.8967022895812988s
Received healthy response to inference request in 2.084850549697876s
Received healthy response to inference request in 1.9708383083343506s
Received healthy response to inference request in 2.002136707305908s
Received healthy response to inference request in 2.0292086601257324s
Received healthy response to inference request in 1.9316847324371338s
Received healthy response to inference request in 2.032411813735962s
Received healthy response to inference request in 2.0806968212127686s
Received healthy response to inference request in 1.8885600566864014s
Received healthy response to inference request in 2.110222816467285s
Received healthy response to inference request in 2.288205146789551s
Received healthy response to inference request in 2.218407154083252s
Received healthy response to inference request in 2.1711153984069824s
Received healthy response to inference request in 1.9940979480743408s
Received healthy response to inference request in 2.08475399017334s
Received healthy response to inference request in 2.0775740146636963s
Received healthy response to inference request in 1.9395153522491455s
30 requests
0 failed requests
5th percentile: 1.882947814464569
10th percentile: 1.8942362546920777
20th percentile: 1.9262163639068604
30th percentile: 1.9806567907333374
40th percentile: 2.0022301197052004
50th percentile: 2.006492257118225
60th percentile: 2.037599229812622
70th percentile: 2.08191397190094
80th percentile: 2.1118322372436524
90th percentile: 2.1758445739746093
95th percentile: 2.256796050071716
99th percentile: 2.2885296511650086
mean time: 2.0304738521575927
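Note on the summary above: the percentile and mean figures can be reproduced from the 30 per-request timings logged by StressChecker. The sketch below assumes linear interpolation between order statistics. That method agrees with the logged values, but the log does not say how StressChecker computes them, so treat the `percentile` helper as an illustrative assumption rather than the pipeline's own code.

```python
import math

# Response times in seconds, copied from the 30 "healthy response" lines above.
LATENCIES = [
    2.1237781047821045, 1.894866943359375, 2.0453803539276123,
    1.9848647117614746, 2.0022923946380615, 1.8571197986602783,
    2.2886621952056885, 2.0089058876037598, 1.9043428897857666,
    2.118269920349121, 2.0023159980773926, 2.0040786266326904,
    1.8783559799194336, 1.8967022895812988, 2.084850549697876,
    1.9708383083343506, 2.002136707305908, 2.0292086601257324,
    1.9316847324371338, 2.032411813735962, 2.0806968212127686,
    1.8885600566864014, 2.110222816467285, 2.288205146789551,
    2.218407154083252, 2.1711153984069824, 1.9940979480743408,
    2.08475399017334, 2.0775740146636963, 1.9395153522491455,
]


def percentile(samples, q):
    """Percentile q (0..100) using linear interpolation between neighbors."""
    ordered = sorted(samples)
    if not ordered:
        raise ValueError("percentile() requires at least one sample")
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def mean(samples):
    """Arithmetic mean of the samples."""
    return sum(samples) / len(samples)


if __name__ == "__main__":
    for q in (5, 50, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {mean(LATENCIES)}")
```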
Pipeline stage StressChecker completed in 66.25s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.00s
Shutdown handler de-registered
chaiml-pony-v1-q235b-lr_99625_v9 status is now deployed due to DeploymentManager action
chaiml-pony-v1-q235b-lr_99625_v9 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v1-q235b-lr_99625_v9 status is now torndown due to DeploymentManager action