Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v1-q235b-lr-99625-v3-uploader
Waiting for job on chaiml-pony-v1-q235b-lr-99625-v3-uploader to finish
chaiml-pony-v1-q235b-lr-99625-v3-uploader: Using quantization_mode: w4a16
chaiml-pony-v1-q235b-lr-99625-v3-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-lr-99625-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v1-q235b-lr-99625-v3-uploader: Downloading snapshot of ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16...
chaiml-pony-v1-q235b-lr-99625-v3-uploader: Downloaded in 47.377s
chaiml-pony-v1-q235b-lr-99625-v3-uploader: Processed model ChaiML/pony-v1-q235b-lr1e4ep1r64g4 in 47.922s
chaiml-pony-v1-q235b-lr-99625-v3-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v1-q235b-lr-99625-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v1-q235b-lr-99625-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v1-q235b-lr-99625-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v1-q235b-lr-99625-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
chaiml-pony-v1-q235b-lr-99625-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v1-q235b-lr-99625-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v1-q235b-lr-99625-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v1-q235b-lr-99625-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v1-q235b-lr-99625-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/generation_config.json
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/special_tokens_map.json
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/config.json
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/quantization_config.json
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/.gitattributes
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/chat_template.jinja
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/added_tokens.json
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/tokenizer_config.json
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/vocab.json
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/merges.txt
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/model.safetensors.index.json
chaiml-pony-v1-q235b-lr-99625-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v3/default/tokenizer.json
Job chaiml-pony-v1-q235b-lr-99625-v3-uploader completed after 251.44s with status: succeeded
Stopping job with name chaiml-pony-v1-q235b-lr-99625-v3-uploader
Pipeline stage VLLMUploader completed in 256.45s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.39s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v1-q235b-lr-99625-v3
Waiting for inference service chaiml-pony-v1-q235b-lr-99625-v3 to be ready
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Inference service chaiml-pony-v1-q235b-lr-99625-v3 ready after 451.32388401031494s
Pipeline stage VLLMDeployer completed in 461.39s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 6.462724208831787s
Received healthy response to inference request in 3.480492353439331s
Received healthy response to inference request in 6.004939556121826s
Received healthy response to inference request in 4.038223743438721s
Received healthy response to inference request in 5.649416208267212s
Received healthy response to inference request in 2.2187705039978027s
Received healthy response to inference request in 2.946141242980957s
Received healthy response to inference request in 2.0041821002960205s
Received healthy response to inference request in 3.2620155811309814s
Received healthy response to inference request in 1.7564146518707275s
Received healthy response to inference request in 1.7766048908233643s
Received healthy response to inference request in 2.1316323280334473s
Received healthy response to inference request in 3.099661350250244s
Received healthy response to inference request in 2.1356937885284424s
Received healthy response to inference request in 2.3896117210388184s
Received healthy response to inference request in 1.7654292583465576s
Received healthy response to inference request in 2.4120125770568848s
Received healthy response to inference request in 1.800262689590454s
Received healthy response to inference request in 1.8038034439086914s
Received healthy response to inference request in 2.2868175506591797s
Received healthy response to inference request in 2.168541431427002s
Received healthy response to inference request in 1.8144128322601318s
Received healthy response to inference request in 1.7954752445220947s
Received healthy response to inference request in 1.851532220840454s
Received healthy response to inference request in 2.1844840049743652s
Received healthy response to inference request in 1.9073028564453125s
Received healthy response to inference request in 1.8207340240478516s
Received healthy response to inference request in 1.79337739944458s
Received healthy response to inference request in 1.9461445808410645s
Received healthy response to inference request in 2.6162195205688477s
30 requests
0 failed requests
5th percentile: 1.7704582929611206
10th percentile: 1.7917001485824584
20th percentile: 1.803095293045044
30th percentile: 1.8422927618026734
40th percentile: 1.9809670925140381
50th percentile: 2.152117609977722
60th percentile: 2.2459893226623535
70th percentile: 2.473274660110473
80th percentile: 3.132132196426392
90th percentile: 4.199342989921572
95th percentile: 5.8449540495872485
99th percentile: 6.3299666595458985
mean time: 2.644102462132772
Pipeline stage StressChecker completed in 178.87s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 15.46s
Shutdown handler de-registered
chaiml-pony-v1-q235b-lr_99625_v3 status is now deployed due to DeploymentManager action
chaiml-pony-v1-q235b-lr_99625_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v1-q235b-lr_99625_v3 status is now torndown due to DeploymentManager action