Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v1-q235b-lr-99625-v6-uploader
Waiting for job on chaiml-pony-v1-q235b-lr-99625-v6-uploader to finish
chaiml-pony-v1-q235b-lr-99625-v6-uploader: Using quantization_mode: w4a16
chaiml-pony-v1-q235b-lr-99625-v6-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-lr-99625-v6-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v1-q235b-lr-99625-v6-uploader: Downloading snapshot of ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16...
chaiml-pony-v1-q235b-lr-99625-v6-uploader: Downloaded in 40.031s
chaiml-pony-v1-q235b-lr-99625-v6-uploader: Processed model ChaiML/pony-v1-q235b-lr1e4ep1r64g4 in 40.584s
chaiml-pony-v1-q235b-lr-99625-v6-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v1-q235b-lr-99625-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v6-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
chaiml-pony-v1-q235b-lr-99625-v6-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
chaiml-pony-v1-q235b-lr-99625-v6-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
chaiml-pony-v1-q235b-lr-99625-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
chaiml-pony-v1-q235b-lr-99625-v6-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v6-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v6-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v6-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v6-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v6-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v1-q235b-lr-99625-v6-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v6-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v1-q235b-lr-99625-v6-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v1-q235b-lr-99625-v6-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v1-q235b-lr-99625-v6-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/added_tokens.json
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/.gitattributes
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/generation_config.json
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/tokenizer_config.json
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/config.json
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/merges.txt
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/chat_template.jinja
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/quantization_config.json
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/tokenizer.json
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/model.safetensors.index.json
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/special_tokens_map.json
chaiml-pony-v1-q235b-lr-99625-v6-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v6/default/vocab.json
Job chaiml-pony-v1-q235b-lr-99625-v6-uploader completed after 206.25s with status: succeeded
Stopping job with name chaiml-pony-v1-q235b-lr-99625-v6-uploader
Pipeline stage VLLMUploader completed in 228.92s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 6.23s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v1-q235b-lr-99625-v6
Waiting for inference service chaiml-pony-v1-q235b-lr-99625-v6 to be ready
Inference service chaiml-pony-v1-q235b-lr-99625-v6 ready after 502.5408251285553s
Pipeline stage VLLMDeployer completed in 519.91s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 6.8991968631744385s
Received healthy response to inference request in 3.2761642932891846s
Received healthy response to inference request in 2.505444288253784s
Received healthy response to inference request in 3.541168212890625s
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Received healthy response to inference request in 5.640349626541138s
Received healthy response to inference request in 3.4551780223846436s
Received healthy response to inference request in 2.5915775299072266s
Received healthy response to inference request in 3.0514156818389893s
Received healthy response to inference request in 4.737245082855225s
Received healthy response to inference request in 6.707557201385498s
Received healthy response to inference request in 5.152626991271973s
Received healthy response to inference request in 2.6071572303771973s
Received healthy response to inference request in 4.063845157623291s
Received healthy response to inference request in 7.878023147583008s
Received healthy response to inference request in 4.6074464321136475s
Received healthy response to inference request in 5.191849231719971s
Received healthy response to inference request in 5.284888505935669s
Received healthy response to inference request in 3.90673565864563s
Received healthy response to inference request in 3.7755279541015625s
Received healthy response to inference request in 4.766114950180054s
Received healthy response to inference request in 6.929657697677612s
Received healthy response to inference request in 4.232197999954224s
Received healthy response to inference request in 2.9064459800720215s
Received healthy response to inference request in 3.665740728378296s
Received healthy response to inference request in 4.429396629333496s
Received healthy response to inference request in 3.4312310218811035s
Received healthy response to inference request in 2.111057758331299s
Received healthy response to inference request in 2.485949993133545s
Received healthy response to inference request in 2.025247573852539s
Received healthy response to inference request in 1.8559765815734863s
30 requests
0 failed requests
5th percentile: 2.063862156867981
10th percentile: 2.4484607696533205
20th percentile: 2.6040412902832033
30th percentile: 3.2087397098541257
40th percentile: 3.5067721366882325
50th percentile: 3.841131806373596
60th percentile: 4.311077451705932
70th percentile: 4.745906043052673
80th percentile: 5.21045708656311
90th percentile: 6.726721167564392
95th percentile: 6.915950322151184
99th percentile: 7.602997167110444
mean time: 4.1237471342086796
Pipeline stage StressChecker completed in 218.08s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.63s
Shutdown handler de-registered
chaiml-pony-v1-q235b-lr_99625_v6 status is now deployed due to DeploymentManager action
chaiml-pony-v1-q235b-lr_99625_v6 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v1-q235b-lr_99625_v6 status is now torndown due to DeploymentManager action