Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Connection pool is full, discarding connection: %s. Connection pool size: %s (repeated 10 times)
Starting job with name chaiml-pony-v1-q235b-lr-32150-v2-uploader
Waiting for job on chaiml-pony-v1-q235b-lr-32150-v2-uploader to finish
chaiml-pony-v1-q235b-lr-32150-v2-uploader: Using quantization_mode: w4a16
chaiml-pony-v1-q235b-lr-32150-v2-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g8-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-lr-32150-v2-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v1-q235b-lr-32150-v2-uploader: Downloading snapshot of ChaiML/pony-v1-q235b-lr1e4ep1r64g8-W4A16...
chaiml-pony-v1-q235b-lr-32150-v2-uploader: Downloaded in 41.803s
chaiml-pony-v1-q235b-lr-32150-v2-uploader: Processed model ChaiML/pony-v1-q235b-lr1e4ep1r64g8 in 42.421s
chaiml-pony-v1-q235b-lr-32150-v2-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v1-q235b-lr-32150-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-32150-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v1-q235b-lr-32150-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v1-q235b-lr-32150-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v1-q235b-lr-32150-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-32150-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-32150-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-32150-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-32150-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-32150-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-32150-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-32150-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-32150-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v1-q235b-lr-32150-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v1-q235b-lr-32150-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v1-q235b-lr-32150-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
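Note on the SyntaxWarning lines above: each one points at a regular-expression pattern written as an ordinary string literal, where sequences such as '\.' or '\w' are not recognized string escapes. Python 3.12 and later report these as SyntaxWarning. The warnings do not stop the upload. A minimal sketch of the usual remedy, raw string literals, is shown below for a few of the quoted patterns. It assumes the intended regular expressions are unchanged; the constant names are illustrative and are not taken from the S3 package.

```python
import re

# Raw-string versions of patterns quoted in the warnings.
# In a raw string the backslash reaches the regex engine unchanged,
# so the compiler no longer sees an invalid string escape.
DATESTRING = re.compile(r"\.[0-9]*(?:[Z\-\+]*?)")   # from BaseUtils.py:56
DOUBLE_DOT = re.compile(r"\.\.", re.UNICODE)         # from Utils.py:257
URI_SCHEME = re.compile(r"^(\w+://)?(.*)", re.UNICODE)  # from S3Uri.py:155


def split_on_wildcard(uri_str):
    """Split a URI at its first '*' or '?', as in FileLists.py:480."""
    return re.split(r"\*|\?", uri_str, maxsplit=1)


def has_double_dot(bucket):
    """Return True if the bucket name contains two consecutive dots."""
    return DOUBLE_DOT.search(bucket) is not None
```

The patterns behave the same as before because Python already passed the unrecognized escapes through verbatim; the raw strings only make that explicit and silence the warning.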
chaiml-pony-v1-q235b-lr-32150-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v1-q235b-lr-32150-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/generation_config.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/chat_template.jinja
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/.gitattributes
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/config.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/added_tokens.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/quantization_config.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/special_tokens_map.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model.safetensors.index.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/merges.txt
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/tokenizer_config.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/vocab.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/tokenizer.json
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00027-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00005-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00001-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00001-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00018-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00007-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00020-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00003-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00023-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00024-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00006-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00012-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00016-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00008-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00002-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00009-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00015-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00015-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00022-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00025-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00019-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00021-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00004-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00011-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00017-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00010-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00013-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00014-of-00027.safetensors
chaiml-pony-v1-q235b-lr-32150-v2-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-32150-v2/default/model-00026-of-00027.safetensors
Job chaiml-pony-v1-q235b-lr-32150-v2-uploader completed after 107.44s with status: succeeded
Stopping job with name chaiml-pony-v1-q235b-lr-32150-v2-uploader
Pipeline stage VLLMUploader completed in 110.64s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v1-q235b-lr-32150-v2
Waiting for inference service chaiml-pony-v1-q235b-lr-32150-v2 to be ready
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service chaiml-pony-v1-q235b-lr-32150-v2 ready after 390.2678084373474s
Pipeline stage VLLMDeployer completed in 390.78s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.0526251792907715s
Received healthy response to inference request in 2.0683529376983643s
Received healthy response to inference request in 2.0071778297424316s
Received healthy response to inference request in 2.0828258991241455s
Received healthy response to inference request in 2.0669262409210205s
Received healthy response to inference request in 1.8735034465789795s
Received healthy response to inference request in 1.9467692375183105s
Received healthy response to inference request in 2.0609209537506104s
Received healthy response to inference request in 2.3729188442230225s
Received healthy response to inference request in 1.9908998012542725s
Received healthy response to inference request in 2.236743211746216s
Received healthy response to inference request in 2.0181541442871094s
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Received healthy response to inference request in 1.9618778228759766s
Received healthy response to inference request in 2.308513641357422s
Received healthy response to inference request in 1.912660837173462s
Received healthy response to inference request in 1.9167637825012207s
Received healthy response to inference request in 1.853074312210083s
Received healthy response to inference request in 1.9905872344970703s
Received healthy response to inference request in 2.1841883659362793s
Received healthy response to inference request in 1.9957597255706787s
Received healthy response to inference request in 1.9327380657196045s
Received healthy response to inference request in 1.9750933647155762s
Received healthy response to inference request in 2.172356128692627s
Received healthy response to inference request in 2.089750289916992s
Received healthy response to inference request in 1.997464895248413s
Received healthy response to inference request in 2.010117530822754s
Received healthy response to inference request in 1.9135174751281738s
Received healthy response to inference request in 2.112720251083374s
Received healthy response to inference request in 1.909377098083496s
Received healthy response to inference request in 2.0612423419952393s
30 requests
0 failed requests
5th percentile: 1.889646589756012
10th percentile: 1.9123324632644654
20th percentile: 1.9295432090759277
30th percentile: 1.9711287021636963
40th percentile: 1.9938157558441163
50th percentile: 2.0086476802825928
60th percentile: 2.055943489074707
70th percentile: 2.0673542499542235
80th percentile: 2.0943442821502685
90th percentile: 2.189443850517273
95th percentile: 2.276216948032379
99th percentile: 2.354241335391998
mean time: 2.0358540296554564
Pipeline stage StressChecker completed in 65.21s
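Note on the StressChecker summary above: the reported percentiles appear consistent with linear interpolation between the closest ranks of the 30 sorted latencies. The sketch below reproduces that calculation with the standard library only. It is an assumption about how the summary could be computed, not the pipeline's actual code; the function name `percentile` and the constant `LATENCIES` are hypothetical, and the values are copied from the "healthy response" lines in this log.

```python
import math
from statistics import mean

# Latencies in seconds, copied from the 30 healthy responses logged above.
LATENCIES = [
    2.0526251792907715, 2.0683529376983643, 2.0071778297424316,
    2.0828258991241455, 2.0669262409210205, 1.8735034465789795,
    1.9467692375183105, 2.0609209537506104, 2.3729188442230225,
    1.9908998012542725, 2.236743211746216, 2.0181541442871094,
    1.9618778228759766, 2.308513641357422, 1.912660837173462,
    1.9167637825012207, 1.853074312210083, 1.9905872344970703,
    2.1841883659362793, 1.9957597255706787, 1.9327380657196045,
    1.9750933647155762, 2.172356128692627, 2.089750289916992,
    1.997464895248413, 2.010117530822754, 1.9135174751281738,
    2.112720251083374, 1.909377098083496, 2.0612423419952393,
]


def percentile(samples, q):
    """Return the q-th percentile (0 to 100) by linear interpolation."""
    if not samples:
        raise ValueError("samples must be non-empty")
    ordered = sorted(samples)
    # Fractional position of the percentile within the sorted samples.
    rank = (q / 100) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


def summarize(samples, quantiles=(5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99)):
    """Build a summary similar in shape to the one logged by the stage."""
    summary = {f"{q}th percentile": percentile(samples, q) for q in quantiles}
    summary["mean time"] = mean(samples)
    return summary
```

Calling `summarize(LATENCIES)` returns a dictionary keyed by the same labels as the log lines, which makes it straightforward to compare a later run against this one.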
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.67s
Shutdown handler de-registered
chaiml-pony-v1-q235b-lr_32150_v2 status is now deployed due to DeploymentManager action
chaiml-pony-v1-q235b-lr_32150_v2 status is now inactive due to auto deactivation removed underperforming models