Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v2-q32b-lr1-61519-v3-uploader
Waiting for job on chaiml-pony-v2-q32b-lr1-61519-v3-uploader to finish
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: Using quantization_mode: none
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: Downloading snapshot of ChaiML/pony-v2-q32b-lr1e4ep2r64g8...
Failed to get response for submission chaiml-pony-v1-q32b-2k_v4: HTTPConnectionPool(host='chaiml-pony-v1-q32b-2k-v4-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: Downloaded in 26.901s
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: Processed model ChaiML/pony-v2-q32b-lr1e4ep2r64g8 in 51.203s
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/chat_template.jinja
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/tokenizer_config.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model.safetensors.index.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/.gitattributes
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/args.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/config.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/merges.txt
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/training_args.bin s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/training_args.bin
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/special_tokens_map.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/trainer_state.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/trainer_state.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/generation_config.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/added_tokens.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/vocab.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/tokenizer.json
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00014-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00014-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00006-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00006-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00012-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00012-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00001-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00001-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00003-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00003-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00011-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00011-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00008-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00008-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00005-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00005-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00010-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00010-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00009-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00009-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00002-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00002-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00007-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00007-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00013-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00013-of-00014.safetensors
chaiml-pony-v2-q32b-lr1-61519-v3-uploader: cp /dev/shm/model_output/model-00004-of-00014.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-q32b-lr1-61519-v3/default/model-00004-of-00014.safetensors
Job chaiml-pony-v2-q32b-lr1-61519-v3-uploader completed after 86.08s with status: succeeded
Stopping job with name chaiml-pony-v2-q32b-lr1-61519-v3-uploader
Pipeline stage VLLMUploader completed in 86.79s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.52s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v2-q32b-lr1-61519-v3
Waiting for inference service chaiml-pony-v2-q32b-lr1-61519-v3 to be ready
Failed to get response for submission chaiml-sft-qwen-235b-ro_85751_v1: ('http://chaiml-sft-qwen-235b-ro-85751-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/completions', '')
Inference service chaiml-pony-v2-q32b-lr1-61519-v3 ready after 473.5468838214874s
Pipeline stage VLLMDeployer completed in 474.84s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.9960904121398926s
Received healthy response to inference request in 4.25727105140686s
Received healthy response to inference request in 3.809130907058716s
Received healthy response to inference request in 3.876007080078125s
Received healthy response to inference request in 3.7744832038879395s
Received healthy response to inference request in 3.8713748455047607s
Received healthy response to inference request in 3.7642228603363037s
Received healthy response to inference request in 4.220532178878784s
Received healthy response to inference request in 3.7783381938934326s
Received healthy response to inference request in 4.020586967468262s
Received healthy response to inference request in 3.9307920932769775s
Received healthy response to inference request in 3.490635395050049s
Received healthy response to inference request in 3.8214690685272217s
Received healthy response to inference request in 3.771698236465454s
Received healthy response to inference request in 4.007018566131592s
Received healthy response to inference request in 3.7940456867218018s
Received healthy response to inference request in 3.9145772457122803s
Received healthy response to inference request in 3.7978317737579346s
Received healthy response to inference request in 4.008546352386475s
Received healthy response to inference request in 3.775916337966919s
Received healthy response to inference request in 3.9500269889831543s
Received healthy response to inference request in 3.7879483699798584s
Received healthy response to inference request in 4.040325880050659s
Received healthy response to inference request in 3.8781301975250244s
Received healthy response to inference request in 3.763770580291748s
Received healthy response to inference request in 3.8691985607147217s
Received healthy response to inference request in 3.7886619567871094s
Received healthy response to inference request in 3.8190019130706787s
Received healthy response to inference request in 3.832637071609497s
Received healthy response to inference request in 3.7650957107543945s
30 requests
0 failed requests
5th percentile: 3.763974106311798
10th percentile: 3.7650084257125855
20th percentile: 3.7756297111511232
30th percentile: 3.788447880744934
40th percentile: 3.804611253738403
50th percentile: 3.8270530700683594
60th percentile: 3.8732277393341064
70th percentile: 3.9194416999816895
80th percentile: 3.9982760429382322
90th percentile: 4.022560858726502
95th percentile: 4.139439344406127
99th percentile: 4.246616778373718
mean time: 3.872512189547221
Pipeline stage StressChecker completed in 120.62s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.91s
Shutdown handler de-registered
chaiml-pony-v2-q32b-lr1_61519_v3 status is now deployed due to DeploymentManager action
chaiml-pony-v2-q32b-lr1_61519_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v2-q32b-lr1_61519_v3 status is now torndown due to DeploymentManager action