Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v1-q235b-lr-99625-v4-uploader
Waiting for job on chaiml-pony-v1-q235b-lr-99625-v4-uploader to finish
chaiml-pony-v1-q235b-lr-99625-v4-uploader: Using quantization_mode: w4a16
chaiml-pony-v1-q235b-lr-99625-v4-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-lr-99625-v4-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v1-q235b-lr-99625-v4-uploader: Downloading snapshot of ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16...
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
chaiml-pony-v1-q235b-lr-99625-v4-uploader: Downloaded in 56.759s
chaiml-pony-v1-q235b-lr-99625-v4-uploader: Processed model ChaiML/pony-v1-q235b-lr1e4ep1r64g4 in 57.334s
chaiml-pony-v1-q235b-lr-99625-v4-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v1-q235b-lr-99625-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v4-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v1-q235b-lr-99625-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v1-q235b-lr-99625-v4-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v1-q235b-lr-99625-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v4-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v4-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v4-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v4-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v4-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v1-q235b-lr-99625-v4-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v4-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v1-q235b-lr-99625-v4-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v1-q235b-lr-99625-v4-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v1-q235b-lr-99625-v4-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/.gitattributes
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/generation_config.json
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/config.json
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/chat_template.jinja
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/added_tokens.json
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/tokenizer_config.json
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/quantization_config.json
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/special_tokens_map.json
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/vocab.json
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/merges.txt
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/tokenizer.json
chaiml-pony-v1-q235b-lr-99625-v4-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v4/default/model.safetensors.index.json
Job chaiml-pony-v1-q235b-lr-99625-v4-uploader completed after 183.88s with status: succeeded
Stopping job with name chaiml-pony-v1-q235b-lr-99625-v4-uploader
Pipeline stage VLLMUploader completed in 185.38s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
HTTP Request: %s %s "%s %d %s"
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v1-q235b-lr-99625-v4
Waiting for inference service chaiml-pony-v1-q235b-lr-99625-v4 to be ready
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v3: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service chaiml-pony-v1-q235b-lr-99625-v4 ready after 500.5367696285248s
Pipeline stage VLLMDeployer completed in 513.15s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.968294143676758s
Received healthy response to inference request in 5.261852264404297s
Received healthy response to inference request in 7.252310514450073s
Received healthy response to inference request in 4.027486324310303s
Received healthy response to inference request in 5.337197542190552s
Received healthy response to inference request in 4.690927267074585s
Received healthy response to inference request in 2.528343677520752s
Received healthy response to inference request in 3.102827548980713s
Received healthy response to inference request in 3.6397979259490967s
Received healthy response to inference request in 2.2685892581939697s
Received healthy response to inference request in 2.2591044902801514s
Received healthy response to inference request in 1.908543586730957s
Received healthy response to inference request in 1.882277250289917s
Received healthy response to inference request in 1.8893077373504639s
Received healthy response to inference request in 2.328324556350708s
Received healthy response to inference request in 2.7035939693450928s
Received healthy response to inference request in 2.9825170040130615s
Received healthy response to inference request in 1.974245309829712s
Received healthy response to inference request in 3.393720865249634s
Received healthy response to inference request in 4.319453001022339s
Received healthy response to inference request in 5.183995485305786s
Received healthy response to inference request in 5.520408868789673s
Received healthy response to inference request in 4.367236852645874s
Received healthy response to inference request in 2.2981436252593994s
Received healthy response to inference request in 3.0484793186187744s
Received healthy response to inference request in 3.3875033855438232s
Received healthy response to inference request in 2.6086316108703613s
Received healthy response to inference request in 4.031998634338379s
Received healthy response to inference request in 3.3280081748962402s
Received healthy response to inference request in 2.8590805530548096s
30 requests
0 failed requests
5th percentile: 1.8979638695716858
10th percentile: 1.9676751375198365
20th percentile: 2.2922327518463135
30th percentile: 2.5845452308654786
40th percentile: 2.9246087074279785
50th percentile: 3.0756534337997437
60th percentile: 3.3899903774261473
70th percentile: 4.028840017318726
80th percentile: 4.431974935531617
90th percentile: 5.269386792182923
95th percentile: 5.437963771820068
99th percentile: 6.750059037208558
mean time: 3.445073358217875
Pipeline stage StressChecker completed in 233.53s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 28.67s
Shutdown handler de-registered
chaiml-pony-v1-q235b-lr_99625_v4 status is now deployed due to DeploymentManager action
chaiml-pony-v1-q235b-lr_99625_v4 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v1-q235b-lr_99625_v4 status is now torndown due to DeploymentManager action