Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3a-q27b-lr-21575-v3-uploader
Waiting for job on chaiml-pony-v3a-q27b-lr-21575-v3-uploader to finish
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: Using quantization_mode: fp8
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: Checking if ChaiML/pony-v3a-q27b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: Downloading snapshot of ChaiML/pony-v3a-q27b-lr5e6ep2g8-FP8...
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: Downloaded in 29.987s
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: Processed model ChaiML/pony-v3a-q27b-lr5e6ep2g8 in 32.796s
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default/.gitattributes
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default/recipe.yaml
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default/chat_template.jinja
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default/config.json
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default/generation_config.json
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default/tokenizer_config.json
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default/tokenizer.json
2026-03-29T10:03:49.663650+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
chaiml-pony-v3a-q27b-lr-21575-v3-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3a-q27b-lr-21575-v3/default/model.safetensors
Job chaiml-pony-v3a-q27b-lr-21575-v3-uploader completed after 112.8s with status: succeeded
Stopping job with name chaiml-pony-v3a-q27b-lr-21575-v3-uploader
Pipeline stage VLLMUploader completed in 113.34s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.68s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3a-q27b-lr-21575-v3
Waiting for inference service chaiml-pony-v3a-q27b-lr-21575-v3 to be ready
2026-03-29T10:04:49.808395+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
2026-03-29T10:05:49.959272+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
2026-03-29T10:06:50.073018+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
Inference service chaiml-pony-v3a-q27b-lr-21575-v3 ready after 180.5306577682495s
Pipeline stage VLLMDeployer completed in 181.92s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-29T10:07:50.233622+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-29T10:08:50.413892+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 12.757486820220947s
2026-03-29T10:09:50.590197+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.632915735244751s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.3698596954345703s
Received healthy response to inference request in 2.3857429027557373s
Received healthy response to inference request in 1.8042843341827393s
Received healthy response to inference request in 4.138739347457886s
Received healthy response to inference request in 1.9885637760162354s
2026-03-29T10:10:50.701774+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
Received healthy response to inference request in 7.358596563339233s
Received healthy response to inference request in 2.1110265254974365s
Received healthy response to inference request in 1.995107650756836s
Received healthy response to inference request in 1.9903993606567383s
Received healthy response to inference request in 2.1126081943511963s
Received healthy response to inference request in 4.256978273391724s
Received healthy response to inference request in 1.965087890625s
Received healthy response to inference request in 2.094943046569824s
Received healthy response to inference request in 2.296771287918091s
Received healthy response to inference request in 2.290614604949951s
Received healthy response to inference request in 2.0429391860961914s
Received healthy response to inference request in 1.9814178943634033s
Received healthy response to inference request in 1.9974136352539062s
Received healthy response to inference request in 1.9422650337219238s
Received healthy response to inference request in 1.9795105457305908s
Received healthy response to inference request in 2.2528979778289795s
30 requests
7 failed requests
5th percentile: 1.9525353193283081
10th percentile: 1.9780682802200318
20th percentile: 1.9900322437286377
30th percentile: 2.0292815208435058
40th percentile: 2.1119755268096925
50th percentile: 2.293692946434021
60th percentile: 3.086941480636594
70th percentile: 5.4506199836730875
80th percentile: 20.136400365829466
90th percentile: 20.155486583709717
95th percentile: 20.35369896888733
99th percentile: 24.560666129589087
mean time: 7.27416554292043
%s, retrying in %s seconds...
Received healthy response to inference request in 1.7409732341766357s
Received healthy response to inference request in 1.9523165225982666s
Received healthy response to inference request in 1.8106951713562012s
Received healthy response to inference request in 1.850841999053955s
Received healthy response to inference request in 1.7920763492584229s
Received healthy response to inference request in 2.4719760417938232s
Received healthy response to inference request in 1.8855769634246826s
Received healthy response to inference request in 2.3817694187164307s
Received healthy response to inference request in 2.4185097217559814s
Received healthy response to inference request in 1.689504861831665s
2026-03-29T10:11:50.846453+00:00 monitor updated for chaiml-pony-v3a-q27b-lr_21575_v3
Received healthy response to inference request in 1.8275396823883057s
Received healthy response to inference request in 1.964707612991333s
Received healthy response to inference request in 1.891934871673584s
Failed to get response for submission chaiml-q235b-judge-dpo-_47447_v1: HTTPConnectionPool(host='chaiml-q235b-judge-dpo-47447-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received healthy response to inference request in 1.9175560474395752s
Received healthy response to inference request in 1.862117052078247s
Received healthy response to inference request in 2.167579174041748s
Received healthy response to inference request in 2.496691942214966s
Received healthy response to inference request in 1.908271312713623s
Received healthy response to inference request in 2.2123301029205322s
Received healthy response to inference request in 1.9112255573272705s
Received healthy response to inference request in 2.1649084091186523s
Received healthy response to inference request in 1.975308895111084s
Received healthy response to inference request in 2.1687681674957275s
Received healthy response to inference request in 2.0313875675201416s
Received healthy response to inference request in 1.9565560817718506s
Received healthy response to inference request in 1.9722647666931152s
Received healthy response to inference request in 2.0305471420288086s
Received healthy response to inference request in 2.077390432357788s
Received healthy response to inference request in 2.113111972808838s
Received healthy response to inference request in 2.4371206760406494s
30 requests
0 failed requests
5th percentile: 1.76396963596344
10th percentile: 1.8088332891464234
20th percentile: 1.8598620414733886
30th percentile: 1.9033703804016113
40th percentile: 1.9384123325347902
50th percentile: 1.9684861898422241
60th percentile: 2.030883312225342
70th percentile: 2.1286509037017822
80th percentile: 2.1774805545806886
90th percentile: 2.4203708171844482
95th percentile: 2.456291127204895
99th percentile: 2.4895243310928343
mean time: 2.0360519250233966
Pipeline stage StressChecker completed in 285.77s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.32s
Shutdown handler de-registered
chaiml-pony-v3a-q27b-lr_21575_v3 status is now deployed due to DeploymentManager action
chaiml-pony-v3a-q27b-lr_21575_v3 status is now inactive due to auto deactivation removed underperforming models