Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3-q27b-lr5-22882-v7-uploader
Waiting for job on chaiml-pony-v3-q27b-lr5-22882-v7-uploader to finish
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: Using quantization_mode: fp8
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: Checking if ChaiML/pony-v3-q27b-lr5e6ep1g8-FP8 already exists in ChaiML
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: Downloading snapshot of ChaiML/pony-v3-q27b-lr5e6ep1g8-FP8...
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: Downloaded in 33.390s
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: Processed model ChaiML/pony-v3-q27b-lr5e6ep1g8 in 36.269s
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default/.gitattributes
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default/tokenizer_config.json
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default/recipe.yaml
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default/config.json
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default/generation_config.json
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default/chat_template.jinja
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default/tokenizer.json
2026-03-29T10:12:14.026281+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
Failed to get response for submission chaiml-q235b-judge-dpo-_74524_v1: ('http://chaiml-q235b-judge-dpo-74524-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
chaiml-pony-v3-q27b-lr5-22882-v7-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-22882-v7/default/model.safetensors
Job chaiml-pony-v3-q27b-lr5-22882-v7-uploader completed after 112.98s with status: succeeded
Stopping job with name chaiml-pony-v3-q27b-lr5-22882-v7-uploader
Pipeline stage VLLMUploader completed in 113.78s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.13s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 4.83s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3-q27b-lr5-22882-v7
Waiting for inference service chaiml-pony-v3-q27b-lr5-22882-v7 to be ready
2026-03-29T10:13:14.123863+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
Failed to get response for submission chaiml-q235b-judge-dpo-_47447_v1: ('http://chaiml-q235b-judge-dpo-47447-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-29T10:14:14.257825+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
2026-03-29T10:15:14.838282+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
Inference service chaiml-pony-v3-q27b-lr5-22882-v7 ready after 151.4791750907898s
Pipeline stage VLLMDeployer completed in 151.96s
run pipeline stage %s
Running pipeline stage StressChecker
Failed to get response for submission chaiml-q235b-judge-dpo-_47447_v1: ('http://chaiml-q235b-judge-dpo-47447-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Failed to get response for submission chaiml-q235b-judge-dpo-_47447_v1: ('http://chaiml-q235b-judge-dpo-47447-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-29T10:16:15.620616+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-29T10:17:15.710686+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 11.555993556976318s
Received healthy response to inference request in 1.8719005584716797s
Failed to get response for submission chaiml-q235b-judge-dpo-_47447_v1: ('http://chaiml-q235b-judge-dpo-47447-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-29T10:18:15.817173+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.743683338165283s
Received healthy response to inference request in 4.385267019271851s
Received healthy response to inference request in 2.114776372909546s
Received healthy response to inference request in 2.1036107540130615s
Received healthy response to inference request in 7.080408334732056s
Received healthy response to inference request in 1.863074541091919s
Received healthy response to inference request in 2.1327295303344727s
Received healthy response to inference request in 2.0062248706817627s
Received healthy response to inference request in 1.8766376972198486s
Received healthy response to inference request in 2.2602615356445312s
Received healthy response to inference request in 1.9293174743652344s
Received healthy response to inference request in 1.8673524856567383s
Received healthy response to inference request in 2.0274863243103027s
Received healthy response to inference request in 1.8993408679962158s
Received healthy response to inference request in 2.2065281867980957s
Received healthy response to inference request in 2.5558457374572754s
Received healthy response to inference request in 1.9156584739685059s
Received healthy response to inference request in 2.4040493965148926s
Received healthy response to inference request in 1.9663755893707275s
Received healthy response to inference request in 2.3495593070983887s
Received healthy response to inference request in 2.017439603805542s
30 requests
7 failed requests
5th percentile: 1.869399118423462
10th percentile: 1.8761639833450316
20th percentile: 1.9265856742858887
30th percentile: 2.014075183868408
40th percentile: 2.110310125350952
50th percentile: 2.2333948612213135
60th percentile: 2.4647679328918457
70th percentile: 5.1938094139099045
80th percentile: 20.1144811630249
90th percentile: 20.155668640136717
95th percentile: 20.16143946647644
99th percentile: 20.454286046028137
mean time: 6.885150074958801
%s, retrying in %s seconds...
2026-03-29T10:19:15.939073+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
Received healthy response to inference request in 1.7583038806915283s
Received healthy response to inference request in 1.7636218070983887s
Received healthy response to inference request in 1.8191182613372803s
Received healthy response to inference request in 1.7885639667510986s
Received healthy response to inference request in 2.0481154918670654s
Received healthy response to inference request in 1.8083970546722412s
Received healthy response to inference request in 1.8095982074737549s
Received healthy response to inference request in 1.870323657989502s
Received healthy response to inference request in 1.9275672435760498s
Received healthy response to inference request in 1.8306539058685303s
Received healthy response to inference request in 2.060086488723755s
Received healthy response to inference request in 2.1132187843322754s
Received healthy response to inference request in 2.4585113525390625s
Received healthy response to inference request in 2.5111706256866455s
Received healthy response to inference request in 1.8433218002319336s
Received healthy response to inference request in 2.1063380241394043s
Received healthy response to inference request in 1.9496655464172363s
Received healthy response to inference request in 1.8785419464111328s
Received healthy response to inference request in 1.8297877311706543s
Received healthy response to inference request in 1.9633095264434814s
Received healthy response to inference request in 1.9427123069763184s
Received healthy response to inference request in 2.26631498336792s
Received healthy response to inference request in 1.9554693698883057s
Received healthy response to inference request in 2.6671090126037598s
Received healthy response to inference request in 2.3988492488861084s
Received healthy response to inference request in 2.2524728775024414s
Received healthy response to inference request in 1.9228179454803467s
Received healthy response to inference request in 1.9281067848205566s
Received healthy response to inference request in 1.9814190864562988s
2026-03-29T10:20:16.045147+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_22882_v7
Received healthy response to inference request in 2.0206568241119385s
30 requests
0 failed requests
5th percentile: 1.7748457789421082
10th percentile: 1.806413745880127
20th percentile: 1.8276538372039794
30th percentile: 1.8622231006622314
40th percentile: 1.9256675243377686
50th percentile: 1.9461889266967773
60th percentile: 1.9705533504486084
70th percentile: 2.0517067909240723
80th percentile: 2.1410696029663088
90th percentile: 2.4048154592514037
95th percentile: 2.487473952770233
99th percentile: 2.621886880397797
mean time: 2.0158047914505004
Pipeline stage StressChecker completed in 273.55s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.69s
Shutdown handler de-registered
chaiml-pony-v3-q27b-lr5_22882_v7 status is now deployed due to DeploymentManager action
chaiml-pony-v3-q27b-lr5_22882_v7 status is now inactive due to auto deactivation removed underperforming models