Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3b-mv1-top2-9386-v7-uploader
Waiting for job on chaiml-pony-d3b-mv1-top2-9386-v7-uploader to finish
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: Using quantization_mode: fp8
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: Checking if ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: Downloading snapshot of ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8-FP8...
2026-03-28T14:49:50.875682+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: Downloaded in 36.182s
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: Processed model ChaiML/pony-d3b-mv1-top2-q35b-lr5e6ep2g8 in 38.680s
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default/recipe.yaml
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default/chat_template.jinja
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default/generation_config.json
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default/.gitattributes
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default/tokenizer_config.json
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default/config.json
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default/tokenizer.json
2026-03-28T14:50:50.979447+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
chaiml-pony-d3b-mv1-top2-9386-v7-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3b-mv1-top2-9386-v7/default/model.safetensors
Job chaiml-pony-d3b-mv1-top2-9386-v7-uploader completed after 143.22s with status: succeeded
Stopping job with name chaiml-pony-d3b-mv1-top2-9386-v7-uploader
Pipeline stage VLLMUploader completed in 143.79s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.10s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.30s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3b-mv1-top2-9386-v7
Waiting for inference service chaiml-pony-d3b-mv1-top2-9386-v7 to be ready
2026-03-28T14:51:51.099081+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
2026-03-28T14:52:51.255432+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
2026-03-28T14:53:51.353188+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
2026-03-28T14:54:51.446514+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
2026-03-28T14:55:51.544890+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
2026-03-28T14:56:51.637082+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
Inference service chaiml-pony-d3b-mv1-top2-9386-v7 ready after 350.9197061061859s
Pipeline stage VLLMDeployer completed in 351.52s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:57:51.728543+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
Failed to get response for submission chaiml-gspo-glm47-combi_10268_v1: ('http://chaiml-gspo-glm47-combi-10268-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 12.815340518951416s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.814079284667969s
2026-03-28T14:58:51.828207+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
Received healthy response to inference request in 15.0700523853302s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.123148679733276s
Received healthy response to inference request in 9.41396450996399s
2026-03-28T14:59:51.930932+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.3481690883636475s
Received healthy response to inference request in 1.3600091934204102s
Received healthy response to inference request in 3.41060471534729s
Received healthy response to inference request in 11.73862886428833s
Received healthy response to inference request in 1.3191766738891602s
Received healthy response to inference request in 3.738325357437134s
Received healthy response to inference request in 1.449193000793457s
Received healthy response to inference request in 1.3801672458648682s
Received healthy response to inference request in 1.9241440296173096s
Received healthy response to inference request in 1.2670769691467285s
Received healthy response to inference request in 1.3615810871124268s
Received healthy response to inference request in 1.5321063995361328s
Received healthy response to inference request in 1.4033575057983398s
Failed to get response for submission chaiml-glm-47-bobo-v1-s_16089_v2: ('http://chaiml-glm-47-bobo-v1-s-16089-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
Received healthy response to inference request in 1.2775585651397705s
Received healthy response to inference request in 1.3306167125701904s
Received healthy response to inference request in 1.31056547164917s
Received healthy response to inference request in 1.3103153705596924s
Received healthy response to inference request in 1.3675785064697266s
Received healthy response to inference request in 1.3005661964416504s
30 requests
6 failed requests
5th percentile: 1.2879119992256165
10th percentile: 1.3093404531478883
20th percentile: 1.3283287048339845
30th percentile: 1.3611095190048217
40th percentile: 1.3940814018249512
50th percentile: 1.7281252145767212
60th percentile: 3.8922546863555905
70th percentile: 10.111363816261285
80th percentile: 16.07970809936525
90th percentile: 20.11950123310089
95th percentile: 20.132244348526
99th percentile: 20.160981364250183
mean time: 6.97185837427775
%s, retrying in %s seconds...
Received healthy response to inference request in 1.7336750030517578s
Received healthy response to inference request in 1.3889944553375244s
Received healthy response to inference request in 1.3671321868896484s
Received healthy response to inference request in 1.3132307529449463s
Received healthy response to inference request in 1.6806259155273438s
Received healthy response to inference request in 1.2334785461425781s
2026-03-28T15:00:52.025996+00:00 monitor updated for chaiml-pony-d3b-mv1-top2_9386_v7
Received healthy response to inference request in 1.5436782836914062s
Received healthy response to inference request in 1.3112514019012451s
Received healthy response to inference request in 1.4610605239868164s
Received healthy response to inference request in 1.6421704292297363s
Received healthy response to inference request in 1.5172677040100098s
Received healthy response to inference request in 1.2937352657318115s
Received healthy response to inference request in 1.2634007930755615s
Received healthy response to inference request in 1.1964187622070312s
Received healthy response to inference request in 1.8142364025115967s
Received healthy response to inference request in 1.2639439105987549s
Received healthy response to inference request in 1.2806134223937988s
Received healthy response to inference request in 1.4704439640045166s
Received healthy response to inference request in 1.4777092933654785s
Received healthy response to inference request in 1.3335232734680176s
Received healthy response to inference request in 1.8193764686584473s
Received healthy response to inference request in 1.733814001083374s
Received healthy response to inference request in 1.260441780090332s
Received healthy response to inference request in 1.278348445892334s
Received healthy response to inference request in 1.285390853881836s
Received healthy response to inference request in 1.31715726852417s
Received healthy response to inference request in 1.4908292293548584s
Received healthy response to inference request in 1.5404155254364014s
Received healthy response to inference request in 1.7338752746582031s
Received healthy response to inference request in 1.3550629615783691s
30 requests
0 failed requests
5th percentile: 1.2456120014190675
10th percentile: 1.2631048917770387
20th percentile: 1.2801604270935059
30th percentile: 1.305996561050415
40th percentile: 1.3269768714904786
50th percentile: 1.3780633211135864
60th percentile: 1.4733500957489014
70th percentile: 1.5242120504379273
80th percentile: 1.649861526489258
90th percentile: 1.733820128440857
95th percentile: 1.7780738949775694
99th percentile: 1.8178858494758605
mean time: 1.4467100699742634
Pipeline stage StressChecker completed in 259.15s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.19s
Shutdown handler de-registered
chaiml-pony-d3b-mv1-top2_9386_v7 status is now deployed due to DeploymentManager action
chaiml-pony-d3b-mv1-top2_9386_v7 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3b-mv1-top2_9386_v7 status is now torndown due to DeploymentManager action