Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3-q27b-lr5-49140-v2-uploader
Waiting for job on chaiml-pony-v3-q27b-lr5-49140-v2-uploader to finish
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: Using quantization_mode: fp8
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: Checking if ChaiML/pony-v3-q27b-lr5e6ep2g8-30k-FP8 already exists in ChaiML
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: Downloading snapshot of ChaiML/pony-v3-q27b-lr5e6ep2g8-30k-FP8...
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: Downloaded in 34.377s
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: Processed model ChaiML/pony-v3-q27b-lr5e6ep2g8-30k in 37.160s
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default/config.json
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default/.gitattributes
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default/tokenizer_config.json
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default/recipe.yaml
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default/chat_template.jinja
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default/generation_config.json
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default/tokenizer.json
2026-03-31T05:00:51.156221+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
chaiml-pony-v3-q27b-lr5-49140-v2-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-49140-v2/default/model.safetensors
Job chaiml-pony-v3-q27b-lr5-49140-v2-uploader completed after 113.1s with status: succeeded
Stopping job with name chaiml-pony-v3-q27b-lr5-49140-v2-uploader
Pipeline stage VLLMUploader completed in 113.66s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.23s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.76s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3-q27b-lr5-49140-v2
Waiting for inference service chaiml-pony-v3-q27b-lr5-49140-v2 to be ready
2026-03-31T05:01:51.238553+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
2026-03-31T05:02:51.321321+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
2026-03-31T05:03:51.404980+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
Inference service chaiml-pony-v3-q27b-lr5-49140-v2 ready after 160.22041940689087s
Pipeline stage VLLMDeployer completed in 160.69s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-31T05:04:51.492093+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 11.262761116027832s
2026-03-31T05:05:51.588490+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
Received healthy response to inference request in 19.642953395843506s
Received healthy response to inference request in 1.9482626914978027s
Received healthy response to inference request in 1.8847804069519043s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-31T05:06:51.677391+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.8775150775909424s
Received healthy response to inference request in 4.2187840938568115s
Received healthy response to inference request in 6.2230446338653564s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.170893430709839s
Received healthy response to inference request in 1.8986330032348633s
Received healthy response to inference request in 1.9834370613098145s
Received healthy response to inference request in 2.0479841232299805s
Received healthy response to inference request in 2.0387983322143555s
Received healthy response to inference request in 2.5150113105773926s
2026-03-31T05:07:51.762425+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
Received healthy response to inference request in 2.005155563354492s
Received healthy response to inference request in 1.9349803924560547s
Received healthy response to inference request in 2.7393958568573s
Received healthy response to inference request in 1.898179292678833s
Received healthy response to inference request in 1.9617884159088135s
Received healthy response to inference request in 2.3100297451019287s
Received healthy response to inference request in 2.117948055267334s
Received healthy response to inference request in 1.940711259841919s
Received healthy response to inference request in 2.0603256225585938s
Received healthy response to inference request in 2.0072875022888184s
30 requests
7 failed requests
5th percentile: 1.8908099055290222
10th percentile: 1.8985876321792603
20th percentile: 1.946752405166626
30th percentile: 1.9986400127410888
40th percentile: 2.0443098068237306
50th percentile: 2.1444207429885864
60th percentile: 2.604765129089355
70th percentile: 7.7349595785140846
80th percentile: 20.112203502655028
90th percentile: 20.21114499568939
95th percentile: 20.2272540807724
99th percentile: 21.223967027664187
mean time: 7.448259822527567
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9346623420715332s
Received healthy response to inference request in 1.8373157978057861s
Received healthy response to inference request in 2.624447822570801s
Received healthy response to inference request in 1.8316094875335693s
Received healthy response to inference request in 1.7948336601257324s
Received healthy response to inference request in 3.402045488357544s
Received healthy response to inference request in 1.8173832893371582s
Received healthy response to inference request in 1.755448341369629s
Received healthy response to inference request in 2.351405382156372s
Received healthy response to inference request in 2.0148234367370605s
Received healthy response to inference request in 2.5805845260620117s
Received healthy response to inference request in 2.3156356811523438s
Received healthy response to inference request in 2.5244877338409424s
Received healthy response to inference request in 2.4547314643859863s
Received healthy response to inference request in 1.9351146221160889s
Received healthy response to inference request in 1.956641435623169s
2026-03-31T05:08:51.855081+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_49140_v2
Received healthy response to inference request in 1.9009554386138916s
Received healthy response to inference request in 1.9702415466308594s
Received healthy response to inference request in 1.9378981590270996s
Received healthy response to inference request in 2.0906827449798584s
Received healthy response to inference request in 1.9338641166687012s
Received healthy response to inference request in 1.9445016384124756s
Received healthy response to inference request in 2.4312381744384766s
Received healthy response to inference request in 2.0351643562316895s
Received healthy response to inference request in 1.9706299304962158s
Received healthy response to inference request in 1.9384472370147705s
Received healthy response to inference request in 2.1666407585144043s
Received healthy response to inference request in 2.1263468265533447s
Received healthy response to inference request in 2.064678192138672s
Received healthy response to inference request in 2.0903372764587402s
30 requests
0 failed requests
5th percentile: 1.804980993270874
10th percentile: 1.8301868677139281
20th percentile: 1.9272823810577393
30th percentile: 1.9370630979537964
40th percentile: 1.9517855167388916
50th percentile: 1.9927266836166382
60th percentile: 2.0749418258666994
70th percentile: 2.1384350061416626
80th percentile: 2.3673719406127933
90th percentile: 2.5300974130630496
95th percentile: 2.6047093391418454
99th percentile: 3.1765421652793893
mean time: 2.124426563580831
Pipeline stage StressChecker completed in 292.98s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.87s
Shutdown handler de-registered
chaiml-pony-v3-q27b-lr5_49140_v2 status is now deployed due to DeploymentManager action
chaiml-pony-v3-q27b-lr5_49140_v2 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v3-q27b-lr5_49140_v2 status is now torndown due to DeploymentManager action