Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-mega-v1-plc-q27b-57593-v3-uploader
Waiting for job on chaiml-mega-v1-plc-q27b-57593-v3-uploader to finish
chaiml-mega-v1-plc-q27b-57593-v3-uploader: Using quantization_mode: fp8
chaiml-mega-v1-plc-q27b-57593-v3-uploader: Checking if ChaiML/mega-v1-plc-q27b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-mega-v1-plc-q27b-57593-v3-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-mega-v1-plc-q27b-57593-v3-uploader: Downloading snapshot of ChaiML/mega-v1-plc-q27b-lr5e6ep2g8-FP8...
2026-03-28T13:07:54.468638+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
chaiml-mega-v1-plc-q27b-57593-v3-uploader: Downloaded in 33.187s
chaiml-mega-v1-plc-q27b-57593-v3-uploader: Processed model ChaiML/mega-v1-plc-q27b-lr5e6ep2g8 in 35.690s
chaiml-mega-v1-plc-q27b-57593-v3-uploader: creating bucket guanaco-vllm-models
chaiml-mega-v1-plc-q27b-57593-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-mega-v1-plc-q27b-57593-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-mega-v1-plc-q27b-57593-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-mega-v1-plc-q27b-57593-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-mega-v1-plc-q27b-57593-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-mega-v1-plc-q27b-57593-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-mega-v1-plc-q27b-57593-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-mega-v1-plc-q27b-57593-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-mega-v1-plc-q27b-57593-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-mega-v1-plc-q27b-57593-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-mega-v1-plc-q27b-57593-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-mega-v1-plc-q27b-57593-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-mega-v1-plc-q27b-57593-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-mega-v1-plc-q27b-57593-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-mega-v1-plc-q27b-57593-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-mega-v1-plc-q27b-57593-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-mega-v1-plc-q27b-57593-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-mega-v1-plc-q27b-57593-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default
chaiml-mega-v1-plc-q27b-57593-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default/.gitattributes
chaiml-mega-v1-plc-q27b-57593-v3-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default/recipe.yaml
chaiml-mega-v1-plc-q27b-57593-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default/generation_config.json
chaiml-mega-v1-plc-q27b-57593-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default/tokenizer_config.json
chaiml-mega-v1-plc-q27b-57593-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default/config.json
chaiml-mega-v1-plc-q27b-57593-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default/chat_template.jinja
chaiml-mega-v1-plc-q27b-57593-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default/tokenizer.json
2026-03-28T13:08:54.559358+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
chaiml-mega-v1-plc-q27b-57593-v3-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-mega-v1-plc-q27b-57593-v3/default/model.safetensors
Job chaiml-mega-v1-plc-q27b-57593-v3-uploader completed after 144.53s with status: succeeded
Stopping job with name chaiml-mega-v1-plc-q27b-57593-v3-uploader
Pipeline stage VLLMUploader completed in 145.16s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.26s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-mega-v1-plc-q27b-57593-v3
Waiting for inference service chaiml-mega-v1-plc-q27b-57593-v3 to be ready
2026-03-28T13:09:54.657520+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
2026-03-28T13:10:54.746193+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
2026-03-28T13:11:54.871073+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
Inference service chaiml-mega-v1-plc-q27b-57593-v3 ready after 170.4376039505005s
Pipeline stage VLLMDeployer completed in 170.95s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T13:12:55.021574+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Failed to get response for submission chaiml-gspo-glm47-cas72_44260_v1: ('http://chaiml-gspo-glm47-cas72-44260-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
Received healthy response to inference request in 13.965526580810547s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T13:13:55.111917+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.396633863449097s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T13:14:55.252855+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
Received healthy response to inference request in 4.540976047515869s
Received healthy response to inference request in 1.89351487159729s
Received healthy response to inference request in 2.0814852714538574s
Received healthy response to inference request in 1.814117670059204s
Received healthy response to inference request in 1.8556327819824219s
Received healthy response to inference request in 1.9651448726654053s
Received healthy response to inference request in 1.7450895309448242s
Received healthy response to inference request in 2.066094398498535s
Received healthy response to inference request in 2.0362088680267334s
Received healthy response to inference request in 4.9254515171051025s
Received healthy response to inference request in 1.9484970569610596s
Received healthy response to inference request in 2.0962963104248047s
Received healthy response to inference request in 1.9132673740386963s
Received healthy response to inference request in 1.9613256454467773s
Received healthy response to inference request in 2.0042684078216553s
Received healthy response to inference request in 1.9892685413360596s
Received healthy response to inference request in 2.0649399757385254s
Received healthy response to inference request in 1.9938488006591797s
Received healthy response to inference request in 2.332796812057495s
Received healthy response to inference request in 2.4940121173858643s
Received healthy response to inference request in 1.955336570739746s
30 requests
7 failed requests
5th percentile: 1.8327994704246522
10th percentile: 1.8897266626358031
20th percentile: 1.9539686679840087
30th percentile: 1.9820314407348634
40th percentile: 2.0234326839447023
50th percentile: 2.0737898349761963
60th percentile: 2.3972829341888424
70th percentile: 4.656318688392638
80th percentile: 20.102055072784424
90th percentile: 20.12880997657776
95th percentile: 20.14910272359848
99th percentile: 20.154973134994506
mean time: 6.89845339457194
%s, retrying in %s seconds...
Received healthy response to inference request in 1.6912484169006348s
Received healthy response to inference request in 1.8219795227050781s
Received healthy response to inference request in 1.7316200733184814s
Received healthy response to inference request in 2.029097557067871s
Received healthy response to inference request in 1.6486613750457764s
Received healthy response to inference request in 1.8999948501586914s
2026-03-28T13:15:55.344099+00:00 monitor updated for chaiml-mega-v1-plc-q27b_57593_v3
Received healthy response to inference request in 2.334163188934326s
Received healthy response to inference request in 2.5788965225219727s
Received healthy response to inference request in 1.9479165077209473s
Received healthy response to inference request in 2.48581862449646s
Received healthy response to inference request in 2.510946750640869s
Received healthy response to inference request in 1.8062031269073486s
Received healthy response to inference request in 1.8342645168304443s
Received healthy response to inference request in 1.7732479572296143s
Received healthy response to inference request in 1.8724384307861328s
Received healthy response to inference request in 1.8227972984313965s
Received healthy response to inference request in 1.9177284240722656s
Received healthy response to inference request in 1.853156328201294s
Received healthy response to inference request in 1.9300315380096436s
Received healthy response to inference request in 1.8893139362335205s
Received healthy response to inference request in 2.0411415100097656s
Received healthy response to inference request in 2.405545711517334s
Received healthy response to inference request in 1.8536858558654785s
Received healthy response to inference request in 2.3569490909576416s
Received healthy response to inference request in 2.041252374649048s
Received healthy response to inference request in 1.8998112678527832s
Received healthy response to inference request in 1.9613244533538818s
Received healthy response to inference request in 2.506478786468506s
Received healthy response to inference request in 1.9854109287261963s
Received healthy response to inference request in 1.9835431575775146s
30 requests
0 failed requests
5th percentile: 1.7094156622886658
10th percentile: 1.769085168838501
20th percentile: 1.8226337432861328
30th percentile: 1.8535269975662232
40th percentile: 1.8956123352050782
50th percentile: 1.9238799810409546
60th percentile: 1.970211935043335
70th percentile: 2.0327107429504396
80th percentile: 2.3387203693389895
90th percentile: 2.4878846406936646
95th percentile: 2.5089361667633057
99th percentile: 2.5591910886764526
mean time: 2.0138222694396974
Pipeline stage StressChecker completed in 274.19s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.01s
Shutdown handler de-registered
chaiml-mega-v1-plc-q27b_57593_v3 status is now deployed due to DeploymentManager action
chaiml-mega-v1-plc-q27b_57593_v3 status is now inactive due to admin request
chaiml-mega-v1-plc-q27b_57593_v3 status is now torndown due to DeploymentManager action
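
Note on the StressChecker summaries above: the reported percentiles are consistent with linear interpolation between order statistics (the default behaviour of `numpy.percentile`). Whether the stress checker actually calls NumPy is an assumption, since the log does not say. The sketch below recomputes the second summary (30 requests, 0 failed) from the 30 latencies logged for that run. The helper name `percentile` is hypothetical and not taken from the pipeline code.

```python
import math

def percentile(values, q):
    """Percentile with linear interpolation between order statistics.

    q is in [0, 100]. This mirrors the default method of numpy.percentile;
    that the pipeline uses the same method is an assumption.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo = math.floor(pos)
    hi = math.ceil(pos)
    frac = pos - lo
    return ordered[lo] + (ordered[hi] - ordered[lo]) * frac

# Latencies (seconds) copied from the second StressChecker run in the log.
latencies = [
    1.6912484169006348, 1.8219795227050781, 1.7316200733184814,
    2.029097557067871, 1.6486613750457764, 1.8999948501586914,
    2.334163188934326, 2.5788965225219727, 1.9479165077209473,
    2.48581862449646, 2.510946750640869, 1.8062031269073486,
    1.8342645168304443, 1.7732479572296143, 1.8724384307861328,
    1.8227972984313965, 1.9177284240722656, 1.853156328201294,
    1.9300315380096436, 1.8893139362335205, 2.0411415100097656,
    2.405545711517334, 1.8536858558654785, 2.3569490909576416,
    2.041252374649048, 1.8998112678527832, 1.9613244533538818,
    2.506478786468506, 1.9854109287261963, 1.9835431575775146,
]

mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 95):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

The first summary (30 requests, 7 failed) appears to include the timed-out requests at roughly 20 s each, which would explain why its 80th percentile and above sit just over the 20 s read timeout.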