Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name google-gemma-4-31b-it-v41-uploader
Waiting for job on google-gemma-4-31b-it-v41-uploader to finish
2026-04-11T00:37:20.977047+00:00 monitor updated for google-gemma-4-31b-it_v41
2026-04-11T00:38:21.056987+00:00 monitor updated for google-gemma-4-31b-it_v41
google-gemma-4-31b-it-v41-uploader: Using quantization_mode: none
google-gemma-4-31b-it-v41-uploader: Downloading snapshot of google/gemma-4-31B-it...
google-gemma-4-31b-it-v41-uploader: Downloaded in 34.374s
google-gemma-4-31b-it-v41-uploader: Processed model google/gemma-4-31B-it in 34.504s
google-gemma-4-31b-it-v41-uploader: creating bucket guanaco-vllm-models
google-gemma-4-31b-it-v41-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v41-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
google-gemma-4-31b-it-v41-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
google-gemma-4-31b-it-v41-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
google-gemma-4-31b-it-v41-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v41-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v41-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v41-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v41-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v41-uploader: if re.search("-\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v41-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v41-uploader: if re.search("\.\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v41-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
google-gemma-4-31b-it-v41-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
google-gemma-4-31b-it-v41-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
google-gemma-4-31b-it-v41-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
google-gemma-4-31b-it-v41-uploader: Bucket 's3://guanaco-vllm-models/' created
google-gemma-4-31b-it-v41-uploader: uploading /tmp/model_output to s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/.gitattributes s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/.gitattributes
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/tokenizer_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/tokenizer_config.json
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/generation_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/generation_config.json
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/model.safetensors.index.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/model.safetensors.index.json
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/chat_template.jinja s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/chat_template.jinja
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/README.md s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/README.md
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/processor_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/processor_config.json
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/config.json
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/tokenizer.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/tokenizer.json
google-gemma-4-31b-it-v41-uploader: cp /tmp/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v41/default/model-00002-of-00002.safetensors
2026-04-11T00:39:21.140818+00:00 monitor updated for google-gemma-4-31b-it_v41
Job google-gemma-4-31b-it-v41-uploader completed after 234.77s with status: succeeded
Stopping job with name google-gemma-4-31b-it-v41-uploader
Pipeline stage VLLMUploader completed in 235.21s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.08s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.08s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service google-gemma-4-31b-it-v41
Waiting for inference service google-gemma-4-31b-it-v41 to be ready
2026-04-11T00:40:21.234769+00:00 monitor updated for google-gemma-4-31b-it_v41
2026-04-11T00:41:30.145681+00:00 monitor updated for google-gemma-4-31b-it_v41
2026-04-11T00:42:30.280633+00:00 monitor updated for google-gemma-4-31b-it_v41
2026-04-11T00:43:30.371070+00:00 monitor updated for google-gemma-4-31b-it_v41
Inference service google-gemma-4-31b-it-v41 ready after 220.28702330589294s
Pipeline stage VLLMDeployer completed in 220.72s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 10.841064929962158s
Received healthy response to inference request in 8.941861152648926s
Received healthy response to inference request in 4.088128328323364s
2026-04-11T00:44:30.454567+00:00 monitor updated for google-gemma-4-31b-it_v41
Received healthy response to inference request in 10.555948734283447s
Received healthy response to inference request in 2.289374828338623s
Received healthy response to inference request in 2.297060489654541s
Received healthy response to inference request in 2.2307400703430176s
Received healthy response to inference request in 2.5099196434020996s
Received healthy response to inference request in 2.2853498458862305s
Received healthy response to inference request in 2.259117364883423s
Received healthy response to inference request in 2.2399134635925293s
Received healthy response to inference request in 2.5743744373321533s
Received healthy response to inference request in 2.2934763431549072s
Received healthy response to inference request in 2.2710957527160645s
Received healthy response to inference request in 2.3166115283966064s
Received healthy response to inference request in 2.288918972015381s
Received healthy response to inference request in 2.270449638366699s
Received healthy response to inference request in 2.346980571746826s
Received healthy response to inference request in 2.370262622833252s
Received healthy response to inference request in 2.3406102657318115s
Received healthy response to inference request in 2.388899087905884s
Received healthy response to inference request in 2.6533937454223633s
Received healthy response to inference request in 10.490662574768066s
Received healthy response to inference request in 2.5074150562286377s
2026-04-11T00:45:30.911823+00:00 monitor updated for google-gemma-4-31b-it_v41
Received healthy response to inference request in 2.310305595397949s
Received healthy response to inference request in 2.283053398132324s
Received healthy response to inference request in 9.115112543106079s
Received healthy response to inference request in 3.845571279525757s
Received healthy response to inference request in 2.314063549041748s
Received healthy response to inference request in 2.527143955230713s
30 requests
0 failed requests
5th percentile: 2.2485552191734315
10th percentile: 2.2693164110183717
20th percentile: 2.284890556335449
30th percentile: 2.292245888710022
40th percentile: 2.3125603675842283
50th percentile: 2.343795418739319
60th percentile: 2.436305475234985
70th percentile: 2.541313099861145
80th percentile: 3.894082689285279
90th percentile: 9.25266754627228
95th percentile: 10.526569962501526
99th percentile: 10.758381233215331
mean time: 3.7348959922790526
Pipeline stage StressChecker completed in 115.09s
Shutdown handler de-registered
google-gemma-4-31b-it_v41 status is now deployed due to DeploymentManager action
google-gemma-4-31b-it_v41 status is now inactive due to auto deactivation removed underperforming models
google-gemma-4-31b-it_v41 status is now torndown due to DeploymentManager action