Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name google-gemma-4-31b-it-v14-uploader
Waiting for job on google-gemma-4-31b-it-v14-uploader to finish
google-gemma-4-31b-it-v14-uploader: Using quantization_mode: none
google-gemma-4-31b-it-v14-uploader: Downloading snapshot of google/gemma-4-31B-it...
google-gemma-4-31b-it-v14-uploader: Downloaded in 33.190s
2026-04-07T17:00:30.183522+00:00 monitor updated for google-gemma-4-31b-it_v14
google-gemma-4-31b-it-v14-uploader: Processed model google/gemma-4-31B-it in 55.785s
google-gemma-4-31b-it-v14-uploader: creating bucket guanaco-vllm-models
google-gemma-4-31b-it-v14-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v14-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
google-gemma-4-31b-it-v14-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
google-gemma-4-31b-it-v14-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
google-gemma-4-31b-it-v14-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v14-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v14-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v14-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v14-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v14-uploader: if re.search("-\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v14-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v14-uploader: if re.search("\.\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v14-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
google-gemma-4-31b-it-v14-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
google-gemma-4-31b-it-v14-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
google-gemma-4-31b-it-v14-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
google-gemma-4-31b-it-v14-uploader: Bucket 's3://guanaco-vllm-models/' created
google-gemma-4-31b-it-v14-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/.gitattributes
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/processor_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/processor_config.json
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/tokenizer_config.json
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/generation_config.json
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/model.safetensors.index.json
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/README.md
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/chat_template.jinja
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/config.json
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/tokenizer.json
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/model-00002-of-00002.safetensors
2026-04-07T17:01:30.339659+00:00 monitor updated for google-gemma-4-31b-it_v14
google-gemma-4-31b-it-v14-uploader: cp /dev/shm/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v14/default/model-00001-of-00002.safetensors
Job google-gemma-4-31b-it-v14-uploader completed after 168.94s with status: succeeded
Stopping job with name google-gemma-4-31b-it-v14-uploader
Pipeline stage VLLMUploader completed in 170.07s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.18s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.12s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service google-gemma-4-31b-it-v14
Waiting for inference service google-gemma-4-31b-it-v14 to be ready
2026-04-07T17:02:30.530464+00:00 monitor updated for google-gemma-4-31b-it_v14
2026-04-07T17:03:30.806708+00:00 monitor updated for google-gemma-4-31b-it_v14
2026-04-07T17:04:30.964813+00:00 monitor updated for google-gemma-4-31b-it_v14
2026-04-07T17:05:31.159711+00:00 monitor updated for google-gemma-4-31b-it_v14
2026-04-07T17:06:31.370348+00:00 monitor updated for google-gemma-4-31b-it_v14
2026-04-07T17:07:31.594198+00:00 monitor updated for google-gemma-4-31b-it_v14
2026-04-07T17:08:31.810408+00:00 monitor updated for google-gemma-4-31b-it_v14
Inference service google-gemma-4-31b-it-v14 ready after 385.2571635246277s
Pipeline stage VLLMDeployer completed in 386.28s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 6.442726373672485s
Received healthy response to inference request in 3.090862989425659s
Received healthy response to inference request in 5.922141790390015s
Received healthy response to inference request in 2.718519449234009s
Received healthy response to inference request in 2.6035983562469482s
Received healthy response to inference request in 2.977947473526001s
Received healthy response to inference request in 2.5809926986694336s
Received healthy response to inference request in 2.6863644123077393s
Received healthy response to inference request in 6.640293121337891s
Received healthy response to inference request in 2.52065372467041s
2026-04-07T17:09:31.984216+00:00 monitor updated for google-gemma-4-31b-it_v14
Received healthy response to inference request in 5.972095966339111s
Received healthy response to inference request in 2.8299856185913086s
Received healthy response to inference request in 5.882786512374878s
Received healthy response to inference request in 3.2776994705200195s
Received healthy response to inference request in 2.5953004360198975s
Received healthy response to inference request in 2.624157428741455s
Received healthy response to inference request in 2.7534618377685547s
Received healthy response to inference request in 2.6788365840911865s
Received healthy response to inference request in 2.6531622409820557s
Received healthy response to inference request in 2.7683961391448975s
Received healthy response to inference request in 2.5697314739227295s
Received healthy response to inference request in 2.6340749263763428s
Received healthy response to inference request in 2.6328699588775635s
Received healthy response to inference request in 2.6921584606170654s
Received healthy response to inference request in 2.6088714599609375s
Received healthy response to inference request in 2.663123607635498s
Received healthy response to inference request in 2.774801254272461s
2026-04-07T17:10:32.160091+00:00 monitor updated for google-gemma-4-31b-it_v14
Received healthy response to inference request in 2.7758100032806396s
Received healthy response to inference request in 2.7860825061798096s
Received healthy response to inference request in 2.742509126663208s
30 requests
0 failed requests
5th percentile: 2.5747990250587462
10th percentile: 2.593869662284851
20th percentile: 2.6211002349853514
30th percentile: 2.6474360466003417
40th percentile: 2.6833532810211183
50th percentile: 2.7305142879486084
60th percentile: 2.770958185195923
70th percentile: 2.7992534399032594
80th percentile: 3.128230285644532
90th percentile: 5.9271372079849245
95th percentile: 6.230942690372466
99th percentile: 6.582998764514923
mean time: 3.303333846728007
Pipeline stage StressChecker completed in 114.41s
Shutdown handler de-registered
google-gemma-4-31b-it_v14 status is now deployed due to DeploymentManager action
google-gemma-4-31b-it_v14 status is now inactive due to admin request