Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name google-gemma-4-31b-it-v25-uploader
Waiting for job on google-gemma-4-31b-it-v25-uploader to finish
google-gemma-4-31b-it-v25-uploader: Using quantization_mode: none
google-gemma-4-31b-it-v25-uploader: Downloading snapshot of google/gemma-4-31B-it...
google-gemma-4-31b-it-v25-uploader: Downloaded in 34.960s
2026-04-08T01:58:37.771552+00:00 monitor updated for google-gemma-4-31b-it_v25
google-gemma-4-31b-it-v25-uploader: Processed model google/gemma-4-31B-it in 57.242s
google-gemma-4-31b-it-v25-uploader: creating bucket guanaco-vllm-models
google-gemma-4-31b-it-v25-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v25-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
google-gemma-4-31b-it-v25-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
google-gemma-4-31b-it-v25-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
google-gemma-4-31b-it-v25-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v25-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v25-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v25-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v25-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v25-uploader: if re.search("-\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v25-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v25-uploader: if re.search("\.\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v25-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
google-gemma-4-31b-it-v25-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
google-gemma-4-31b-it-v25-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
google-gemma-4-31b-it-v25-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
google-gemma-4-31b-it-v25-uploader: Bucket 's3://guanaco-vllm-models/' created
google-gemma-4-31b-it-v25-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/model.safetensors.index.json
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/.gitattributes
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/README.md
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/tokenizer_config.json
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/generation_config.json
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/config.json
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/chat_template.jinja
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/processor_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/processor_config.json
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/tokenizer.json
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/model-00002-of-00002.safetensors
2026-04-08T01:59:37.948232+00:00 monitor updated for google-gemma-4-31b-it_v25
google-gemma-4-31b-it-v25-uploader: cp /dev/shm/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v25/default/model-00001-of-00002.safetensors
Job google-gemma-4-31b-it-v25-uploader completed after 148.66s with status: succeeded
Stopping job with name google-gemma-4-31b-it-v25-uploader
Pipeline stage VLLMUploader completed in 149.76s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.19s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.81s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service google-gemma-4-31b-it-v25
Waiting for inference service google-gemma-4-31b-it-v25 to be ready
2026-04-08T02:00:38.165215+00:00 monitor updated for google-gemma-4-31b-it_v25
2026-04-08T02:01:38.359060+00:00 monitor updated for google-gemma-4-31b-it_v25
2026-04-08T02:02:38.602240+00:00 monitor updated for google-gemma-4-31b-it_v25
2026-04-08T02:03:38.801384+00:00 monitor updated for google-gemma-4-31b-it_v25
Inference service google-gemma-4-31b-it-v25 ready after 232.70610904693604s
Pipeline stage VLLMDeployer completed in 234.48s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 12.451606512069702s
Received healthy response to inference request in 11.852229833602905s
2026-04-08T02:04:39.017372+00:00 monitor updated for google-gemma-4-31b-it_v25
Received healthy response to inference request in 11.828283548355103s
Received healthy response to inference request in 12.255571603775024s
Received healthy response to inference request in 4.168937921524048s
Received healthy response to inference request in 4.740953683853149s
Received healthy response to inference request in 11.935661792755127s
Received healthy response to inference request in 4.602880239486694s
Received healthy response to inference request in 4.449542999267578s
Received healthy response to inference request in 4.175592422485352s
Received healthy response to inference request in 4.575549364089966s
2026-04-08T02:05:39.248465+00:00 monitor updated for google-gemma-4-31b-it_v25
Received healthy response to inference request in 4.327905893325806s
Received healthy response to inference request in 4.242472887039185s
Received healthy response to inference request in 4.263993263244629s
Received healthy response to inference request in 4.149102449417114s
Received healthy response to inference request in 4.623436450958252s
Received healthy response to inference request in 4.303601980209351s
Received healthy response to inference request in 4.168993711471558s
Received healthy response to inference request in 4.370487928390503s
Received healthy response to inference request in 4.368121385574341s
Received healthy response to inference request in 4.171973705291748s
Received healthy response to inference request in 4.263621091842651s
Received healthy response to inference request in 4.181446313858032s
Received healthy response to inference request in 4.407179594039917s
Received healthy response to inference request in 4.467685222625732s
2026-04-08T02:06:39.499724+00:00 monitor updated for google-gemma-4-31b-it_v25
Received healthy response to inference request in 4.390730857849121s
Received healthy response to inference request in 5.127627849578857s
Received healthy response to inference request in 4.1790611743927s
Received healthy response to inference request in 4.08502721786499s
Received healthy response to inference request in 4.302287817001343s
30 requests
0 failed requests
5th percentile: 4.158028411865234
10th percentile: 4.168988132476807
20th percentile: 4.17836742401123
30th percentile: 4.257276630401611
40th percentile: 4.303076314926147
50th percentile: 4.369304656982422
60th percentile: 4.4241249561309814
70th percentile: 4.583748626708984
80th percentile: 4.818288516998292
90th percentile: 11.860573029518127
95th percentile: 12.11161218881607
99th percentile: 12.394756388664245
mean time: 5.647718890508016
Pipeline stage StressChecker completed in 177.67s
Shutdown handler de-registered
google-gemma-4-31b-it_v25 status is now deployed due to DeploymentManager action