Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name google-gemma-4-31b-it-v30-uploader
Waiting for job on google-gemma-4-31b-it-v30-uploader to finish
google-gemma-4-31b-it-v30-uploader: Using quantization_mode: none
google-gemma-4-31b-it-v30-uploader: Downloading snapshot of google/gemma-4-31B-it...
google-gemma-4-31b-it-v30-uploader: Downloaded in 30.987s
2026-04-08T17:13:53.709383+00:00 monitor updated for google-gemma-4-31b-it_v30
google-gemma-4-31b-it-v30-uploader: Processed model google/gemma-4-31B-it in 53.395s
google-gemma-4-31b-it-v30-uploader: creating bucket guanaco-vllm-models
google-gemma-4-31b-it-v30-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v30-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
google-gemma-4-31b-it-v30-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
google-gemma-4-31b-it-v30-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
google-gemma-4-31b-it-v30-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v30-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v30-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v30-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v30-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v30-uploader: if re.search("-\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v30-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v30-uploader: if re.search("\.\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v30-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
google-gemma-4-31b-it-v30-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
google-gemma-4-31b-it-v30-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
google-gemma-4-31b-it-v30-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
google-gemma-4-31b-it-v30-uploader: Bucket 's3://guanaco-vllm-models/' created
google-gemma-4-31b-it-v30-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/generation_config.json
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/chat_template.jinja
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/README.md
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/config.json
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/processor_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/processor_config.json
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/tokenizer_config.json
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/model.safetensors.index.json
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/.gitattributes
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/tokenizer.json
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/model-00002-of-00002.safetensors
2026-04-08T17:14:53.897857+00:00 monitor updated for google-gemma-4-31b-it_v30
google-gemma-4-31b-it-v30-uploader: cp /dev/shm/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v30/default/model-00001-of-00002.safetensors
Job google-gemma-4-31b-it-v30-uploader completed after 149.3s with status: succeeded
Stopping job with name google-gemma-4-31b-it-v30-uploader
Pipeline stage VLLMUploader completed in 150.41s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.19s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.44s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service google-gemma-4-31b-it-v30
Waiting for inference service google-gemma-4-31b-it-v30 to be ready
2026-04-08T17:15:54.127218+00:00 monitor updated for google-gemma-4-31b-it_v30
2026-04-08T17:16:54.354045+00:00 monitor updated for google-gemma-4-31b-it_v30
2026-04-08T17:17:54.676059+00:00 monitor updated for google-gemma-4-31b-it_v30
2026-04-08T17:18:54.984800+00:00 monitor updated for google-gemma-4-31b-it_v30
Inference service google-gemma-4-31b-it-v30 ready after 224.0188434123993s
Pipeline stage VLLMDeployer completed in 226.46s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 14.81768012046814s
Received healthy response to inference request in 4.190767049789429s
Received healthy response to inference request in 4.27469277381897s
Received healthy response to inference request in 4.164482355117798s
Received healthy response to inference request in 4.184171438217163s
2026-04-08T17:19:55.217887+00:00 monitor updated for google-gemma-4-31b-it_v30
Received healthy response to inference request in 10.629886150360107s
Received healthy response to inference request in 4.123170375823975s
Received healthy response to inference request in 5.440117359161377s
Received healthy response to inference request in 12.110343217849731s
Received healthy response to inference request in 4.188653230667114s
Received healthy response to inference request in 4.15369176864624s
Received healthy response to inference request in 4.144957780838013s
Received healthy response to inference request in 4.14884614944458s
Received healthy response to inference request in 4.250402450561523s
Received healthy response to inference request in 4.133734464645386s
2026-04-08T17:20:55.430024+00:00 monitor updated for google-gemma-4-31b-it_v30
Received healthy response to inference request in 15.608065366744995s
Received healthy response to inference request in 4.245303392410278s
Received healthy response to inference request in 4.103506326675415s
Received healthy response to inference request in 14.28876519203186s
Received healthy response to inference request in 4.341130256652832s
Received healthy response to inference request in 4.158815860748291s
Received healthy response to inference request in 4.14483380317688s
Received healthy response to inference request in 4.729858636856079s
Received healthy response to inference request in 4.5029988288879395s
Received healthy response to inference request in 4.133981943130493s
Received healthy response to inference request in 4.15725302696228s
2026-04-08T17:21:55.636985+00:00 monitor updated for google-gemma-4-31b-it_v30
Received healthy response to inference request in 4.299816608428955s
Received healthy response to inference request in 4.181836843490601s
Received healthy response to inference request in 4.225608587265015s
Received healthy response to inference request in 4.2210917472839355s
30 requests
0 failed requests
5th percentile: 4.12792421579361
10th percentile: 4.133957195281982
20th percentile: 4.148068475723266
30th percentile: 4.158347010612488
40th percentile: 4.183237600326538
50th percentile: 4.205929398536682
60th percentile: 4.2473430156707765
70th percentile: 4.312210702896118
80th percentile: 4.87191038131714
90th percentile: 12.328185415267948
95th percentile: 14.579668402671812
99th percentile: 15.378853645324707
mean time: 5.80994877020518
Pipeline stage StressChecker completed in 180.97s
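Note on the StressChecker summary above: the logged percentiles are consistent with linear interpolation between the sorted latencies (the default behaviour of numpy.percentile), although the log does not say which routine StressChecker actually uses. The following is a minimal sketch under that assumption. The names `latencies`, `percentile`, and `summarize` are hypothetical and do not come from the pipeline; the values are copied from the 30 "Received healthy response" lines.

```python
import math

# Latencies in seconds, copied from the "Received healthy response" log lines.
latencies = [
    14.81768012046814, 4.190767049789429, 4.27469277381897,
    4.164482355117798, 4.184171438217163, 10.629886150360107,
    4.123170375823975, 5.440117359161377, 12.110343217849731,
    4.188653230667114, 4.15369176864624, 4.144957780838013,
    4.14884614944458, 4.250402450561523, 4.133734464645386,
    15.608065366744995, 4.245303392410278, 4.103506326675415,
    14.28876519203186, 4.341130256652832, 4.158815860748291,
    4.14483380317688, 4.729858636856079, 4.5029988288879395,
    4.133981943130493, 4.15725302696228, 4.299816608428955,
    4.181836843490601, 4.225608587265015, 4.2210917472839355,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumption: this mirrors the interpolation that the logged numbers
    appear to follow; it is not taken from the pipeline's source code.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


def summarize(values):
    """Build the same summary fields that the log prints."""
    qs = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99]
    return {
        "requests": len(values),
        "percentiles": {q: percentile(values, q) for q in qs},
        "mean": sum(values) / len(values),
    }


stats = summarize(latencies)
print(f"{stats['requests']} requests")
for q, value in stats["percentiles"].items():
    print(f"{q}th percentile: {value}")
print(f"mean time: {stats['mean']}")
```

Read this way, the median is about 4.2s while the 90th percentile is about 12.3s, so the mean of about 5.8s is pulled up by the five requests that took longer than 10s.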
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.23s
Shutdown handler de-registered
google-gemma-4-31b-it_v30 status is now deployed due to DeploymentManager action
google-gemma-4-31b-it_v30 status is now inactive due to auto deactivation removed underperforming models