Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name google-gemma-4-31b-it-v48-uploader
Waiting for job on google-gemma-4-31b-it-v48-uploader to finish
google-gemma-4-31b-it-v48-uploader: Using quantization_mode: none
google-gemma-4-31b-it-v48-uploader: Downloading snapshot of google/gemma-4-31B-it...
google-gemma-4-31b-it-v48-uploader: HTTP Error 503 thrown while requesting HEAD https://huggingface.co/google/gemma-4-31B-it/resolve/439edf5652646a0d1bd8b46bfdc1d3645761a445/README.md
google-gemma-4-31b-it-v48-uploader: Retrying in 1s [Retry 1/5].
google-gemma-4-31b-it-v48-uploader: HTTP Error 503 thrown while requesting HEAD https://huggingface.co/google/gemma-4-31B-it/resolve/439edf5652646a0d1bd8b46bfdc1d3645761a445/chat_template.jinja
google-gemma-4-31b-it-v48-uploader: Retrying in 1s [Retry 1/5].
google-gemma-4-31b-it-v48-uploader: HTTP Error 503 thrown while requesting HEAD https://huggingface.co/google/gemma-4-31B-it/resolve/439edf5652646a0d1bd8b46bfdc1d3645761a445/README.md
google-gemma-4-31b-it-v48-uploader: Retrying in 2s [Retry 2/5].
google-gemma-4-31b-it-v48-uploader: HTTP Error 502 thrown while requesting HEAD https://huggingface.co/google/gemma-4-31B-it/resolve/439edf5652646a0d1bd8b46bfdc1d3645761a445/README.md
google-gemma-4-31b-it-v48-uploader: Retrying in 4s [Retry 3/5].
google-gemma-4-31b-it-v48-uploader: HTTP Error 504 thrown while requesting HEAD https://huggingface.co/google/gemma-4-31B-it/resolve/439edf5652646a0d1bd8b46bfdc1d3645761a445/tokenizer_config.json
google-gemma-4-31b-it-v48-uploader: Retrying in 1s [Retry 1/5].
google-gemma-4-31b-it-v48-uploader: HTTP Error 504 thrown while requesting HEAD https://huggingface.co/google/gemma-4-31B-it/resolve/439edf5652646a0d1bd8b46bfdc1d3645761a445/README.md
google-gemma-4-31b-it-v48-uploader: Retrying in 8s [Retry 4/5].
google-gemma-4-31b-it-v48-uploader: HTTP Error 504 thrown while requesting HEAD https://huggingface.co/google/gemma-4-31B-it/resolve/439edf5652646a0d1bd8b46bfdc1d3645761a445/tokenizer_config.json
google-gemma-4-31b-it-v48-uploader: Retrying in 2s [Retry 2/5].
2026-04-14T15:06:05.253147+00:00 monitor updated for google-gemma-4-31b-it_v48
google-gemma-4-31b-it-v48-uploader: Downloaded in 44.091s
google-gemma-4-31b-it-v48-uploader: Processed model google/gemma-4-31B-it in 54.803s
google-gemma-4-31b-it-v48-uploader: creating bucket guanaco-vllm-models
google-gemma-4-31b-it-v48-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v48-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
google-gemma-4-31b-it-v48-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
google-gemma-4-31b-it-v48-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
google-gemma-4-31b-it-v48-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v48-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v48-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v48-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
google-gemma-4-31b-it-v48-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v48-uploader: if re.search("-\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v48-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
google-gemma-4-31b-it-v48-uploader: if re.search("\.\.", bucket, re.UNICODE):
google-gemma-4-31b-it-v48-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
google-gemma-4-31b-it-v48-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
google-gemma-4-31b-it-v48-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
google-gemma-4-31b-it-v48-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
google-gemma-4-31b-it-v48-uploader: Bucket 's3://guanaco-vllm-models/' created
google-gemma-4-31b-it-v48-uploader: uploading /tmp/model_output to s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/processor_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/processor_config.json
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/README.md s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/README.md
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/chat_template.jinja s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/chat_template.jinja
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/generation_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/generation_config.json
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/config.json
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/tokenizer_config.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/tokenizer_config.json
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/.gitattributes s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/.gitattributes
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/model.safetensors.index.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/model.safetensors.index.json
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/tokenizer.json s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/tokenizer.json
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/model-00002-of-00002.safetensors
2026-04-14T15:07:05.580420+00:00 monitor updated for google-gemma-4-31b-it_v48
google-gemma-4-31b-it-v48-uploader: cp /tmp/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/google-gemma-4-31b-it-v48/default/model-00001-of-00002.safetensors
Job google-gemma-4-31b-it-v48-uploader completed after 148.92s with status: succeeded
Stopping job with name google-gemma-4-31b-it-v48-uploader
Pipeline stage VLLMUploader completed in 149.44s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.11s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.98s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service google-gemma-4-31b-it-v48
Waiting for inference service google-gemma-4-31b-it-v48 to be ready
2026-04-14T15:08:05.713937+00:00 monitor updated for google-gemma-4-31b-it_v48
2026-04-14T15:09:05.802537+00:00 monitor updated for google-gemma-4-31b-it_v48
2026-04-14T15:10:05.908384+00:00 monitor updated for google-gemma-4-31b-it_v48
2026-04-14T15:11:06.137140+00:00 monitor updated for google-gemma-4-31b-it_v48
Inference service google-gemma-4-31b-it-v48 ready after 220.80831003189087s
Pipeline stage VLLMDeployer completed in 221.29s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 10.660383224487305s
Received healthy response to inference request in 11.446459531784058s
Received healthy response to inference request in 2.6283223628997803s
Received healthy response to inference request in 10.803723335266113s
Received healthy response to inference request in 3.850119113922119s
Received healthy response to inference request in 2.4144339561462402s
Received healthy response to inference request in 2.5288679599761963s
Received healthy response to inference request in 2.388171434402466s
2026-04-14T15:12:06.452234+00:00 monitor updated for google-gemma-4-31b-it_v48
Received healthy response to inference request in 2.363091230392456s
Received healthy response to inference request in 2.357357978820801s
Received healthy response to inference request in 2.287808656692505s
Received healthy response to inference request in 10.860254526138306s
Received healthy response to inference request in 2.31982684135437s
Received healthy response to inference request in 2.3681275844573975s
Received healthy response to inference request in 2.245767116546631s
Received healthy response to inference request in 2.5618622303009033s
Received healthy response to inference request in 2.297719717025757s
Received healthy response to inference request in 2.569911479949951s
Received healthy response to inference request in 2.7112574577331543s
Received healthy response to inference request in 2.256559371948242s
Received healthy response to inference request in 9.897586345672607s
Received healthy response to inference request in 2.3112425804138184s
Received healthy response to inference request in 2.6068129539489746s
Received healthy response to inference request in 2.2975592613220215s
Received healthy response to inference request in 2.4574179649353027s
2026-04-14T15:13:06.842031+00:00 monitor updated for google-gemma-4-31b-it_v48
Received healthy response to inference request in 2.437959909439087s
Received healthy response to inference request in 2.333892822265625s
Received healthy response to inference request in 2.293621301651001s
Received healthy response to inference request in 2.315983533859253s
Received healthy response to inference request in 2.5747005939483643s
30 requests
0 failed requests
5th percentile: 2.2706215500831606
10th percentile: 2.2930400371551514
20th percentile: 2.308538007736206
30th percentile: 2.3296730279922486
40th percentile: 2.366113042831421
50th percentile: 2.4261969327926636
60th percentile: 2.542065668106079
70th percentile: 2.584334301948547
80th percentile: 2.9390297889709505
90th percentile: 10.674717235565186
95th percentile: 10.83481549024582
99th percentile: 11.27646008014679
mean time: 3.8482267459233603
Pipeline stage StressChecker completed in 119.39s
Shutdown handler de-registered
google-gemma-4-31b-it_v48 status is now deployed due to DeploymentManager action
google-gemma-4-31b-it_v48 status is now inactive due to auto deactivation removed underperforming models