Shutdown handler not registered because Python interpreter is not running in the main thread
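The line above reflects a documented CPython restriction: `signal.signal` may only be called from the main thread of the main interpreter, so a pipeline whose setup runs on a worker thread has to skip handler registration. A minimal sketch reproducing the check (handler and thread names are hypothetical, not from the pipeline's code):

```python
import signal
import threading

def _shutdown(signum, frame):
    """Hypothetical shutdown handler; never actually invoked here."""

errors = []

def register_from_worker():
    # CPython raises ValueError when signal.signal is called
    # outside the main thread of the main interpreter.
    try:
        signal.signal(signal.SIGTERM, _shutdown)
    except ValueError as exc:
        errors.append(str(exc))

t = threading.Thread(target=register_from_worker)
t.start()
t.join()
# errors now holds one "signal only works in main thread ..." message
```

A robust pipeline typically logs this (as above) and falls back to other cleanup mechanisms rather than failing.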
run pipeline %s
run pipeline stage %s
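The literal `%s` in the two lines above suggests the logging call omitted its arguments: with lazy `%`-style formatting, `logging` emits the format string verbatim when no args are passed. A minimal reproduction:

```python
import io
import logging

buf = io.StringIO()
logger = logging.getLogger("pipeline-demo")
logger.addHandler(logging.StreamHandler(buf))
logger.setLevel(logging.INFO)

logger.info("run pipeline stage %s")                  # args forgotten: "%s" logged verbatim
logger.info("run pipeline stage %s", "VLLMUploader")  # placeholder filled lazily

lines = buf.getvalue().splitlines()
# lines[0] == "run pipeline stage %s"
# lines[1] == "run pipeline stage VLLMUploader"
```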
Running pipeline stage VLLMUploader
Starting job with name openai-gpt-oss-20b-v1-uploader
Waiting for job on openai-gpt-oss-20b-v1-uploader to finish
openai-gpt-oss-20b-v1-uploader: Using quantization_mode: none
openai-gpt-oss-20b-v1-uploader: Downloading snapshot of openai/gpt-oss-20b...
openai-gpt-oss-20b-v1-uploader: Downloaded in 13.813s
openai-gpt-oss-20b-v1-uploader: Processed model openai/gpt-oss-20b in 29.572s
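The download and processing steps above each report a wall-clock duration. A small context manager (hypothetical, not the pipeline's actual code) that produces log lines of that shape, with an injectable clock and sink for testability:

```python
import time
from contextlib import contextmanager

@contextmanager
def timed(label, report=print, clock=time.perf_counter):
    """Run a block and report 'label in X.XXXs', in the style of the
    uploader's 'Downloaded in 13.813s' line."""
    start = clock()
    yield
    report(f"{label} in {clock() - start:.3f}s")

# Usage: the snapshot download (e.g. huggingface_hub's snapshot_download)
# would go inside the with-block; a no-op stands in here.
messages = []
with timed("Downloaded", report=messages.append):
    pass
```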
openai-gpt-oss-20b-v1-uploader: creating bucket guanaco-vllm-models
openai-gpt-oss-20b-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
openai-gpt-oss-20b-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
openai-gpt-oss-20b-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
openai-gpt-oss-20b-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
openai-gpt-oss-20b-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
openai-gpt-oss-20b-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
openai-gpt-oss-20b-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
openai-gpt-oss-20b-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
openai-gpt-oss-20b-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
openai-gpt-oss-20b-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
openai-gpt-oss-20b-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
openai-gpt-oss-20b-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
openai-gpt-oss-20b-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
openai-gpt-oss-20b-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
openai-gpt-oss-20b-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
openai-gpt-oss-20b-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
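The `SyntaxWarning` block above is harmless noise from s3cmd's bundled S3 library: its regex patterns use backslashes like `\.` in ordinary string literals, which recent CPython versions flag as invalid escape sequences. Prefixing the literal with `r` silences the warning without changing the compiled pattern. A sketch of the fix applied to the first warned line:

```python
import re

# Before (warns): re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
# After: a raw string yields the identical regex, escape-clean.
RE_S3_DATESTRING = re.compile(r'\.[0-9]*(?:[Z\-\+]*?)')

match = RE_S3_DATESTRING.search("2026-03-28T14:38:04.984784Z")
# match.group(0) == ".984784" (the trailing lazy group matches empty)
```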
openai-gpt-oss-20b-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
openai-gpt-oss-20b-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/config.json
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/README.md
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/USAGE_POLICY s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/USAGE_POLICY
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/original/config.json s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/original/config.json
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/LICENSE
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/original/dtypes.json s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/original/dtypes.json
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/.gitattributes
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/special_tokens_map.json
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/chat_template.jinja
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/model.safetensors.index.json
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/generation_config.json
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/tokenizer_config.json
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/tokenizer.json
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/model-00002-of-00002.safetensors s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/model-00002-of-00002.safetensors
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/model-00000-of-00002.safetensors s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/model-00000-of-00002.safetensors
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/model-00001-of-00002.safetensors s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/model-00001-of-00002.safetensors
2026-03-28T14:38:04.984784+00:00 monitor updated for openai-gpt-oss-20b_v1
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/metal/model.bin s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/metal/model.bin
openai-gpt-oss-20b-v1-uploader: cp /dev/shm/model_output/original/model.safetensors s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/original/model.safetensors
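Each `cp` line maps a file under the local upload root to the same relative key under the bucket prefix, preserving subdirectories such as `original/` and `metal/`. A helper computing that mapping (hypothetical; the actual uploader appears to shell out to s3cmd):

```python
import os

def s3_destination(local_path, local_root, bucket, prefix):
    """Mirror a local file's path under local_root into an S3 URI,
    keeping the directory layout intact."""
    rel = os.path.relpath(local_path, local_root)
    return f"s3://{bucket}/{prefix}/{rel}"

dest = s3_destination(
    "/dev/shm/model_output/original/config.json",
    "/dev/shm/model_output",
    "guanaco-vllm-models",
    "openai-gpt-oss-20b-v1/default",
)
# dest == "s3://guanaco-vllm-models/openai-gpt-oss-20b-v1/default/original/config.json"
```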
Job openai-gpt-oss-20b-v1-uploader completed after 73.51s with status: succeeded
Stopping job with name openai-gpt-oss-20b-v1-uploader
Pipeline stage VLLMUploader completed in 73.97s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 3.41s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service openai-gpt-oss-20b-v1
Waiting for inference service openai-gpt-oss-20b-v1 to be ready
Failed to get response for submission chaiml-gspo-glm47-combi_10268_v1: ('http://chaiml-gspo-glm47-combi-10268-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-28T14:39:05.074250+00:00 monitor updated for openai-gpt-oss-20b_v1
Failed to get response for submission chaiml-gspo-glm47-chai-_76408_v1: ('http://chaiml-gspo-glm47-chai-76408-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-28T14:40:05.196387+00:00 monitor updated for openai-gpt-oss-20b_v1
2026-03-28T14:41:05.307099+00:00 monitor updated for openai-gpt-oss-20b_v1
Inference service openai-gpt-oss-20b-v1 ready after 170.56s
Pipeline stage VLLMDeployer completed in 171.63s
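"Waiting for inference service ... to be ready" is a poll-until-ready loop with a timeout; the elapsed figure reported above is what that loop measured. A sketch with injectable clock and sleep so it can be exercised without real waiting (all names hypothetical):

```python
import itertools
import time

def wait_until_ready(check, timeout=600.0, interval=5.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Poll check() until it returns True; return elapsed seconds.
    Raises TimeoutError if the service never becomes ready."""
    start = clock()
    while True:
        if check():
            return clock() - start
        if clock() - start >= timeout:
            raise TimeoutError("inference service not ready in time")
        sleep(interval)

# Example: ready on the third probe, fake clock advancing 5s per call.
fake_now = itertools.count(step=5)   # 0, 5, 10, ...
probes = iter([False, False, True])
elapsed = wait_until_ready(
    check=lambda: next(probes),
    clock=lambda: next(fake_now),
    sleep=lambda s: None,
)
# elapsed == 15
```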
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.954s
Received healthy response to inference request in 2.462s
Received healthy response to inference request in 2.153s
Received healthy response to inference request in 2.148s
Received healthy response to inference request in 0.940s
Received healthy response to inference request in 1.035s
Received healthy response to inference request in 1.036s
Received healthy response to inference request in 0.345s
Received healthy response to inference request in 0.720s
Received healthy response to inference request in 0.352s
Received healthy response to inference request in 2.117s
Received healthy response to inference request in 0.357s
Received healthy response to inference request in 0.376s
Received healthy response to inference request in 1.436s
Received healthy response to inference request in 0.669s
Received healthy response to inference request in 0.864s
Received healthy response to inference request in 1.041s
Received healthy response to inference request in 0.509s
Received healthy response to inference request in 1.148s
Received healthy response to inference request in 0.517s
Received healthy response to inference request in 0.601s
Received healthy response to inference request in 0.686s
Received healthy response to inference request in 0.527s
Received healthy response to inference request in 0.794s
Received healthy response to inference request in 0.838s
Received healthy response to inference request in 0.510s
Received healthy response to inference request in 0.906s
Received healthy response to inference request in 0.397s
Received healthy response to inference request in 0.552s
Received healthy response to inference request in 0.507s
30 requests
0 failed requests
5th percentile: 0.354s
10th percentile: 0.374s
20th percentile: 0.509s
30th percentile: 0.524s
40th percentile: 0.642s
50th percentile: 0.757s
60th percentile: 0.881s
70th percentile: 1.035s
80th percentile: 1.206s
90th percentile: 2.149s
95th percentile: 2.323s
99th percentile: 2.811s
mean time: 0.983s
Pipeline stage StressChecker completed in 33.86s
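The StressChecker summary is a standard latency report: the figures appear consistent with linearly interpolated percentiles over the 30 sorted samples, plus the arithmetic mean. A sketch of that computation (hypothetical implementation, matching numpy's default "linear" method):

```python
def percentile(samples, pct):
    """Linear-interpolation percentile (numpy's default method)."""
    xs = sorted(samples)
    k = (len(xs) - 1) * pct / 100.0
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

def mean(samples):
    return sum(samples) / len(samples)

# e.g. percentile([1.0, 2.0, 3.0, 4.0], 50) == 2.5
```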
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.82s
Shutdown handler de-registered
openai-gpt-oss-20b_v1 status is now deployed due to DeploymentManager action
openai-gpt-oss-20b_v1 status is now inactive due to admin request
openai-gpt-oss-20b_v1 status is now torndown due to DeploymentManager action