Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-2a6f-69d4-linear-43777-v3-uploader
Waiting for job on chaiml-2a6f-69d4-linear-43777-v3-uploader to finish
chaiml-2a6f-69d4-linear-43777-v3-uploader: Using quantization_mode: none
chaiml-2a6f-69d4-linear-43777-v3-uploader: Downloading snapshot of ChaiML/2a6f-69d4-linear-w01-FP8...
chaiml-2a6f-69d4-linear-43777-v3-uploader:
Fetching 15 files: 0%| | 0/15 [00:00<?, ?it/s]
Fetching 15 files: 7%|▋ | 1/15 [00:00<00:03, 3.73it/s]
Fetching 15 files: 33%|███▎ | 5/15 [00:10<00:22, 2.26s/it]
Fetching 15 files: 47%|████▋ | 7/15 [00:12<00:13, 1.67s/it]
Fetching 15 files: 100%|██████████| 15/15 [00:12<00:00, 1.24it/s]
chaiml-2a6f-69d4-linear-43777-v3-uploader: Downloaded in 12.220s
chaiml-2a6f-69d4-linear-43777-v3-uploader: Processed model ChaiML/2a6f-69d4-linear-w01-FP8 in 21.528s
chaiml-2a6f-69d4-linear-43777-v3-uploader: creating bucket guanaco-vllm-models
chaiml-2a6f-69d4-linear-43777-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-2a6f-69d4-linear-43777-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-2a6f-69d4-linear-43777-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-2a6f-69d4-linear-43777-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-2a6f-69d4-linear-43777-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-2a6f-69d4-linear-43777-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-2a6f-69d4-linear-43777-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-2a6f-69d4-linear-43777-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-2a6f-69d4-linear-43777-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-2a6f-69d4-linear-43777-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-2a6f-69d4-linear-43777-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-2a6f-69d4-linear-43777-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-2a6f-69d4-linear-43777-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-2a6f-69d4-linear-43777-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/.gitattributes
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/chat_template.jinja
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/config.json
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/special_tokens_map.json
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/generation_config.json
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/recipe.yaml
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/model.safetensors.index.json
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/tokenizer_config.json
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/tokenizer.json
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/model-00006-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/model-00005-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/model-00005-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/model-00004-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/model-00004-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/model-00002-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/model-00002-of-00006.safetensors
chaiml-2a6f-69d4-linear-43777-v3-uploader: cp /dev/shm/model_output/model-00003-of-00006.safetensors s3://guanaco-vllm-models/chaiml-2a6f-69d4-linear-43777-v3/model-00003-of-00006.safetensors
admin requested tearing down of qwen-qwen3-4b-instruct-2507_v1
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Job chaiml-2a6f-69d4-linear-43777-v3-uploader completed after 137.62s with status: succeeded
run pipeline stage %s
Stopping job with name chaiml-2a6f-69d4-linear-43777-v3-uploader
Running pipeline stage MKMLDeleter
Pipeline stage VLLMUploader completed in 146.87s
%s, retrying in %s seconds...
run pipeline stage %s
%s, retrying in %s seconds...
Running pipeline stage VLLMTemplater
clean up pipeline due to error=TeardownError('401\nReason: Unauthorized\nHTTP response headers: HTTPHeaderDict({\'Audit-Id\': \'048afaf8-c670-4213-a20b-7336f15e2e69\', \'Cache-Control\': \'no-cache, private\', \'Content-Type\': \'application/json\', \'Date\': \'Sat, 07 Feb 2026 18:14:58 GMT\', \'Content-Length\': \'129\'})\nHTTP response body: b\'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Unauthorized","reason":"Unauthorized","code":401}\\n\'\nOriginal traceback: \n File "/root/miniconda3/envs/guanaco/lib/python3.11/site-packages/kubernetes/dynamic/client.py", line 55, in inner\n resp = func(self, *args, **kwargs)\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n File "/root/miniconda3/envs/guanaco/lib/python3.11/site-packages/kubernetes/dynamic/client.py", line 273, in request\n api_response = self.client.call_api(\n ^^^^^^^^^^^^^^^^^^^^^\n\n File "/root/miniconda3/envs/guanaco/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 348, in call_api\n return self.__call_api(resource_path, method,\n ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\n\n File "/root/miniconda3/envs/guanaco/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 180, in __call_api\n response_data = self.request(\n ^^^^^^^^^^^^^\n\n File "/root/miniconda3/envs/guanaco/lib/python3.11/site-packages/kubernetes/client/api_client.py", line 373, in request\n return self.rest_client.GET(url,\n ^^^^^^^^^^^^^^^^^^^^^^^^^\n\n File "/root/miniconda3/envs/guanaco/lib/python3.11/site-packages/kubernetes/client/rest.py", line 244, in GET\n return self.request("GET", url,\n ^^^^^^^^^^^^^^^^^^^^^^^^\n\n File "/root/miniconda3/envs/guanaco/lib/python3.11/site-packages/kubernetes/client/rest.py", line 238, in request\n raise ApiException(http_resp=r)\n')
Pipeline stage VLLMTemplater completed in 2.27s
Shutdown handler de-registered
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2a6f-69d4-linear-43777-v3
Waiting for inference service chaiml-2a6f-69d4-linear-43777-v3 to be ready
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
Inference service chaiml-2a6f-69d4-linear-43777-v3 ready after 251.16863632202148s
Pipeline stage VLLMDeployer completed in 251.86s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9747836589813232s
Received healthy response to inference request in 1.5688283443450928s
Received healthy response to inference request in 1.5576086044311523s
Received healthy response to inference request in 1.731670618057251s
Received healthy response to inference request in 2.07262921333313s
Received healthy response to inference request in 1.9432923793792725s
Received healthy response to inference request in 1.4948039054870605s
Received healthy response to inference request in 2.2209184169769287s
Received healthy response to inference request in 1.655045747756958s
Received healthy response to inference request in 1.9276454448699951s
Received healthy response to inference request in 1.7420544624328613s
Received healthy response to inference request in 1.8618502616882324s
Received healthy response to inference request in 1.7999341487884521s
Received healthy response to inference request in 1.832240104675293s
Received healthy response to inference request in 1.5107288360595703s
Received healthy response to inference request in 1.5255107879638672s
Received healthy response to inference request in 1.6126952171325684s
Received healthy response to inference request in 1.8005897998809814s
Received healthy response to inference request in 1.6871545314788818s
Received healthy response to inference request in 1.658782958984375s
Received healthy response to inference request in 1.4162743091583252s
Received healthy response to inference request in 1.8394322395324707s
Received healthy response to inference request in 1.4557032585144043s
Received healthy response to inference request in 1.6544609069824219s
Received healthy response to inference request in 1.4011478424072266s
Received healthy response to inference request in 2.139953374862671s
Received healthy response to inference request in 1.7893345355987549s
Received healthy response to inference request in 1.3662419319152832s
Received healthy response to inference request in 1.6208384037017822s
Received healthy response to inference request in 1.4258110523223877s
30 requests
0 failed requests
5th percentile: 1.407954752445221
10th percentile: 1.4248573780059814
20th percentile: 1.5075438499450684
30th percentile: 1.5654624223709106
40th percentile: 1.641011905670166
50th percentile: 1.6729687452316284
60th percentile: 1.7609664916992187
70th percentile: 1.8100848913192749
80th percentile: 1.8750092983245852
90th percentile: 1.984568214416504
95th percentile: 2.109657502174377
99th percentile: 2.197438554763794
mean time: 1.7095988432566325
Pipeline stage StressChecker completed in 72.16s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 4.40s
Shutdown handler de-registered
chaiml-2a6f-69d4-linear_43777_v3 status is now deployed due to DeploymentManager action
chaiml-2a6f-69d4-linear_43777_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-2a6f-69d4-linear_43777_v3 status is now torndown due to DeploymentManager action