submission_id: chaiml-glm-air-4-5-sft-lr1-e2_v1
developer_uid: rirv938
status: torndown
model_repo: ChaiML/glm_air_4_5_sft_lr1_e2
generation_params: {'temperature': 1.0, 'top_p': 0.95, 'min_p': 0.05, 'top_k': 60, 'presence_penalty': 0.1, 'frequency_penalty': 0.0, 'stopping_words': ['You:', '<|im_end|>', '<|im_start|>', '</s>', '###'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': '[gMASK]<sop><|system|>\nRespond as a high quality storyteller.<|user|>\n', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '<|assistant|>\n<think></think>\n{bot_name}:', 'truncate_by_message': False}
timestamp: 2026-02-22T00:51:08+00:00
model_name: chaiml-glm-air-4-5-sft-lr1-e2_v1
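The formatter entry above holds plain Python format strings. As a rough sketch of how such templates might be combined into a single prompt, the helper below joins them in the order memory, prompt, conversation turns, response. That order, the `build_prompt` name, and the sample names and turns are assumptions for illustration; the log does not show the pipeline's own assembly code.

```python
# Templates copied from the formatter entry in the metadata above.
FORMATTER = {
    "memory_template": "[gMASK]<sop><|system|>\nRespond as a high quality storyteller.<|user|>\n",
    "prompt_template": "",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "<|assistant|>\n<think></think>\n{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, turns):
    """Hypothetical helper: concatenate the templates into one prompt string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". The ordering used here is an assumption.
    """
    parts = [formatter["memory_template"], formatter["prompt_template"]]
    for speaker, message in turns:
        key = "bot_template" if speaker == "bot" else "user_template"
        parts.append(
            formatter[key].format(
                bot_name=bot_name, user_name=user_name, message=message
            )
        )
    # The response template leaves the prompt open on the bot's next line.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


prompt = build_prompt(FORMATTER, "Aria", "Sam", [("user", "Hi"), ("bot", "Hello")])
print(prompt)
```

The empty `<think></think>` pair in the response template suggests the model is prompted to answer directly, with the reply starting right after `{bot_name}:`.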
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader
Waiting for job on chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader to finish
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: Using quantization_mode: none
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: Downloading snapshot of ChaiML/glm_air_4_5_sft_lr1_e3...
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: Using quantization_mode: none
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: Downloading snapshot of ChaiML/glm_air_4_5_sft_lr1_e2...
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: Downloaded in 78.330s
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: Downloaded in 81.433s
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: Processed model ChaiML/glm_air_4_5_sft_lr1_e3 in 155.983s
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: creating bucket guanaco-vllm-models
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/chat_template.jinja
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/special_tokens_map.json
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/.gitattributes
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model.safetensors.index.json
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/args.json
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/tokenizer.json
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/config.json
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/tokenizer_config.json
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/README.md
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: Processed model ChaiML/glm_air_4_5_sft_lr1_e2 in 160.177s
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: creating bucket guanaco-vllm-models
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
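The SyntaxWarning lines above come from regular expressions written as ordinary string literals, where sequences such as `\.` and `\w` are not valid string escapes. They are harmless here, and the usual remedy is a raw string. The sketch below rewrites one of the quoted patterns that way; the constant and function names are illustrative and are not part of the S3 package.

```python
import re

# Same pattern as the Utils.py:240 line quoted in the warning, written as a
# raw string so the backslash reaches the regex engine without a warning.
RE_BUCKET_INVALID = re.compile(r"([^a-z0-9\.-])", re.UNICODE)


def first_invalid_bucket_char(bucket):
    """Return the first character outside [a-z0-9.-], or None if all are valid."""
    match = RE_BUCKET_INVALID.search(bucket)
    return match.group(1) if match else None


print(first_invalid_bucket_char("guanaco-vllm-models"))
print(first_invalid_bucket_char("Bad_Bucket"))
```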
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/.gitattributes
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/chat_template.jinja
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/args.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/args.json
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/README.md
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/model.safetensors.index.json
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/config.json
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/tokenizer_config.json
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/special_tokens_map.json
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/tokenizer.json
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00043-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00043-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00036-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00036-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00026-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00026-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00010-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00010-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00014-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00014-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/model-00043-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/model-00043-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/model-00026-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/model-00026-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/model-00004-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/model-00004-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/model-00003-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/model-00003-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/model-00025-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/model-00025-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/model-00035-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/model-00035-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00008-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00008-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader: cp /dev/shm/model_output/model-00011-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e2-v1/default/model-00011-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00032-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00032-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00039-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00039-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00011-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00011-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00040-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00040-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00042-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00042-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00025-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00025-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00003-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00003-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00037-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00037-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00027-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00027-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00024-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00024-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00017-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00017-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00004-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00004-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00021-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00021-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00038-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00038-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00019-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00019-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00001-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00001-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00023-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00023-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00035-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00035-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00041-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00041-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00022-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00022-of-00043.safetensors
chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader: cp /dev/shm/model_output/model-00006-of-00043.safetensors s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default/model-00006-of-00043.safetensors
Job chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader completed after 233.68s with status: succeeded
Stopping job with name chaiml-glm-air-4-5-sft-lr1-e3-v1-uploader
Pipeline stage VLLMUploader completed in 235.39s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.47s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-glm-air-4-5-sft-lr1-e3-v1
Waiting for inference service chaiml-glm-air-4-5-sft-lr1-e3-v1 to be ready
Job chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader completed after 233.44s with status: succeeded
Stopping job with name chaiml-glm-air-4-5-sft-lr1-e2-v1-uploader
Pipeline stage VLLMUploader completed in 235.54s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.57s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-glm-air-4-5-sft-lr1-e2-v1
Waiting for inference service chaiml-glm-air-4-5-sft-lr1-e2-v1 to be ready
Tearing down inference service chaiml-glm-air-4-5-sft-lr1-e3-v1
clean up pipeline due to error=DeploymentError('Timeout to start the InferenceService chaiml-glm-air-4-5-sft-lr1-e3-v1. The InferenceService is as following: {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'kind\': \'InferenceService\', \'metadata\': {\'annotations\': {\'autoscaling.knative.dev/class\': \'hpa.autoscaling.knative.dev\', \'autoscaling.knative.dev/container-concurrency-target-percentage\': \'70\', \'autoscaling.knative.dev/initial-scale\': \'5\', \'autoscaling.knative.dev/max-scale-down-rate\': \'1.1\', \'autoscaling.knative.dev/max-scale-up-rate\': \'2\', \'autoscaling.knative.dev/metric\': \'mean_pod_latency_ms_v2\', \'autoscaling.knative.dev/panic-threshold-percentage\': \'650\', \'autoscaling.knative.dev/panic-window-percentage\': \'35\', \'autoscaling.knative.dev/scale-down-delay\': \'30s\', \'autoscaling.knative.dev/scale-to-zero-grace-period\': \'10m\', \'autoscaling.knative.dev/stable-window\': \'180s\', \'autoscaling.knative.dev/target\': \'4000\', \'autoscaling.knative.dev/target-burst-capacity\': \'-1\', \'autoscaling.knative.dev/tick-interval\': \'15s\', \'features.knative.dev/http-full-duplex\': \'Enabled\', \'networking.knative.dev/ingress-class\': \'istio.ingress.networking.knative.dev\', \'serving.knative.dev/progress-deadline\': \'20m\'}, \'creationTimestamp\': \'2026-02-22T00:27:53Z\', \'finalizers\': [\'inferenceservice.finalizers\'], \'generation\': 1, \'labels\': {\'knative.coreweave.cloud/ingress\': \'istio.ingress.networking.knative.dev\', \'prometheus.k.chaiverse.com\': \'true\', \'qos.coreweave.cloud/latency\': \'low\'}, \'managedFields\': [{\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:annotations\': {\'.\': {}, \'f:autoscaling.knative.dev/class\': {}, \'f:autoscaling.knative.dev/container-concurrency-target-percentage\': {}, \'f:autoscaling.knative.dev/initial-scale\': {}, \'f:autoscaling.knative.dev/max-scale-down-rate\': {}, 
\'f:autoscaling.knative.dev/max-scale-up-rate\': {}, \'f:autoscaling.knative.dev/metric\': {}, \'f:autoscaling.knative.dev/panic-threshold-percentage\': {}, \'f:autoscaling.knative.dev/panic-window-percentage\': {}, \'f:autoscaling.knative.dev/scale-down-delay\': {}, \'f:autoscaling.knative.dev/scale-to-zero-grace-period\': {}, \'f:autoscaling.knative.dev/stable-window\': {}, \'f:autoscaling.knative.dev/target\': {}, \'f:autoscaling.knative.dev/target-burst-capacity\': {}, \'f:autoscaling.knative.dev/tick-interval\': {}, \'f:features.knative.dev/http-full-duplex\': {}, \'f:networking.knative.dev/ingress-class\': {}, \'f:serving.knative.dev/progress-deadline\': {}}, \'f:labels\': {\'.\': {}, \'f:knative.coreweave.cloud/ingress\': {}, \'f:prometheus.k.chaiverse.com\': {}, \'f:qos.coreweave.cloud/latency\': {}}}, \'f:spec\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:affinity\': {\'.\': {}, \'f:nodeAffinity\': {\'.\': {}, \'f:tion\': {}, \'f:requiredDuringSchedulingIgnoredDuringExecution\': {}}}, \'f:containerConcurrency\': {}, \'f:containers\': {}, \'f:imagePullSecrets\': {}, \'f:maxReplicas\': {}, \'f:minReplicas\': {}, \'f:priorityClassName\': {}, \'f:timeout\': {}, \'f:volumes\': {}}}}, \'manager\': \'OpenAPI-Generator\', \'operation\': \'Update\', \'time\': \'2026-02-22T00:27:53Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:finalizers\': {\'.\': {}, \'v:"inferenceservice.finalizers"\': {}}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'time\': \'2026-02-22T00:27:53Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:status\': {\'.\': {}, \'f:components\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:latestCreatedRevision\': {}}}, \'f:conditions\': {}, \'f:modelStatus\': {\'.\': {}, \'f:states\': {\'.\': {}, \'f:activeModelState\': {}, \'f:targetModelState\': {}}, \'f:transitionStatus\': {}}, \'f:observedGeneration\': {}}}, 
\'manager\': \'manager\', \'operation\': \'Update\', \'subresource\': \'status\', \'time\': \'2026-02-22T00:27:54Z\'}], \'name\': \'chaiml-glm-air-4-5-sft-lr1-e3-v1\', \'namespace\': \'tenant-chaiml-guanaco\', \'resourceVersion\': \'464473655\', \'uid\': \'e45b155a-ddaf-41ff-8e03-85371494582e\'}, \'spec\': {\'predictor\': {\'affinity\': {\'nodeAffinity\': {\'tion\': [{\'preference\': {\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'A100_NVLINK_80GB\']}]}, \'weight\': 5}], \'requiredDuringSchedulingIgnoredDuringExecution\': {\'nodeSelectorTerms\': [{\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'A100_NVLINK_80GB\']}]}]}}}, \'containerConcurrency\': 0, \'containers\': [{\'args\': [\'serve\', \'s3://guanaco-vllm-models/chaiml-glm-air-4-5-sft-lr1-e3-v1/default\', \'--port\', \'8080\', \'--tensor-parallel-size\', \'4\', \'--max-model-len\', \'3000\', \'--max-num-batched-tokens\', \'3000\', \'--max-num-seqs\', \'128\', \'--gpu-memory-utilization\', \'0.95\', \'--trust-remote-code\', \'--load-format\', \'runai_streamer\', \'--served-model-name\', \'ChaiML/glm_air_4_5_sft_lr1_e3\'], \'env\': [{\'name\': \'RESERVE_MEMORY\', \'value\': \'2048\'}, {\'name\': \'DOWNLOAD_TO_LOCAL\', \'value\': \'/dev/shm/model_cache\'}, {\'name\': \'NUM_GPUS\', \'value\': \'4\'}, {\'name\': \'VLLM_ASSETS_CACHE\', \'value\': \'/code/vllm_assets_cache\'}, {\'name\': \'RUNAI_STREAMER_S3_USE_VIRTUAL_ADDRESSING\', \'value\': \'1\'}, {\'name\': \'RUNAI_STREAMER_CONCURRENCY\', \'value\': \'1\'}, {\'name\': \'AWS_EC2_METADATA_DISABLED\', \'value\': \'true\'}, {\'name\': \'AWS_ACCESS_KEY_ID\', \'value\': \'CWZAGMHZXKZRFGJK\'}, {\'name\': \'AWS_SECRET_ACCESS_KEY\', \'value\': \'cwoAeWzp46q4O0sTNXOEuZ1MvZzKEFlS9DtEhnTldKp\'}, {\'name\': \'AWS_ENDPOINT_URL\', \'value\': \'https://cwobject.com\'}, {\'name\': \'HF_TOKEN\', \'valueFrom\': {\'secretKeyRef\': {\'key\': \'token\', \'name\': \'hf-token\'}}}], 
\'image\': \'gcr.io/chai-959f8/vllm:v0.13.1\', \'imagePullPolicy\': \'IfNotPresent\', \'name\': \'kserve-container\', \'readinessProbe\': {\'failureThreshold\': 1, \'httpGet\': {\'path\': \'/v1/models\', \'port\': 8080}, \'initialDelaySeconds\': 60, \'periodSeconds\': 10, \'successThreshold\': 1, \'timeoutSeconds\': 5}, \'resources\': {\'limits\': {\'cpu\': \'8\', \'memory\': \'256Gi\', \'nvidia.com/gpu\': \'4\'}, \'requests\': {\'cpu\': \'8\', \'memory\': \'256Gi\', \'nvidia.com/gpu\': \'4\'}}, \'volumeMounts\': [{\'mountPath\': \'/dev/shm\', \'name\': \'shared-memory-cache\'}, {\'mountPath\': \'/root/.cache\', \'name\': \'cache-volume\'}]}], \'imagePullSecrets\': [{\'name\': \'docker-creds\'}], \'maxReplicas\': 40, \'minReplicas\': 0, \'priorityClassName\': \'creator-studio\', \'timeout\': 60, \'volumes\': [{\'emptyDir\': {\'medium\': \'Memory\', \'sizeLimit\': \'256Gi\'}, \'name\': \'shared-memory-cache\'}, {\'name\': \'cache-volume\', \'persistentVolumeClaim\': {\'claimName\': \'cache-pvc\'}}]}}, \'status\': {\'components\': {\'predictor\': {\'latestCreatedRevision\': \'chaiml-glm-air-4-5-sft-lr1-e3-v1-predictor-00001\'}}, \'conditions\': [{\'lastTransitionTime\': \'2026-02-22T00:27:54Z\', \'reason\': \'PredictorConfigurationReady not ready\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'LatestDeploymentReady\'}, {\'lastTransitionTime\': \'2026-02-22T00:27:54Z\', \'message\': \'Revision "chaiml-glm-air-4-5-sft-lr1-e3-v1-predictor-00001" failed with message: 0/82 nodes are available: 1 node(s) were unschedulable, 15 Insufficient nvidia.com/gpu, 66 node(s) didn\\\'t match Pod\\\'s node affinity/selector. 
no new claims to deallocate, preemption: 0/82 nodes are available: 15 Insufficient nvidia.com/gpu, 67 Preemption is not helpful for scheduling..\', \'reason\': \'RevisionFailed\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'PredictorConfigurationReady\'}, {\'lastTransitionTime\': \'2026-02-22T00:27:54Z\', \'message\': \'Configuration "chaiml-glm-air-4-5-sft-lr1-e3-v1-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'status\': \'False\', \'type\': \'PredictorReady\'}, {\'lastTransitionTime\': \'2026-02-22T00:27:54Z\', \'message\': \'Configuration "chaiml-glm-air-4-5-sft-lr1-e3-v1-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'PredictorRouteReady\'}, {\'lastTransitionTime\': \'2026-02-22T00:27:54Z\', \'message\': \'Configuration "chaiml-glm-air-4-5-sft-lr1-e3-v1-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'status\': \'False\', \'type\': \'Ready\'}, {\'lastTransitionTime\': \'2026-02-22T00:27:54Z\', \'reason\': \'PredictorRouteReady not ready\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'RoutesReady\'}], \'modelStatus\': {\'states\': {\'activeModelState\': \'\', \'targetModelState\': \'Pending\'}, \'transitionStatus\': \'InProgress\'}, \'observedGeneration\': 1}}')
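The root cause in the error above sits in the PredictorConfigurationReady message: none of the 82 nodes could host the pod, which requests 4 GPUs on A100_NVLINK_80GB nodes. A minimal sketch for pulling the per-reason node counts out of such a message follows. The function name is hypothetical and the parsing assumes the comma-separated layout seen in this log; it is not a general parser for scheduler output.

```python
import re

# First sentence of the scheduler message quoted in the error above.
MESSAGE = (
    "0/82 nodes are available: 1 node(s) were unschedulable, "
    "15 Insufficient nvidia.com/gpu, "
    "66 node(s) didn't match Pod's node affinity/selector."
)


def summarize_scheduling_failure(message):
    """Split an 'N/M nodes are available: ...' message into counted reasons."""
    head = re.match(r"(\d+)/(\d+) nodes are available: (.*)", message)
    if head is None:
        return None
    reasons = {}
    for part in head.group(3).rstrip(".").split(", "):
        item = re.match(r"(\d+) (.+)", part)
        if item:
            reasons[item.group(2)] = int(item.group(1))
    return {
        "available": int(head.group(1)),
        "total": int(head.group(2)),
        "reasons": reasons,
    }


summary = summarize_scheduling_failure(MESSAGE)
for reason, count in summary["reasons"].items():
    print(f"{count:3d}  {reason}")
```

The three counts add up to the 82 nodes reported: 66 nodes are excluded by the node affinity, 15 match it but lack free GPUs, and 1 is unschedulable.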
run pipeline stage %s
Running pipeline stage VLLMDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage VLLMDeleter completed in 0.64s
run pipeline stage %s
Running pipeline stage VLLMModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
%s, retrying in %s seconds...
Cleaning model data from S3
Cleaning model data from model cache
%s, retrying in %s seconds...
Cleaning model data from S3
Tearing down inference service chaiml-glm-air-4-5-sft-lr1-e2-v1
Cleaning model data from model cache
Shutdown handler de-registered
TeardownError('An error occurred (PathStyleRequestNotAllowed) when calling the ListObjects operation: The path style requests are not allowed for this method, please switch to hostname-based requests.')
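The TeardownError above says the object store rejected a path-style ListObjects request and expects hostname-based (virtual-hosted) requests, where the bucket name is part of the host rather than the path. The sketch below only illustrates the difference between the two URL shapes, using the endpoint and bucket that appear elsewhere in this log. The helper names are hypothetical, and which client setting the cleanup stage would need is not shown in the log. In boto3, for example, the comparable switch is the `addressing_style` option of the S3 client config.

```python
from urllib.parse import urlsplit, urlunsplit


def path_style_url(endpoint, bucket, key=""):
    """Bucket in the path: https://host/bucket/key (the form being rejected)."""
    parts = urlsplit(endpoint)
    return urlunsplit((parts.scheme, parts.netloc, f"/{bucket}/{key}", "", ""))


def virtual_hosted_url(endpoint, bucket, key=""):
    """Bucket in the hostname: https://bucket.host/key (the form requested)."""
    parts = urlsplit(endpoint)
    return urlunsplit((parts.scheme, f"{bucket}.{parts.netloc}", f"/{key}", "", ""))


ENDPOINT = "https://cwobject.com"   # AWS_ENDPOINT_URL from the deployment spec
BUCKET = "guanaco-vllm-models"      # bucket created by the uploader stage

print(path_style_url(ENDPOINT, BUCKET))
print(virtual_hosted_url(ENDPOINT, BUCKET))
```

The inference container in the same log already sets RUNAI_STREAMER_S3_USE_VIRTUAL_ADDRESSING to 1 for model loading, so the model cleanup path may simply be missing the equivalent setting.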
run pipeline stage %s
chaiml-glm-air-4-5-sft-lr1-e3_v1 status is now failed due to DeploymentManager action
Running pipeline stage VLLMDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage VLLMDeleter completed in 0.58s
run pipeline stage %s
Running pipeline stage VLLMModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
%s, retrying in %s seconds...
Cleaning model data from S3
Cleaning model data from model cache
%s, retrying in %s seconds...
Cleaning model data from S3
Cleaning model data from model cache
Shutdown handler de-registered
chaiml-glm-air-4-5-sft-lr1-e2_v1 status is now failed due to DeploymentManager action
chaiml-glm-air-4-5-sft-lr1-e2_v1 status is now torndown due to DeploymentManager action