submission_id: zai-org-glm-4-7-fp8_v4
developer_uid: richhx
status: torndown
model_repo: zai-org/GLM-4.7-FP8
generation_params: {'temperature': 1.0, 'top_p': 0.95, 'min_p': 0.05, 'top_k': 60, 'presence_penalty': 0.1, 'frequency_penalty': 0.0, 'stopping_words': ['</s>', 'You:', '<|im_end|>', '<|im_start|>', '###'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': '[gMASK]<sop><|system|>\nRespond as a high quality storyteller.<|user|>\n', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '<|assistant|>\n<think></think>\n{bot_name}:', 'truncate_by_message': False}
timestamp: 2026-04-14T19:16:09+00:00
model_name: zai-org-glm-4-7-fp8_v4
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name zai-org-glm-4-7-fp8-v4-uploader
Waiting for job on zai-org-glm-4-7-fp8-v4-uploader to finish
zai-org-glm-4-7-fp8-v4-uploader: zai-org/GLM-4.7-FP8 is already quantized
zai-org-glm-4-7-fp8-v4-uploader: Using quantization_mode: none
zai-org-glm-4-7-fp8-v4-uploader: Downloading snapshot of zai-org/GLM-4.7-FP8...
2026-04-14T18:29:06.938076+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:30:07.122497+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:31:07.392064+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
zai-org-glm-4-7-fp8-v4-uploader: Downloaded in 157.749s
zai-org-glm-4-7-fp8-v4-uploader: Processed model zai-org/GLM-4.7-FP8 in 157.889s
zai-org-glm-4-7-fp8-v4-uploader: creating bucket guanaco-vllm-models
zai-org-glm-4-7-fp8-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
zai-org-glm-4-7-fp8-v4-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
zai-org-glm-4-7-fp8-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
zai-org-glm-4-7-fp8-v4-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
zai-org-glm-4-7-fp8-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
zai-org-glm-4-7-fp8-v4-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
zai-org-glm-4-7-fp8-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
zai-org-glm-4-7-fp8-v4-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
zai-org-glm-4-7-fp8-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
zai-org-glm-4-7-fp8-v4-uploader: if re.search("-\.", bucket, re.UNICODE):
zai-org-glm-4-7-fp8-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
zai-org-glm-4-7-fp8-v4-uploader: if re.search("\.\.", bucket, re.UNICODE):
zai-org-glm-4-7-fp8-v4-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
zai-org-glm-4-7-fp8-v4-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
zai-org-glm-4-7-fp8-v4-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
zai-org-glm-4-7-fp8-v4-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
zai-org-glm-4-7-fp8-v4-uploader: Bucket 's3://guanaco-vllm-models/' created
zai-org-glm-4-7-fp8-v4-uploader: uploading /tmp/model_output to s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default
2026-04-14T18:32:07.492095+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/config.json s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/config.json
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/README.md s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/README.md
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/chat_template.jinja s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/chat_template.jinja
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/generation_config.json s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/generation_config.json
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/.gitattributes s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/.gitattributes
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model.safetensors.index.json s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model.safetensors.index.json
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/tokenizer.json s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/tokenizer.json
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/tokenizer_config.json s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/tokenizer_config.json
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00003-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00003-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00002-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00002-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00001-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00001-of-00092.safetensors
2026-04-14T18:33:07.609307+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00084-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00084-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00028-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00028-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00043-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00043-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00051-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00051-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00044-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00044-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00086-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00086-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00033-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00033-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00056-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00056-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00009-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00009-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00082-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00082-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00081-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00081-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00045-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00045-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00065-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00065-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00014-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00014-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00017-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00017-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00077-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00077-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00030-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00030-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00071-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00071-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00015-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00015-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00062-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00062-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00078-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00078-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00006-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00006-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00052-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00052-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00021-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00021-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00041-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00041-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00038-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00038-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00018-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00018-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00090-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00090-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00070-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00070-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00047-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00047-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00036-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00036-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00061-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00061-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00011-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00011-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00031-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00031-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00010-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00010-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00004-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00004-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00055-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00055-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00075-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00075-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00083-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00083-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00013-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00013-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00067-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00067-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00072-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00072-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00085-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00085-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00042-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00042-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00073-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00073-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00032-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00032-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00063-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00063-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00037-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00037-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00027-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00027-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00074-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00074-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00007-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00007-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00040-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00040-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00039-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00039-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00059-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00059-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00089-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00089-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00064-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00064-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00079-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00079-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00020-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00020-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00005-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00005-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00026-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00026-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00054-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00054-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00048-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00048-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00088-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00088-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00029-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00029-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00022-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00022-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00025-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00025-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00046-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00046-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00016-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00016-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00050-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00050-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00066-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00066-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00057-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00057-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00091-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00091-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00019-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00019-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00035-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00035-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00069-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00069-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00076-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00076-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00012-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00012-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00034-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00034-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00087-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00087-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00024-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00024-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00060-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00060-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00008-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00008-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00058-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00058-of-00092.safetensors
zai-org-glm-4-7-fp8-v4-uploader: cp /tmp/model_output/model-00080-of-00092.safetensors s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default/model-00080-of-00092.safetensors
Job zai-org-glm-4-7-fp8-v4-uploader completed after 353.04s with status: succeeded
Stopping job with name zai-org-glm-4-7-fp8-v4-uploader
Pipeline stage VLLMUploader completed in 353.63s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.13s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.81s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service zai-org-glm-4-7-fp8-v4
Waiting for inference service zai-org-glm-4-7-fp8-v4 to be ready
2026-04-14T18:34:07.718917+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:35:07.843217+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:36:07.978275+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:37:08.120179+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:38:08.257748+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:39:08.378397+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:40:08.907282+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:41:09.090896+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:42:09.203604+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:43:09.315863+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:44:09.425368+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:45:09.538250+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:46:09.938302+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:47:10.152502+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:48:10.286724+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:49:10.457917+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:50:10.553036+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:51:10.663068+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:52:10.775000+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:53:10.897862+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:54:11.009558+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:55:11.109673+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:56:11.208000+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:57:11.314755+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:58:11.419379+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T18:59:11.524344+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:00:11.624565+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:01:11.729616+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:02:11.830778+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:03:11.928734+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:04:12.031865+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:05:12.132745+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:06:12.230930+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:07:12.343580+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:08:12.444875+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:09:12.545423+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:10:12.666583+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:11:12.779151+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:12:12.877769+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
2026-04-14T19:13:12.973103+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
Tearing down inference service zai-org-glm-4-7-fp8-v4
clean up pipeline due to error=DeploymentError('Timeout to start the InferenceService zai-org-glm-4-7-fp8-v4. The InferenceService is as following: {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'kind\': \'InferenceService\', \'metadata\': {\'annotations\': {\'autoscaling.knative.dev/class\': \'hpa.autoscaling.knative.dev\', \'autoscaling.knative.dev/container-concurrency-target-percentage\': \'70\', \'autoscaling.knative.dev/initial-scale\': \'5\', \'autoscaling.knative.dev/max-scale-down-rate\': \'1.1\', \'autoscaling.knative.dev/max-scale-up-rate\': \'2\', \'autoscaling.knative.dev/metric\': \'mean_pod_latency_ms_v2\', \'autoscaling.knative.dev/panic-threshold-percentage\': \'650\', \'autoscaling.knative.dev/panic-window-percentage\': \'35\', \'autoscaling.knative.dev/scale-down-delay\': \'30s\', \'autoscaling.knative.dev/scale-to-zero-grace-period\': \'10m\', \'autoscaling.knative.dev/stable-window\': \'180s\', \'autoscaling.knative.dev/target\': \'4000\', \'autoscaling.knative.dev/target-burst-capacity\': \'-1\', \'autoscaling.knative.dev/tick-interval\': \'15s\', \'features.knative.dev/http-full-duplex\': \'Enabled\', \'networking.knative.dev/ingress-class\': \'istio.ingress.networking.knative.dev\', \'serving.knative.dev/progress-deadline\': \'40m\'}, \'creationTimestamp\': \'2026-04-14T18:34:03Z\', \'finalizers\': [\'inferenceservice.finalizers\'], \'generation\': 1, \'labels\': {\'istio.io/rev\': \'prod-canary\', \'knative.coreweave.cloud/ingress\': \'istio.ingress.networking.knative.dev\', \'prometheus.k.chaiverse.com\': \'true\', \'qos.coreweave.cloud/latency\': \'low\'}, \'managedFields\': [{\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:annotations\': {\'.\': {}, \'f:autoscaling.knative.dev/class\': {}, \'f:autoscaling.knative.dev/container-concurrency-target-percentage\': {}, \'f:autoscaling.knative.dev/initial-scale\': {}, \'f:autoscaling.knative.dev/max-scale-down-rate\': {}, 
\'f:autoscaling.knative.dev/max-scale-up-rate\': {}, \'f:autoscaling.knative.dev/metric\': {}, \'f:autoscaling.knative.dev/panic-threshold-percentage\': {}, \'f:autoscaling.knative.dev/panic-window-percentage\': {}, \'f:autoscaling.knative.dev/scale-down-delay\': {}, \'f:autoscaling.knative.dev/scale-to-zero-grace-period\': {}, \'f:autoscaling.knative.dev/stable-window\': {}, \'f:autoscaling.knative.dev/target\': {}, \'f:autoscaling.knative.dev/target-burst-capacity\': {}, \'f:autoscaling.knative.dev/tick-interval\': {}, \'f:features.knative.dev/http-full-duplex\': {}, \'f:networking.knative.dev/ingress-class\': {}, \'f:serving.knative.dev/progress-deadline\': {}}, \'f:labels\': {\'.\': {}, \'f:istio.io/rev\': {}, \'f:knative.coreweave.cloud/ingress\': {}, \'f:prometheus.k.chaiverse.com\': {}, \'f:qos.coreweave.cloud/latency\': {}}}, \'f:spec\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:affinity\': {\'.\': {}, \'f:nodeAffinity\': {\'.\': {}, \'f:tion\': {}, \'f:requiredDuringSchedulingIgnoredDuringExecution\': {}}, \'f:podAffinity\': {\'.\': {}, \'f:tion\': {}}}, \'f:containerConcurrency\': {}, \'f:containers\': {}, \'f:imagePullSecrets\': {}, \'f:maxReplicas\': {}, \'f:minReplicas\': {}, \'f:priorityClassName\': {}, \'f:timeout\': {}, \'f:volumes\': {}}}}, \'manager\': \'OpenAPI-Generator\', \'operation\': \'Update\', \'time\': \'2026-04-14T18:34:03Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:metadata\': {\'f:finalizers\': {\'.\': {}, \'v:"inferenceservice.finalizers"\': {}}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'time\': \'2026-04-14T18:34:03Z\'}, {\'apiVersion\': \'serving.kserve.io/v1beta1\', \'fieldsType\': \'FieldsV1\', \'fieldsV1\': {\'f:status\': {\'.\': {}, \'f:components\': {\'.\': {}, \'f:predictor\': {\'.\': {}, \'f:latestCreatedRevision\': {}}}, \'f:conditions\': {}, \'f:modelStatus\': {\'.\': {}, \'f:lastFailureInfo\': {\'.\': {}, \'f:exitCode\': {}, \'f:message\': {}, 
\'f:reason\': {}}, \'f:states\': {\'.\': {}, \'f:activeModelState\': {}, \'f:targetModelState\': {}}, \'f:transitionStatus\': {}}, \'f:observedGeneration\': {}}}, \'manager\': \'manager\', \'operation\': \'Update\', \'subresource\': \'status\', \'time\': \'2026-04-14T19:14:08Z\'}], \'name\': \'zai-org-glm-4-7-fp8-v4\', \'namespace\': \'tenant-chaiml-guanaco\', \'resourceVersion\': \'1354276786\', \'uid\': \'a94b57c3-abc1-489b-985a-bf44f5c492b1\'}, \'spec\': {\'predictor\': {\'affinity\': {\'nodeAffinity\': {\'tion\': [{\'preference\': {\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'A100_NVLINK_80GB\']}]}, \'weight\': 5}], \'requiredDuringSchedulingIgnoredDuringExecution\': {\'nodeSelectorTerms\': [{\'matchExpressions\': [{\'key\': \'gpu.nvidia.com/class\', \'operator\': \'In\', \'values\': [\'A100_NVLINK_80GB\']}]}]}}, \'podAffinity\': {\'tion\': [{\'podAffinityTerm\': {\'labelSelector\': {\'matchLabels\': {\'serving.kserve.io/inferenceservice\': \'zai-org-glm-4-7-fp8-v4\'}}, \'topologyKey\': \'kubernetes.io/hostname\'}, \'weight\': 100}]}}, \'containerConcurrency\': 0, \'containers\': [{\'args\': [\'serve\', \'s3://guanaco-vllm-models/zai-org-glm-4-7-fp8-v4/default\', \'--port\', \'8080\', \'--tensor-parallel-size\', \'8\', \'--max-model-len\', \'10240\', \'--max-num-batched-tokens\', \'10240\', \'--max-num-seqs\', \'64\', \'--gpu-memory-utilization\', \'0.9\', \'--trust-remote-code\', \'--load-format\', \'runai_streamer\', \'--served-model-name\', \'zai-org/GLM-4.7-FP8\', \'--model-loader-extra-config\', \'{"distributed": true, "concurrency": 2}\'], \'env\': [{\'name\': \'RESERVE_MEMORY\', \'value\': \'2048\'}, {\'name\': \'DOWNLOAD_TO_LOCAL\', \'value\': \'/dev/shm/model_cache\'}, {\'name\': \'NUM_GPUS\', \'value\': \'8\'}, {\'name\': \'VLLM_ASSETS_CACHE\', \'value\': \'/code/vllm_assets_cache\'}, {\'name\': \'RUNAI_STREAMER_S3_USE_VIRTUAL_ADDRESSING\', \'value\': \'1\'}, {\'name\': \'RUNAI_STREAMER_CONCURRENCY\', 
\'value\': \'1\'}, {\'name\': \'AWS_EC2_METADATA_DISABLED\', \'value\': \'true\'}, {\'name\': \'AWS_ACCESS_KEY_ID\', \'value\': \'CWZAGMHZXKZRFGJK\'}, {\'name\': \'AWS_SECRET_ACCESS_KEY\', \'value\': \'cwoAeWzp46q4O0sTNXOEuZ1MvZzKEFlS9DtEhnTldKp\'}, {\'name\': \'AWS_ENDPOINT_URL\', \'value\': \'https://cwobject.com\'}, {\'name\': \'HF_TOKEN\', \'valueFrom\': {\'secretKeyRef\': {\'key\': \'token\', \'name\': \'hf-token\'}}}, {\'name\': \'RUNAI_STREAMER_S3_REQUEST_TIMEOUT_MS\', \'value\': \'30000\'}, {\'name\': \'VLLM_ROCM_USE_AITER\', \'value\': \'1\'}, {\'name\': \'VLLM_ROCM_USE_AITER_MOE\', \'value\': \'1\'}, {\'name\': \'VLLM_USE_TRITON_FLASH_ATTN\', \'value\': \'0\'}], \'image\': \'gcr.io/chai-959f8/vllm:v0.17.1.transformers-5.3.0-dsa_patch\', \'imagePullPolicy\': \'IfNotPresent\', \'name\': \'kserve-container\', \'readinessProbe\': {\'failureThreshold\': 1, \'httpGet\': {\'path\': \'/v1/models\', \'port\': 8080}, \'initialDelaySeconds\': 60, \'periodSeconds\': 10, \'successThreshold\': 1, \'timeoutSeconds\': 5}, \'resources\': {\'limits\': {\'cpu\': \'16\', \'memory\': \'394Gi\', \'nvidia.com/gpu\': \'8\'}, \'requests\': {\'cpu\': \'16\', \'memory\': \'394Gi\', \'nvidia.com/gpu\': \'8\'}}, \'volumeMounts\': [{\'mountPath\': \'/dev/shm\', \'name\': \'shared-memory-cache\'}]}], \'imagePullSecrets\': [{\'name\': \'docker-creds\'}], \'maxReplicas\': 5, \'minReplicas\': 0, \'priorityClassName\': \'chaiverse\', \'timeout\': 20, \'volumes\': [{\'emptyDir\': {\'medium\': \'Memory\', \'sizeLimit\': \'394Gi\'}, \'name\': \'shared-memory-cache\'}]}}, \'status\': {\'components\': {\'predictor\': {\'latestCreatedRevision\': \'zai-org-glm-4-7-fp8-v4-predictor-00001\'}}, \'conditions\': [{\'lastTransitionTime\': \'2026-04-14T18:34:21Z\', \'reason\': \'PredictorConfigurationReady not ready\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'LatestDeploymentReady\'}, {\'lastTransitionTime\': \'2026-04-14T19:14:08Z\', \'message\': \'Revision 
"zai-org-glm-4-7-fp8-v4-predictor-00001" failed with message: Container failed with: m.py", line 154, in __init__\\n(APIServer pid=1) self.engine_core = EngineCoreClient.make_async_mp_client(\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper\\n(APIServer pid=1) return func(*args, **kwargs)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 127, in make_async_mp_client\\n(APIServer pid=1) return AsyncMPClient(*client_args)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper\\n(APIServer pid=1) return func(*args, **kwargs)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 911, in __init__\\n(APIServer pid=1) super().__init__(\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 569, in __init__\\n(APIServer pid=1) with launch_core_engines(\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__\\n(APIServer pid=1) next(self.gen)\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 951, in launch_core_engines\\n(APIServer pid=1) wait_for_engine_startup(\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1010, in wait_for_engine_startup\\n(APIServer pid=1) raise RuntimeError(\\n(APIServer pid=1) RuntimeError: Engine core initialization failed. See root cause above. 
Failed core proc(s): {}\\n/usr/lib/python3.12/multiprocessing/resource_tracker.py:279: UserWarning: resource_tracker: There appear to be 9 leaked shared_memory objects to clean up at shutdown\\n warnings.warn(\\\'resource_tracker: There appear to be %d \\\'\\n.\', \'reason\': \'RevisionFailed\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'PredictorConfigurationReady\'}, {\'lastTransitionTime\': \'2026-04-14T18:34:21Z\', \'message\': \'Configuration "zai-org-glm-4-7-fp8-v4-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'status\': \'False\', \'type\': \'PredictorReady\'}, {\'lastTransitionTime\': \'2026-04-14T18:34:21Z\', \'message\': \'Configuration "zai-org-glm-4-7-fp8-v4-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'PredictorRouteReady\'}, {\'lastTransitionTime\': \'2026-04-14T18:34:21Z\', \'message\': \'Configuration "zai-org-glm-4-7-fp8-v4-predictor" does not have any ready Revision.\', \'reason\': \'RevisionMissing\', \'status\': \'False\', \'type\': \'Ready\'}, {\'lastTransitionTime\': \'2026-04-14T18:34:21Z\', \'reason\': \'PredictorRouteReady not ready\', \'severity\': \'Info\', \'status\': \'False\', \'type\': \'RoutesReady\'}], \'modelStatus\': {\'lastFailureInfo\': {\'exitCode\': 1, \'message\': \'m.py", line 154, in __init__\\n(APIServer pid=1) self.engine_core = EngineCoreClient.make_async_mp_client(\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper\\n(APIServer pid=1) return func(*args, **kwargs)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 127, in make_async_mp_client\\n(APIServer pid=1) return AsyncMPClient(*client_args)\\n(APIServer pid=1) 
^^^^^^^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/tracing/otel.py", line 178, in sync_wrapper\\n(APIServer pid=1) return func(*args, **kwargs)\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 911, in __init__\\n(APIServer pid=1) super().__init__(\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/core_client.py", line 569, in __init__\\n(APIServer pid=1) with launch_core_engines(\\n(APIServer pid=1) ^^^^^^^^^^^^^^^^^^^^\\n(APIServer pid=1) File "/usr/lib/python3.12/contextlib.py", line 144, in __exit__\\n(APIServer pid=1) next(self.gen)\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 951, in launch_core_engines\\n(APIServer pid=1) wait_for_engine_startup(\\n(APIServer pid=1) File "/usr/local/lib/python3.12/dist-packages/vllm/v1/engine/utils.py", line 1010, in wait_for_engine_startup\\n(APIServer pid=1) raise RuntimeError(\\n(APIServer pid=1) RuntimeError: Engine core initialization failed. See root cause above. Failed core proc(s): {}\\n/usr/lib/python3.12/multiprocessing/resource_tracker.py:279: UserWarning: resource_tracker: There appear to be 9 leaked shared_memory objects to clean up at shutdown\\n warnings.warn(\\\'resource_tracker: There appear to be %d \\\'\\n\', \'reason\': \'ModelLoadFailed\'}, \'states\': {\'activeModelState\': \'\', \'targetModelState\': \'FailedToLoad\'}, \'transitionStatus\': \'BlockedByFailedLoad\'}, \'observedGeneration\': 1}}')
Running pipeline stage VLLMDeleter
2026-04-14T19:14:13.080620+00:00 monitor updated for zai-org-glm-4-7-fp8_v4
Checking if service zai-org-glm-4-7-fp8-v4 is running
Skipping teardown as no inference service was found
Pipeline stage VLLMDeleter completed in 0.78s
Running pipeline stage VLLMModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key zai-org-glm-4-7-fp8-v4/default/.gitattributes from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/README.md from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/chat_template.jinja from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/config.json from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/generation_config.json from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00001-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00002-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00003-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00004-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00005-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00006-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00007-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00008-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00009-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00010-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00011-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00012-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00013-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00014-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00015-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00016-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00017-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00018-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00019-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00020-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00021-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00022-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00023-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00024-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00025-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00026-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00027-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00028-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00029-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00030-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00031-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00032-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00033-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00034-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00035-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00036-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00037-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00038-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00039-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00040-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00041-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00042-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00043-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00044-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00045-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00046-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00047-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00048-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00049-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00050-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00051-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00052-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00053-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00054-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00055-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00056-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00057-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00058-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00059-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00060-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00061-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00062-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00063-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00064-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00065-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00066-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00067-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00068-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00069-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00070-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00071-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00072-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00073-of-00092.safetensors from bucket guanaco-vllm-models
Unable to record family friendly update due to error: HTTPConnectionPool(host='chaiml-nemo-guard-merged-v3-predictor.tenant-chaiml-guanaco.k2.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00074-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00075-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00076-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00077-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00078-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00079-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00080-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00081-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00082-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00083-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00084-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00085-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00086-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00087-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00088-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00089-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00090-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00091-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model-00092-of-00092.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/model.safetensors.index.json from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/mtp.safetensors from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/tokenizer.json from bucket guanaco-vllm-models
Deleting key zai-org-glm-4-7-fp8-v4/default/tokenizer_config.json from bucket guanaco-vllm-models
Pipeline stage VLLMModelDeleter completed in 48.50s
Shutdown handler de-registered
zai-org-glm-4-7-fp8_v4 status is now failed due to DeploymentManager action
zai-org-glm-4-7-fp8_v4 status is now torndown due to DeploymentManager action