developer_uid: richhx
submission_id: chaiml-kaniwara-japan-fu_433_v10
model_name: chaiml-kaniwara-japan-fu_433_v10
model_group: ChaiML/Kaniwara-Japan-Fu
status: torndown
timestamp: 2026-04-16T22:32:19+00:00
num_battles: 12143
num_wins: 5357
celo_rating: 0.0
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/Kaniwara-Japan-Future-RPG260223150837_sft
model_architecture: MistralForCausalLM
model_num_parameters: 24096691200.0
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
reward_model: default
display_name: chaiml-kaniwara-japan-fu_433_v10
is_internal_developer: True
language_model: ChaiML/Kaniwara-Japan-Future-RPG260223150837_sft
model_size: 24B
ranking_group: single
us_pacific_date: 2026-04-13
win_ratio: 0.4411595157704027
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '####', '####\n', 'You:', '</s>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}</s>\n', 'user_template': 'You: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': True}
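The `formatter` entry above can be read as a set of per-turn string templates. The following is a minimal sketch of how such templates might be applied to a conversation, not the platform's actual implementation. It assumes turns are rendered in order, the `response_template` is appended last, and `truncate_by_message` means whole oldest messages are dropped until the prompt fits `max_input_tokens`. The names `build_prompt` and `count_tokens` are hypothetical, and the whitespace token counter is only a stand-in for the real tokenizer. The empty `memory_template` and `prompt_template` contribute nothing and are left out.

```python
# Template values copied from the formatter metadata line.
FORMATTER = {
    "bot_template": "{bot_name}: {message}</s>\n",
    "user_template": "You: {message}\n",
    "response_template": "{bot_name}:",
    "truncate_by_message": True,
}


def count_tokens(text):
    # Hypothetical stand-in for the model tokenizer: counts whitespace-separated pieces.
    return len(text.split())


def build_prompt(bot_name, turns, max_input_tokens=1024, formatter=FORMATTER):
    """Render (speaker, message) turns, where speaker is "bot" or "user"."""
    rendered = []
    for speaker, message in turns:
        key = "bot_template" if speaker == "bot" else "user_template"
        rendered.append(formatter[key].format(bot_name=bot_name, message=message))
    tail = formatter["response_template"].format(bot_name=bot_name)
    if formatter["truncate_by_message"]:
        # Assumed behaviour: drop whole messages from the oldest end until the prompt fits.
        while rendered and count_tokens("".join(rendered) + tail) > max_input_tokens:
            rendered.pop(0)
    return "".join(rendered) + tail


if __name__ == "__main__":
    print(build_prompt("Aiko", [("user", "Hi"), ("bot", "Hello!")]))
```

Under these assumptions the prompt always ends with `{bot_name}:`, so the model continues in the bot's voice, and the `'\n'` and `'You:'` entries in `stopping_words` end generation at the close of that single reply.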
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-kaniwara-japan-fu-433-v10-uploader
Waiting for job on chaiml-kaniwara-japan-fu-433-v10-uploader to finish
2026-04-13T21:04:02.818857+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
2026-04-13T21:04:42.311616+00:00 monitor updated for chaiml-kaniwara-japan-fu_433_v10
2026-04-13T21:05:03.147164+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
chaiml-kaniwara-japan-fu-433-v10-uploader: Using quantization_mode: fp8
chaiml-kaniwara-japan-fu-433-v10-uploader: Checking if ChaiML/Kaniwara-Japan-Future-RPG260223150837_sft-FP8 already exists in ChaiML
chaiml-kaniwara-japan-fu-433-v10-uploader: Downloading snapshot of ChaiML/Kaniwara-Japan-Future-RPG260223150837_sft...
2026-04-13T21:05:42.624571+00:00 monitor updated for chaiml-kaniwara-japan-fu_433_v10
chaiml-pony-v2-g46-lr1-80834-v33-uploader: Downloaded in 154.513s
chaiml-pony-v2-g46-lr1-80834-v33-uploader: Processed model ChaiML/pony-v2-g46-lr1e4ep1r64b16 in 158.335s
chaiml-pony-v2-g46-lr1-80834-v33-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v2-g46-lr1-80834-v33-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-g46-lr1-80834-v33-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v2-g46-lr1-80834-v33-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v2-g46-lr1-80834-v33-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v2-g46-lr1-80834-v33-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-g46-lr1-80834-v33-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v2-g46-lr1-80834-v33-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-g46-lr1-80834-v33-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v2-g46-lr1-80834-v33-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-g46-lr1-80834-v33-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v2-g46-lr1-80834-v33-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v2-g46-lr1-80834-v33-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v2-g46-lr1-80834-v33-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v2-g46-lr1-80834-v33-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v2-g46-lr1-80834-v33-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-kaniwara-japan-fu-433-v10-uploader: Downloaded in 25.546s
chaiml-pony-v2-g46-lr1-80834-v33-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-kaniwara-japan-fu-433-v10-uploader: Loading /tmp/model_input...
chaiml-pony-v2-g46-lr1-80834-v33-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-kaniwara-japan-fu-433-v10-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-pony-v2-g46-lr1-80834-v33-uploader: uploading /tmp/model_output to s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default
chaiml-kaniwara-japan-fu-433-v10-uploader: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-kaniwara-japan-fu-433-v10-uploader: Applying quantization...
chaiml-kaniwara-japan-fu-433-v10-uploader: 2026-04-13T21:05:59.0449 | __init__ | WARNING - Disabling tokenizer parallelism due to threading conflict between FastTokenizer and Datasets. Set TOKENIZERS_PARALLELISM=false to suppress this warning.
chaiml-kaniwara-japan-fu-433-v10-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-kaniwara-japan-fu-433-v10-uploader: 2026-04-13T21:05:59.4084 | reset | INFO - Compression lifecycle reset
chaiml-kaniwara-japan-fu-433-v10-uploader: 2026-04-13T21:05:59.4100 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-kaniwara-japan-fu-433-v10-uploader: 2026-04-13T21:05:59.4378 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-kaniwara-japan-fu-433-v10-uploader: 2026-04-13T21:05:59.4380 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-kaniwara-japan-fu-433-v10-uploader: 2026-04-13T21:05:59.4460 | dispatch_model | WARNING - Forced to offload modules due to insufficient gpu resources
2026-04-13T21:06:03.469957+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/config.json
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/generation_config.json
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/chat_template.jinja
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/.gitattributes
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/tokenizer_config.json
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/recipe.yaml
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/special_tokens_map.json
chaiml-kaniwara-japan-fu-433-v10-uploader: 2026-04-13T21:06:06.3487 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-kaniwara-japan-fu-433-v10-uploader: 2026-04-13T21:06:06.3489 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-kaniwara-japan-fu-433-v10-uploader: Saving to /tmp/model_output...
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py:3970: UserWarning: Attempting to save a model with offloaded modules. Ensure that unallocated cpu memory exceeds the `shard_size` (5GB default)
chaiml-kaniwara-japan-fu-433-v10-uploader: warnings.warn(
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00072-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00072-of-00072.safetensors
2026-04-13T21:06:43.442955+00:00 monitor updated for chaiml-kaniwara-japan-fu_433_v10
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model.safetensors.index.json
chaiml-kaniwara-japan-fu-433-v10-uploader: Updating config in /tmp/model_output
chaiml-kaniwara-japan-fu-433-v10-uploader: Pushing to ChaiML/Kaniwara-Japan-Future-RPG260223150837_sft-FP8
chaiml-kaniwara-japan-fu-433-v10-uploader: Checking if ChaiML/Kaniwara-Japan-Future-RPG260223150837_sft-FP8 already exists in ChaiML
chaiml-kaniwara-japan-fu-433-v10-uploader: Creating repo ChaiML/Kaniwara-Japan-Future-RPG260223150837_sft-FP8 and uploading /tmp/model_output to it
chaiml-kaniwara-japan-fu-433-v10-uploader: ---------- 2026-04-13 21:06:40 (0:00:00) ----------
chaiml-kaniwara-japan-fu-433-v10-uploader: Files: hashed 6/13 (276.3K/24.9G) | pre-uploaded: 0/0 (0.0/24.9G) (+13 unsure) | committed: 0/13 (0.0/24.9G) | ignored: 0
chaiml-kaniwara-japan-fu-433-v10-uploader: Workers: hashing: 7 | get upload mode: 6 | pre-uploading: 0 | committing: 0 | waiting: 113
chaiml-kaniwara-japan-fu-433-v10-uploader: ---------------------------------------------------
2026-04-13T21:07:03.772510+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00027-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00027-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00049-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00049-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00062-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00062-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00054-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00054-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00048-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00048-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00017-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00017-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00045-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00045-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00022-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00022-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00068-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00068-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00028-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00028-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00059-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00059-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00033-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00033-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00046-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00046-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00029-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00029-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00052-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00052-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00064-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00064-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00014-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00014-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00011-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00011-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00018-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00018-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00010-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00010-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00003-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00003-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00004-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00004-of-00072.safetensors
chaiml-kaniwara-japan-fu-433-v10-uploader: Processed model ChaiML/Kaniwara-Japan-Future-RPG260223150837_sft in 115.833s
chaiml-kaniwara-japan-fu-433-v10-uploader: creating bucket guanaco-vllm-models
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-kaniwara-japan-fu-433-v10-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-kaniwara-japan-fu-433-v10-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-kaniwara-japan-fu-433-v10-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-kaniwara-japan-fu-433-v10-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-kaniwara-japan-fu-433-v10-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-kaniwara-japan-fu-433-v10-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-kaniwara-japan-fu-433-v10-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-kaniwara-japan-fu-433-v10-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-kaniwara-japan-fu-433-v10-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-kaniwara-japan-fu-433-v10-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-kaniwara-japan-fu-433-v10-uploader: uploading /tmp/model_output to s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default
chaiml-kaniwara-japan-fu-433-v10-uploader: cp /tmp/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default/model.safetensors.index.json
chaiml-kaniwara-japan-fu-433-v10-uploader: cp /tmp/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default/recipe.yaml
chaiml-kaniwara-japan-fu-433-v10-uploader: cp /tmp/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default/special_tokens_map.json
chaiml-kaniwara-japan-fu-433-v10-uploader: cp /tmp/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default/generation_config.json
chaiml-kaniwara-japan-fu-433-v10-uploader: cp /tmp/model_output/config.json s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default/config.json
chaiml-kaniwara-japan-fu-433-v10-uploader: cp /tmp/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default/tokenizer_config.json
chaiml-kaniwara-japan-fu-433-v10-uploader: cp /tmp/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default/tokenizer.json
chaiml-kaniwara-japan-fu-433-v10-uploader: cp /tmp/model_output/model-00006-of-00006.safetensors s3://guanaco-vllm-models/chaiml-kaniwara-japan-fu-433-v10/default/model-00006-of-00006.safetensors
2026-04-13T21:07:43.761415+00:00 monitor updated for chaiml-kaniwara-japan-fu_433_v10
Job chaiml-kaniwara-japan-fu-433-v10-uploader completed after 242.05s with status: succeeded
Stopping job with name chaiml-kaniwara-japan-fu-433-v10-uploader
Pipeline stage VLLMUploader completed in 243.77s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.46s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.24s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-kaniwara-japan-fu-433-v10
Waiting for inference service chaiml-kaniwara-japan-fu-433-v10 to be ready
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00023-of-00072.safetensors?partNumber=2&uploadId=0ffd0760-2fe8-49e4-897d-8926ef46bf6d": write tcp 10.0.23.239:45446->166.19.18.1:443: use of closed network connection
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00026-of-00072.safetensors?partNumber=4&uploadId=32721d7b-b251-401c-b7b3-35df11a85fa4": write tcp 10.0.23.239:45416->166.19.18.1:443: use of closed network connection
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00026-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00026-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
Shutdown handler not registered because Python interpreter is not running in the main thread
2026-04-13T21:08:04.082755+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Post "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00065-of-00072.safetensors?uploads=": EOF
run pipeline %s
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
run pipeline stage %s
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00030-of-00072.safetensors?partNumber=3&uploadId=f1726196-0b01-41e2-ac43-1f38d46dc40f": write tcp 10.0.23.239:45542->166.19.18.1:443: use of closed network connection
Running pipeline stage VLLMUploader
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00044-of-00072.safetensors?partNumber=1&uploadId=ce5a8987-e366-486f-9024-2e3fa156c825": write tcp 10.0.23.239:45510->166.19.18.1:443: use of closed network connection
Starting job with name qwen-qwen3-235b-a22b-i-47730-v24-uploader
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00044-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00044-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
Waiting for job on qwen-qwen3-235b-a22b-i-47730-v24-uploader to finish
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00038-of-00072.safetensors?partNumber=1&uploadId=6143aebd-b541-4109-a96a-1c718cfdfeda": write tcp 10.0.23.239:45390->166.19.18.1:443: write: connection timed out
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00019-of-00072.safetensors?partNumber=2&uploadId=0bd4ff93-a00a-4690-9d8e-c7ed145288f3": write tcp 10.0.23.239:45522->166.19.18.1:443: write: connection timed out
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00030-of-00072.safetensors?partNumber=4&uploadId=f1726196-0b01-41e2-ac43-1f38d46dc40f": write tcp 10.0.23.239:45456->166.19.18.1:443: write: connection timed out
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00040-of-00072.safetensors?partNumber=3&uploadId=b4b3a24b-f477-4a78-9bcb-44d15eed99a6": write tcp 10.0.23.239:45370->166.19.18.1:443: write: broken pipe
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00023-of-00072.safetensors?partNumber=4&uploadId=0ffd0760-2fe8-49e4-897d-8926ef46bf6d": write tcp 10.0.23.239:45448->166.19.18.1:443: write: broken pipe
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00043-of-00072.safetensors?partNumber=2&uploadId=f7ad4c1e-caac-4d34-bda6-e298241c91b6": write tcp 10.0.23.239:45468->166.19.18.1:443: write: broken pipe
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/tokenizer.json": write tcp 10.0.23.239:45398->166.19.18.1:443: write: broken pipe
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00042-of-00072.safetensors?partNumber=3&uploadId=9791b8e9-2fc1-440c-96a3-ad9e819482ad": write tcp 10.0.23.239:45342->166.19.18.1:443: write: connection timed out
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00024-of-00072.safetensors?partNumber=2&uploadId=1fb6b34f-fb9a-400b-affb-dd078847b145": write tcp 10.0.23.239:45374->166.19.18.1:443: write: broken pipe
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00041-of-00072.safetensors?partNumber=2&uploadId=60a04136-258e-4be0-adc3-2553a8f86e5a": write tcp 10.0.23.239:45558->166.19.18.1:443: write: connection timed out
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/tokenizer.json
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00042-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00042-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00030-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00030-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00043-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00043-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00020-of-00072.safetensors?partNumber=1&uploadId=8010bb20-eaeb-40f2-bad4-8956551aca55": write tcp 10.0.23.239:45410->166.19.18.1:443: write: connection timed out
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00023-of-00072.safetensors?partNumber=1&uploadId=0ffd0760-2fe8-49e4-897d-8926ef46bf6d": write tcp 10.0.23.239:45440->166.19.18.1:443: write: broken pipe
chaiml-pony-v2-g46-lr1-80834-v33-uploader: DEBUG retryable error: RequestError: send request failed
chaiml-pony-v2-g46-lr1-80834-v33-uploader: caused by: Put "https://guanaco-vllm-models.cwobject.com/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00034-of-00072.safetensors?partNumber=3&uploadId=e4dfa805-b833-44e7-9e7b-d4ddfd7e5649": write tcp 10.0.23.239:45520->166.19.18.1:443: write: connection timed out
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00034-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00034-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00020-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00020-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00023-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00023-of-00072.safetensors
chaiml-pony-v2-g46-lr1-80834-v33-uploader: cp /tmp/model_output/model-00065-of-00072.safetensors s3://guanaco-vllm-models/chaiml-pony-v2-g46-lr1-80834-v33/default/model-00065-of-00072.safetensors
Job chaiml-pony-v2-g46-lr1-80834-v33-uploader completed after 395.69s with status: succeeded
Stopping job with name chaiml-pony-v2-g46-lr1-80834-v33-uploader
Pipeline stage VLLMUploader completed in 397.51s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.82s
run pipeline stage %s
Running pipeline stage VLLMTemplater
2026-04-13T21:08:44.100199+00:00 monitor updated for chaiml-kaniwara-japan-fu_433_v10
Pipeline stage VLLMTemplater completed in 1.81s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v2-g46-lr1-80834-v33
Waiting for inference service chaiml-pony-v2-g46-lr1-80834-v33 to be ready
qwen-qwen3-235b-a22b-i-47730-v24-uploader: Using quantization_mode: w4a16
qwen-qwen3-235b-a22b-i-47730-v24-uploader: Checking if ChaiML/Qwen3-235B-A22B-Instruct-2507-W4A16 already exists in ChaiML
qwen-qwen3-235b-a22b-i-47730-v24-uploader: Downloading snapshot of Qwen/Qwen3-235B-A22B-Instruct-2507...
2026-04-13T21:09:03.825873+00:00 monitor updated for qwen-qwen3-235b-a22b-i_47730_v24
2026-04-13T21:09:04.772575+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
2026-04-13T21:09:44.561079+00:00 monitor updated for chaiml-kaniwara-japan-fu_433_v10
2026-04-13T21:10:04.277326+00:00 monitor updated for qwen-qwen3-235b-a22b-i_47730_v24
2026-04-13T21:10:05.216846+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
Inference service chaiml-kaniwara-japan-fu-433-v10 ready after 161.98267555236816s
Pipeline stage VLLMDeployer completed in 164.03s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.661473274230957s
Received healthy response to inference request in 4.862468242645264s
2026-04-13T21:10:45.034270+00:00 monitor updated for chaiml-kaniwara-japan-fu_433_v10
Received healthy response to inference request in 2.8098244667053223s
Received healthy response to inference request in 2.8670241832733154s
Received healthy response to inference request in 2.960700035095215s
Received healthy response to inference request in 2.769190788269043s
Received healthy response to inference request in 3.2361013889312744s
Received healthy response to inference request in 3.0505425930023193s
2026-04-13T21:11:04.986946+00:00 monitor updated for qwen-qwen3-235b-a22b-i_47730_v24
2026-04-13T21:11:05.696612+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
Received healthy response to inference request in 3.415175676345825s
Received healthy response to inference request in 2.916670083999634s
Received healthy response to inference request in 2.792447328567505s
Received healthy response to inference request in 2.8960628509521484s
Received healthy response to inference request in 2.8704681396484375s
Received healthy response to inference request in 2.974917411804199s
qwen-qwen3-235b-a22b-i-47730-v24-uploader: Downloaded in 153.489s
qwen-qwen3-235b-a22b-i-47730-v24-uploader: Applying quantization...
qwen-qwen3-235b-a22b-i-47730-v24-uploader: 2026-04-13 21:11:23 INFO base.py L473: `enable_opt_rtn` is turned on, set `--disable_opt_rtn` for higher speed at the cost of accuracy.
qwen-qwen3-235b-a22b-i-47730-v24-uploader: 2026-04-13 21:11:23 INFO base.py L517: using torch.bfloat16 for quantization tuning
qwen-qwen3-235b-a22b-i-47730-v24-uploader: 2026-04-13 21:11:23 WARNING formats.py L166: some layers are skipped quantization (shape not divisible by 32): 
qwen-qwen3-235b-a22b-i-47730-v24-uploader: 2026-04-13 21:11:23 INFO base.py L1660: Using predefined ignore_layers: model.layers.[0-93].mlp.gate
Received healthy response to inference request in 2.927828073501587s
qwen-qwen3-235b-a22b-i-47730-v24-uploader: 2026-04-13 21:11:25 INFO base.py L1150: start to compute imatrix
Received healthy response to inference request in 2.7502920627593994s
Received healthy response to inference request in 2.9118502140045166s
Received healthy response to inference request in 2.7684085369110107s
Received healthy response to inference request in 4.7864813804626465s
2026-04-13T21:11:45.747304+00:00 monitor updated for chaiml-kaniwara-japan-fu_433_v10
Received healthy response to inference request in 2.7500486373901367s
Received healthy response to inference request in 4.618350505828857s
Received healthy response to inference request in 2.7546370029449463s
Received healthy response to inference request in 2.876657724380493s
Received healthy response to inference request in 2.745826482772827s
Received healthy response to inference request in 2.7761542797088623s
2026-04-13T21:12:05.633589+00:00 monitor updated for qwen-qwen3-235b-a22b-i_47730_v24
2026-04-13T21:12:06.195270+00:00 monitor updated for chaiml-pony-v2-g46-lr1_80834_v33
Received healthy response to inference request in 2.9927659034729004s
Received healthy response to inference request in 2.846034288406372s
Received healthy response to inference request in 3.017359972000122s
Received healthy response to inference request in 2.9151771068573s
qwen-qwen3-235b-a22b-i-47730-v24-uploader: 2026-04-13 21:12:16 WARNING base.py L1270: MoE layer detected: optimized RTN is disabled for efficiency. Use `--enable_opt_rtn` to force-enable it for MoE layers.
qwen-qwen3-235b-a22b-i-47730-v24-uploader: 2026-04-13 21:12:18 INFO device.py L1692: 'peak_ram': 19.11GB, 'peak_vram': 11.38GB
Received healthy response to inference request in 4.663482904434204s
30 requests
0 failed requests
5th percentile: 2.7501581788063048
10th percentile: 2.7542025089263915
20th percentile: 2.7747615814208983
30th percentile: 2.835171341896057
40th percentile: 2.874181890487671
50th percentile: 2.913513660430908
60th percentile: 2.940976858139038
70th percentile: 3.000144124031067
80th percentile: 3.271916246414185
90th percentile: 4.661674237251281
95th percentile: 4.731132066249847
99th percentile: 4.840432052612305
mean time: 3.206147384643555
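The summary above can be recomputed from the 30 "Received healthy response" latencies logged by StressChecker. The sketch below assumes percentiles are taken by linear interpolation between order statistics. The StressChecker implementation is not shown in this log, so this is an inference from the numbers rather than a description of its code. The helper name `percentile` is hypothetical.

```python
import math
import statistics

# Latencies in seconds, copied in log order from the StressChecker lines.
LATENCIES = [
    4.661473274230957, 4.862468242645264, 2.8098244667053223,
    2.8670241832733154, 2.960700035095215, 2.769190788269043,
    3.2361013889312744, 3.0505425930023193, 3.415175676345825,
    2.916670083999634, 2.792447328567505, 2.8960628509521484,
    2.8704681396484375, 2.974917411804199, 2.927828073501587,
    2.7502920627593994, 2.9118502140045166, 2.7684085369110107,
    4.7864813804626465, 2.7500486373901367, 4.618350505828857,
    2.7546370029449463, 2.876657724380493, 2.745826482772827,
    2.7761542797088623, 2.9927659034729004, 2.846034288406372,
    3.017359972000122, 2.9151771068573, 4.663482904434204,
]


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between sorted values."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lo = math.floor(rank)
    hi = math.ceil(rank)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (rank - lo)


if __name__ == "__main__":
    print(len(LATENCIES), "requests")
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print("mean time:", statistics.mean(LATENCIES))
```

Most requests finish in roughly 2.75 to 3.4 seconds, and the five slowest take about 4.6 to 4.9 seconds. Those five are what lift the 90th percentile and above well clear of the median.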
Pipeline stage StressChecker completed in 113.70s
Shutdown handler de-registered
chaiml-kaniwara-japan-fu_433_v10 status is now deployed due to DeploymentManager action
chaiml-kaniwara-japan-fu_433_v10 status is now inactive due to auto deactivation removed underperforming models
chaiml-kaniwara-japan-fu_433_v10 status is now torndown due to DeploymentManager action