developer_uid: acehao-chai
submission_id: chaiml-grpo-q235b-kimid_79428_v1
model_name: chaiml-grpo-q235b-kimid_79428_v1
model_group: ChaiML/grpo-q235b-kimid-
status: torndown
timestamp: 2026-02-18T23:32:05+00:00
num_battles: 10340
num_wins: 5397
celo_rating: 1312.05
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/grpo-q235b-kimid-v5a-merged-chai-rm-step-700
model_architecture: Qwen3MoeForCausalLM
model_num_parameters: 18790207488.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-grpo-q235b-kimid_79428_v1
ineligible_reason: max_output_tokens!=64
is_internal_developer: False
language_model: ChaiML/grpo-q235b-kimid-v5a-merged-chai-rm-step-700
model_size: 19B
ranking_group: single
us_pacific_date: 2026-02-15
win_ratio: 0.5219535783365571
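The `win_ratio` above is simply `num_wins / num_battles` from the fields earlier in this record, which can be checked directly:

```python
# Recompute win_ratio from the submission's battle counts above.
num_battles = 10340
num_wins = 5397

win_ratio = num_wins / num_battles  # ≈ 0.52195, matching the reported value
```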
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['<|assistant|>', '<|im_end|>', '####', '<|user|>', '</think>', '</s>'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
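The `best_of: 8` setting means each request samples several candidate completions and keeps the one the reward model scores highest. A minimal sketch of that selection loop, where `generate` and `score` are hypothetical stand-ins for the language model and reward model:

```python
import random

def best_of_n(generate, score, n=8, seed=0):
    """Sample n candidate completions and return the highest-scoring one.

    `generate` and `score` are illustrative placeholders, not the actual
    pipeline's API: generate(rng) yields one candidate, score(c) ranks it.
    """
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-ins: "generations" are numbers, the "reward" is the value itself.
pick = best_of_n(lambda rng: rng.random(), lambda x: x, n=8)
```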
formatter: {'memory_template': "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
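The formatter templates above assemble a ChatML-style prompt from the persona memory, the chat history, and a trailing response header. A minimal sketch of how they might be applied (the `render_prompt` helper and its argument names are illustrative, not the pipeline's actual code; the empty `prompt_template` is omitted):

```python
formatter = {
    "memory_template": "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n",
    "bot_template": "<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n",
    "user_template": "<|im_start|>user\n{message}<|im_end|>\n",
    "response_template": "<|im_start|>assistant\n{bot_name}:",
}

def render_prompt(bot_name, memory, turns):
    """Assemble a prompt from the formatter templates.

    `turns` is a list of (role, message) pairs, role in {"user", "bot"}.
    """
    out = formatter["memory_template"].format(bot_name=bot_name, memory=memory)
    for role, message in turns:
        tpl = formatter["user_template"] if role == "user" else formatter["bot_template"]
        out += tpl.format(bot_name=bot_name, message=message)
    # The response template leaves the assistant turn open for generation.
    return out + formatter["response_template"].format(bot_name=bot_name)

prompt = render_prompt("Ava", "a friendly guide", [("user", "hi")])
```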
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Connection pool is full, discarding connection: %s. Connection pool size: %s
Starting job with name chaiml-grpo-q235b-kimid-79428-v1-uploader
Waiting for job on chaiml-grpo-q235b-kimid-79428-v1-uploader to finish
chaiml-grpo-q235b-kimid-79428-v1-uploader: Using quantization_mode: w4a16
chaiml-grpo-q235b-kimid-79428-v1-uploader: Checking if ChaiML/grpo-q235b-kimid-v5a-merged-chai-rm-step-700-W4A16 already exists in ChaiML
chaiml-grpo-q235b-kimid-79428-v1-uploader: Downloading snapshot of ChaiML/grpo-q235b-kimid-v5a-merged-chai-rm-step-700...
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
HTTP Request: %s %s "%s %d %s"
chaiml-grpo-q235b-kimid-79428-v1-uploader: Downloaded in 162.655s
chaiml-grpo-q235b-kimid-79428-v1-uploader: Applying quantization...
chaiml-grpo-q235b-kimid-79428-v1-uploader: The tokenizer you are loading from '/tmp/model_input' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-grpo-q235b-kimid-79428-v1-uploader: 2026-02-15 11:31:52 WARNING modeling_utils.py L4670: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-grpo-q235b-kimid-79428-v1-uploader: 2026-02-15 11:32:21 INFO base.py L366: using torch.bfloat16 for quantization tuning
chaiml-grpo-q235b-kimid-79428-v1-uploader: 2026-02-15 11:32:25 INFO base.py L1145: start to compute imatrix
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/local/lib/python3.12/dist-packages/torch/backends/cuda/__init__.py:131: UserWarning: Please use the new API settings to control TF32 behavior, such as torch.backends.cudnn.conv.fp32_precision = 'tf32' or torch.backends.cuda.matmul.fp32_precision = 'ieee'. Old settings, e.g, torch.backends.cuda.matmul.allow_tf32 = True, torch.backends.cudnn.allow_tf32 = True, allowTF32CuDNN() and allowTF32CuBLAS() will be deprecated after Pytorch 2.9. Please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /pytorch/aten/src/ATen/Context.cpp:80.)
chaiml-grpo-q235b-kimid-79428-v1-uploader: return torch._C._get_cublas_allow_tf32()
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:30.052000 7 torch/_dynamo/convert_frame.py:1358] [6/8] torch._dynamo hit config.recompile_limit (8)
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:30.052000 7 torch/_dynamo/convert_frame.py:1358] [6/8] function: 'forward' (/usr/local/lib/python3.12/dist-packages/transformers/models/qwen3_moe/modeling_qwen3_moe.py:208)
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:30.052000 7 torch/_dynamo/convert_frame.py:1358] [6/8] last reason: 6/7: self._modules['up_proj'].imatrix_cnt == 1347 # module.imatrix_cnt += input.shape[0] # auto_round/compressors/base.py:1179 in get_imatrix_hook (HINT: torch.compile considers integer attributes of the nn.Module to be static. If you are observing recompilation, you might want to make this integer dynamic using torch._dynamo.config.allow_unspec_int_on_nn_module = True, or convert this integer into a tensor.)
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:30.052000 7 torch/_dynamo/convert_frame.py:1358] [6/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:30.052000 7 torch/_dynamo/convert_frame.py:1358] [6/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:32.945000 7 torch/_dynamo/convert_frame.py:1358] [3/8] torch._dynamo hit config.recompile_limit (8)
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:32.945000 7 torch/_dynamo/convert_frame.py:1358] [3/8] function: 'forward' (/usr/local/lib/python3.12/dist-packages/transformers/models/qwen3_moe/modeling_qwen3_moe.py:305)
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:32.945000 7 torch/_dynamo/convert_frame.py:1358] [3/8] last reason: 3/7: self._modules['self_attn']._modules['k_proj'].imatrix_cnt == 56 # module.imatrix_cnt += input.shape[0] # auto_round/compressors/base.py:1179 in get_imatrix_hook (HINT: torch.compile considers integer attributes of the nn.Module to be static. If you are observing recompilation, you might want to make this integer dynamic using torch._dynamo.config.allow_unspec_int_on_nn_module = True, or convert this integer into a tensor.)
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:32.945000 7 torch/_dynamo/convert_frame.py:1358] [3/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
chaiml-grpo-q235b-kimid-79428-v1-uploader: W0215 11:33:32.945000 7 torch/_dynamo/convert_frame.py:1358] [3/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html
chaiml-grpo-q235b-kimid-79428-v1-uploader: 2026-02-15 11:33:51 WARNING gguf.py L297: please use more data via setting `nsamples` to improve accuracy as calibration activations contain 0
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-grpo-q235b-kimid-79428-v1-uploader: Checking if ChaiML/grpo-q235b-kimid-v5a-merged-chai-rm-step-700-W4A16 already exists in ChaiML
chaiml-grpo-q235b-kimid-79428-v1-uploader: Creating repo ChaiML/grpo-q235b-kimid-v5a-merged-chai-rm-step-700-W4A16 and uploading /dev/shm/model_output to it
chaiml-grpo-q235b-kimid-79428-v1-uploader:       
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------- 2026-02-15 12:59:54 (0:01:00) ----------
chaiml-grpo-q235b-kimid-79428-v1-uploader: Files: hashed 38/38 (131.9G/131.9G) | pre-uploaded: 2/28 (1.9G/131.9G) | committed: 0/38 (0.0/131.9G) | ignored: 0
chaiml-grpo-q235b-kimid-79428-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 26 | committing: 0 | waiting: 100
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------------------------------------------------
chaiml-grpo-q235b-kimid-79428-v1-uploader:       
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------- 2026-02-15 13:00:54 (0:02:00) ----------
chaiml-grpo-q235b-kimid-79428-v1-uploader: Files: hashed 38/38 (131.9G/131.9G) | pre-uploaded: 10/28 (41.9G/131.9G) | committed: 0/38 (0.0/131.9G) | ignored: 0
chaiml-grpo-q235b-kimid-79428-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 18 | committing: 0 | waiting: 108
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------------------------------------------------
chaiml-grpo-q235b-kimid-79428-v1-uploader:       
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------- 2026-02-15 13:01:54 (0:03:00) ----------
chaiml-grpo-q235b-kimid-79428-v1-uploader: Files: hashed 38/38 (131.9G/131.9G) | pre-uploaded: 18/28 (81.9G/131.9G) | committed: 0/38 (0.0/131.9G) | ignored: 0
chaiml-grpo-q235b-kimid-79428-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 10 | committing: 0 | waiting: 116
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------------------------------------------------
chaiml-grpo-q235b-kimid-79428-v1-uploader:       
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------- 2026-02-15 13:02:54 (0:04:00) ----------
chaiml-grpo-q235b-kimid-79428-v1-uploader: Files: hashed 38/38 (131.9G/131.9G) | pre-uploaded: 26/28 (121.9G/131.9G) | committed: 0/38 (0.0/131.9G) | ignored: 0
chaiml-grpo-q235b-kimid-79428-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 2 | committing: 0 | waiting: 124
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------------------------------------------------
chaiml-grpo-q235b-kimid-79428-v1-uploader:       
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------- 2026-02-15 13:03:54 (0:05:00) ----------
chaiml-grpo-q235b-kimid-79428-v1-uploader: Files: hashed 38/38 (131.9G/131.9G) | pre-uploaded: 28/28 (131.9G/131.9G) | committed: 0/38 (0.0/131.9G) | ignored: 0
chaiml-grpo-q235b-kimid-79428-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 0 | committing: 1 | waiting: 125
chaiml-grpo-q235b-kimid-79428-v1-uploader: ---------------------------------------------------
chaiml-grpo-q235b-kimid-79428-v1-uploader: Processed model ChaiML/grpo-q235b-kimid-v5a-merged-chai-rm-step-700 in 5703.691s
chaiml-grpo-q235b-kimid-79428-v1-uploader: creating bucket guanaco-vllm-models
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-grpo-q235b-kimid-79428-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-grpo-q235b-kimid-79428-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-grpo-q235b-kimid-79428-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-grpo-q235b-kimid-79428-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-grpo-q235b-kimid-79428-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-grpo-q235b-kimid-79428-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-grpo-q235b-kimid-79428-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-grpo-q235b-kimid-79428-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-grpo-q235b-kimid-79428-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-grpo-q235b-kimid-79428-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-grpo-q235b-kimid-79428-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00027-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00027-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00006-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00006-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00026-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00026-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00009-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00009-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00018-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00018-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00017-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00017-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00014-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00014-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00021-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00021-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00010-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00010-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00005-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00005-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00008-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00008-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00020-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00020-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00019-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00019-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00011-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00011-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00004-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00004-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00002-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00002-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00016-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00016-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00025-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00025-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00012-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00012-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00024-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00024-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00003-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00003-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00022-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00022-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00023-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00023-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00013-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00013-of-00027.safetensors
chaiml-grpo-q235b-kimid-79428-v1-uploader: cp /dev/shm/model_output/model-00007-of-00027.safetensors s3://guanaco-vllm-models/chaiml-grpo-q235b-kimid-79428-v1/default/model-00007-of-00027.safetensors
Job chaiml-grpo-q235b-kimid-79428-v1-uploader completed after 5976.59s with status: succeeded
Stopping job with name chaiml-grpo-q235b-kimid-79428-v1-uploader
Pipeline stage VLLMUploader completed in 5980.87s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.17s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-grpo-q235b-kimid-79428-v1
Waiting for inference service chaiml-grpo-q235b-kimid-79428-v1 to be ready
Inference service chaiml-grpo-q235b-kimid-79428-v1 ready after 724.6704521179199s
Pipeline stage VLLMDeployer completed in 725.26s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.0369296073913574s
Received healthy response to inference request in 2.2798454761505127s
Received healthy response to inference request in 1.944493293762207s
Received healthy response to inference request in 2.3410348892211914s
Received healthy response to inference request in 1.9979228973388672s
Received healthy response to inference request in 1.918426275253296s
Received healthy response to inference request in 2.0541436672210693s
Received healthy response to inference request in 2.3535027503967285s
Received healthy response to inference request in 2.0243396759033203s
Received healthy response to inference request in 1.8971259593963623s
Received healthy response to inference request in 2.3537025451660156s
Received healthy response to inference request in 1.9568068981170654s
Received healthy response to inference request in 2.0410752296447754s
Received healthy response to inference request in 2.2700610160827637s
Received healthy response to inference request in 2.4411871433258057s
Received healthy response to inference request in 2.2784433364868164s
Received healthy response to inference request in 2.0551905632019043s
Received healthy response to inference request in 2.0900392532348633s
Received healthy response to inference request in 1.9506256580352783s
Received healthy response to inference request in 1.9018011093139648s
Received healthy response to inference request in 2.1057212352752686s
{"detail":"HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-79428-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Max retries exceeded with url: /v1/completions (Caused by ConnectTimeoutError(<urllib3.connection.HTTPConnection object at 0x79f3aa89ff10>, 'Connection to chaiml-grpo-q235b-kimid-79428-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com timed out. (connect timeout=12.0)'))"}
Received unhealthy response to inference request!
Received healthy response to inference request in 2.2708702087402344s
Received healthy response to inference request in 2.0197927951812744s
Received healthy response to inference request in 2.3715429306030273s
Received healthy response to inference request in 2.1557257175445557s
Received healthy response to inference request in 1.9004871845245361s
Received healthy response to inference request in 2.4566550254821777s
Received healthy response to inference request in 2.7214338779449463s
Received healthy response to inference request in 2.077498435974121s
30 requests
1 failed requests
5th percentile: 1.901078450679779
10th percentile: 1.9167637586593629
20th percentile: 1.955570650100708
30th percentile: 2.0229756116867064
40th percentile: 2.048916292190552
50th percentile: 2.083768844604492
60th percentile: 2.2014598369598386
70th percentile: 2.2788639783859255
80th percentile: 2.353542709350586
90th percentile: 2.4427339315414427
95th percentile: 2.6022833943366996
99th percentile: 9.451704223155984
mean time: 2.4822370847066244
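The percentile summary above can be reproduced with a small helper. Linear interpolation between order statistics (NumPy's default method) is an assumption about how the stress checker computes these figures:

```python
def percentile(samples, p):
    """Linear-interpolation percentile (NumPy's default method), p in [0, 100]."""
    xs = sorted(samples)
    k = (len(xs) - 1) * p / 100.0
    lo, hi = int(k), min(int(k) + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

# Illustrative latencies in seconds, not the full 30-sample run above.
latencies = [2.04, 2.28, 1.94, 2.34, 2.00]
summary = {p: percentile(latencies, p) for p in (5, 50, 95)}
mean_time = sum(latencies) / len(latencies)
```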
%s, retrying in %s seconds...
Received healthy response to inference request in 2.0486276149749756s
Received healthy response to inference request in 1.9470489025115967s
Received healthy response to inference request in 1.8632924556732178s
Received healthy response to inference request in 1.8460867404937744s
Received healthy response to inference request in 2.2907538414001465s
Received healthy response to inference request in 2.195744752883911s
Received healthy response to inference request in 2.125495433807373s
Received healthy response to inference request in 2.0262889862060547s
Received healthy response to inference request in 2.2245383262634277s
Received healthy response to inference request in 2.263461112976074s
Received healthy response to inference request in 1.9607036113739014s
Received healthy response to inference request in 1.984374761581421s
Received healthy response to inference request in 2.485902786254883s
Received healthy response to inference request in 2.238609790802002s
Received healthy response to inference request in 2.4336788654327393s
Received healthy response to inference request in 1.8139159679412842s
Received healthy response to inference request in 2.517747640609741s
Received healthy response to inference request in 2.214766502380371s
Received healthy response to inference request in 2.1420857906341553s
Received healthy response to inference request in 2.299065113067627s
Received healthy response to inference request in 2.042795181274414s
Received healthy response to inference request in 2.0035364627838135s
Received healthy response to inference request in 2.142646551132202s
Received healthy response to inference request in 2.0376718044281006s
Received healthy response to inference request in 1.9187250137329102s
Received healthy response to inference request in 2.4502289295196533s
Received healthy response to inference request in 1.934861660003662s
Received healthy response to inference request in 2.560997486114502s
Received healthy response to inference request in 1.7462282180786133s
Received healthy response to inference request in 1.9107952117919922s
30 requests
0 failed requests
5th percentile: 1.8283928155899047
10th percentile: 1.8615718841552735
20th percentile: 1.9316343307495116
30th percentile: 1.977273416519165
40th percentile: 2.0331186771392824
50th percentile: 2.0870615243911743
60th percentile: 2.1638858318328857
70th percentile: 2.228759765625
80th percentile: 2.2924160957336426
90th percentile: 2.4537963151931765
95th percentile: 2.503417456150055
99th percentile: 2.5484550309181215
mean time: 2.122355850537618
Pipeline stage StressChecker completed in 157.68s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
Shutdown handler de-registered
chaiml-grpo-q235b-kimid_79428_v1 status is now deployed due to DeploymentManager action
chaiml-grpo-q235b-kimid_79428_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-grpo-q235b-kimid_79428_v1 status is now torndown due to DeploymentManager action