developer_uid: zonemercy
submission_id: chaiml-pony-v1-q235b-lr_99625_v1
model_name: chaiml-pony-v1-q235b-lr_99625_v1
model_group: ChaiML/pony-v1-q235b-lr1
status: torndown
timestamp: 2026-02-24T12:21:23+00:00
num_battles: 10962
num_wins: 5937
celo_rating: 1328.44
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/pony-v1-q235b-lr1e4ep1r64g4
model_architecture: Qwen3MoeForCausalLM
model_num_parameters: 18790207488.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-pony-v1-q235b-lr_99625_v1
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/pony-v1-q235b-lr1e4ep1r64g4
model_size: 19B
ranking_group: single
us_pacific_date: 2026-02-21
win_ratio: 0.5415982484948002
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['<|assistant|>', '<|user|>', '<|im_end|>', '</s>', '####', '</think>'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
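The formatter entry above can be illustrated with a small sketch of how its templates might compose a single prompt. The template strings are copied verbatim from the formatter dict; the bot name, persona, and messages are made-up examples, and `build_prompt` is a hypothetical helper, not the pipeline's actual code.

```python
# Hypothetical illustration of prompt assembly from the formatter templates.
# Template strings are taken from the formatter entry; names are invented.
MEMORY_TEMPLATE = "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n"
USER_TEMPLATE = "<|im_start|>user\n{message}<|im_end|>\n"
BOT_TEMPLATE = "<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n"
RESPONSE_TEMPLATE = "<|im_start|>assistant\n{bot_name}:"

def build_prompt(bot_name, memory, turns):
    """turns: list of (role, message) pairs, role in {'user', 'bot'}."""
    prompt = MEMORY_TEMPLATE.format(bot_name=bot_name, memory=memory)
    for role, message in turns:
        if role == "user":
            prompt += USER_TEMPLATE.format(message=message)
        else:
            prompt += BOT_TEMPLATE.format(bot_name=bot_name, message=message)
    # The response template leaves the assistant turn open for generation.
    return prompt + RESPONSE_TEMPLATE.format(bot_name=bot_name)

prompt = build_prompt("Aria", "a curious explorer", [("user", "Hello!")])
```

Note the empty `prompt_template` and `truncate_by_message: True` in the metadata, which suggest truncation happens per message rather than via a wrapper template.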
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v1-q235b-lr-99625-v1-uploader
Waiting for job on chaiml-pony-v1-q235b-lr-99625-v1-uploader to finish
chaiml-pony-v1-q235b-lr-99625-v1-uploader: Using quantization_mode: w4a16
chaiml-pony-v1-q235b-lr-99625-v1-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-lr-99625-v1-uploader: Downloading snapshot of ChaiML/pony-v1-q235b-lr1e4ep1r64g4...
Failed to get response for submission chaiml-mistral-24b-2048_15988_v1: ('http://chaiml-mistral-24b-2048-15988-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-pony-v1-q235b-lr-99625-v1-uploader: Downloaded in 157.918s
chaiml-pony-v1-q235b-lr-99625-v1-uploader: Applying quantization...
chaiml-pony-v1-q235b-lr-99625-v1-uploader: 2026-02-20 23:21:18 WARNING modeling_utils.py L4670: `torch_dtype` is deprecated! Use `dtype` instead!
chaiml-pony-v1-q235b-lr-99625-v1-uploader: 2026-02-20 23:21:42 INFO base.py L366: using torch.bfloat16 for quantization tuning
chaiml-pony-v1-q235b-lr-99625-v1-uploader: 2026-02-20 23:21:46 INFO base.py L1145: start to compute imatrix
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/local/lib/python3.12/dist-packages/torch/backends/cuda/__init__.py:131: UserWarning: Please use the new API settings to control TF32 behavior, such as torch.backends.cudnn.conv.fp32_precision = 'tf32' or torch.backends.cuda.matmul.fp32_precision = 'ieee'. Old settings, e.g, torch.backends.cuda.matmul.allow_tf32 = True, torch.backends.cudnn.allow_tf32 = True, allowTF32CuDNN() and allowTF32CuBLAS() will be deprecated after Pytorch 2.9. Please see https://pytorch.org/docs/main/notes/cuda.html#tensorfloat-32-tf32-on-ampere-and-later-devices (Triggered internally at /pytorch/aten/src/ATen/Context.cpp:80.)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: return torch._C._get_cublas_allow_tf32()
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:50.227000 7 torch/_dynamo/convert_frame.py:1358] [6/8] torch._dynamo hit config.recompile_limit (8)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:50.227000 7 torch/_dynamo/convert_frame.py:1358] [6/8] function: 'forward' (/usr/local/lib/python3.12/dist-packages/transformers/models/qwen3_moe/modeling_qwen3_moe.py:208)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:50.227000 7 torch/_dynamo/convert_frame.py:1358] [6/8] last reason: 6/7: self._modules['up_proj'].imatrix_cnt == 1357 # module.imatrix_cnt += input.shape[0] # auto_round/compressors/base.py:1179 in get_imatrix_hook (HINT: torch.compile considers integer attributes of the nn.Module to be static. If you are observing recompilation, you might want to make this integer dynamic using torch._dynamo.config.allow_unspec_int_on_nn_module = True, or convert this integer into a tensor.)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:50.227000 7 torch/_dynamo/convert_frame.py:1358] [6/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:50.227000 7 torch/_dynamo/convert_frame.py:1358] [6/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:53.156000 7 torch/_dynamo/convert_frame.py:1358] [3/8] torch._dynamo hit config.recompile_limit (8)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:53.156000 7 torch/_dynamo/convert_frame.py:1358] [3/8] function: 'forward' (/usr/local/lib/python3.12/dist-packages/transformers/models/qwen3_moe/modeling_qwen3_moe.py:305)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:53.156000 7 torch/_dynamo/convert_frame.py:1358] [3/8] last reason: 3/7: self._modules['self_attn']._modules['k_proj'].imatrix_cnt == 56 # module.imatrix_cnt += input.shape[0] # auto_round/compressors/base.py:1179 in get_imatrix_hook (HINT: torch.compile considers integer attributes of the nn.Module to be static. If you are observing recompilation, you might want to make this integer dynamic using torch._dynamo.config.allow_unspec_int_on_nn_module = True, or convert this integer into a tensor.)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:53.156000 7 torch/_dynamo/convert_frame.py:1358] [3/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
chaiml-pony-v1-q235b-lr-99625-v1-uploader: W0220 23:22:53.156000 7 torch/_dynamo/convert_frame.py:1358] [3/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html
HTTP Request: %s %s "%s %d %s"
Failed to get response for submission chaiml-mistral-24b-2048-_2678_v3: ('http://chaiml-mistral-24b-2048-2678-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Failed to get response for submission chaiml-grpo-q235b-kimid_37540_v1: HTTPConnectionPool(host='chaiml-grpo-q235b-kimid-37540-v1-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Connection pool is full, discarding connection: %s. Connection pool size: %s
Failed to get response for submission chaiml-mistral-24b-2048_54327_v6: ('http://chaiml-mistral-24b-2048-54327-v6-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
chaiml-pony-v1-q235b-lr-99625-v1-uploader: Checking if ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-lr-99625-v1-uploader: ChaiML/pony-v1-q235b-lr1e4ep1r64g4-W4A16 already exists in ChaiML
chaiml-pony-v1-q235b-lr-99625-v1-uploader: Processed model ChaiML/pony-v1-q235b-lr1e4ep1r64g4 in 4929.286s
chaiml-pony-v1-q235b-lr-99625-v1-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v1-q235b-lr-99625-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v1-q235b-lr-99625-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v1-q235b-lr-99625-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v1-q235b-lr-99625-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v1-q235b-lr-99625-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v1-q235b-lr-99625-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/added_tokens.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/added_tokens.json
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/special_tokens_map.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/special_tokens_map.json
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/config.json
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/quantization_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/quantization_config.json
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/generation_config.json
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/chat_template.jinja
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/merges.txt
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/tokenizer_config.json
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/vocab.json
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/model.safetensors.index.json
chaiml-pony-v1-q235b-lr-99625-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v1-q235b-lr-99625-v1/default/tokenizer.json
Job chaiml-pony-v1-q235b-lr-99625-v1-uploader completed after 5075.67s with status: succeeded
Stopping job with name chaiml-pony-v1-q235b-lr-99625-v1-uploader
Pipeline stage VLLMUploader completed in 5086.45s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 7.50s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v1-q235b-lr-99625-v1
Waiting for inference service chaiml-pony-v1-q235b-lr-99625-v1 to be ready
Inference service chaiml-pony-v1-q235b-lr-99625-v1 ready after 423.36439752578735s
Pipeline stage VLLMDeployer completed in 434.20s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.306901693344116s
Received healthy response to inference request in 3.1940560340881348s
Received healthy response to inference request in 1.9121835231781006s
Received healthy response to inference request in 1.9526307582855225s
Received healthy response to inference request in 1.9006667137145996s
Received healthy response to inference request in 1.9284508228302002s
Received healthy response to inference request in 1.9427177906036377s
Received healthy response to inference request in 2.0362370014190674s
Received healthy response to inference request in 2.1134347915649414s
Received healthy response to inference request in 2.074896812438965s
Received healthy response to inference request in 1.9994926452636719s
Received healthy response to inference request in 1.9526095390319824s
Received healthy response to inference request in 2.3292155265808105s
Received healthy response to inference request in 1.9194140434265137s
Received healthy response to inference request in 2.1897621154785156s
Received healthy response to inference request in 2.7545204162597656s
Received healthy response to inference request in 2.044762134552002s
Received healthy response to inference request in 2.165517807006836s
Received healthy response to inference request in 2.0465898513793945s
Received healthy response to inference request in 2.034233570098877s
Received healthy response to inference request in 2.087395429611206s
Received healthy response to inference request in 1.9238927364349365s
Received healthy response to inference request in 2.0457022190093994s
Received healthy response to inference request in 2.4621622562408447s
Received healthy response to inference request in 1.941845178604126s
Received healthy response to inference request in 2.4443702697753906s
Received healthy response to inference request in 2.0626442432403564s
Received healthy response to inference request in 2.090564012527466s
Received healthy response to inference request in 2.1526947021484375s
Received healthy response to inference request in 2.282867908477783s
30 requests
0 failed requests
5th percentile: 1.9154372572898866
10th percentile: 1.9234448671340942
20th percentile: 1.9425432682037354
30th percentile: 1.985434079170227
40th percentile: 2.041352081298828
50th percentile: 2.0546170473098755
60th percentile: 2.08866286277771
70th percentile: 2.156541633605957
80th percentile: 2.28767466545105
90th percentile: 2.446149468421936
95th percentile: 2.62295924425125
99th percentile: 3.066590704917908
mean time: 2.1430810848871866
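The StressChecker summary above can be reproduced from the 30 per-request latencies it logged. This is a sketch only, assuming numpy-style linear-interpolation percentiles; it is not the pipeline's actual aggregation code.

```python
# Sketch: recompute the stress-check summary from the 30 logged latencies,
# using linear-interpolation percentiles (numpy's default method).
latencies = [
    2.306901693344116, 3.1940560340881348, 1.9121835231781006,
    1.9526307582855225, 1.9006667137145996, 1.9284508228302002,
    1.9427177906036377, 2.0362370014190674, 2.1134347915649414,
    2.074896812438965, 1.9994926452636719, 1.9526095390319824,
    2.3292155265808105, 1.9194140434265137, 2.1897621154785156,
    2.7545204162597656, 2.044762134552002, 2.165517807006836,
    2.0465898513793945, 2.034233570098877, 2.087395429611206,
    1.9238927364349365, 2.0457022190093994, 2.4621622562408447,
    1.941845178604126, 2.4443702697753906, 2.0626442432403564,
    2.090564012527466, 2.1526947021484375, 2.282867908477783,
]

def percentile(data, p):
    """Linear-interpolation percentile over sorted data."""
    s = sorted(data)
    k = (len(s) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (s[hi] - s[lo]) * (k - lo)

mean_time = sum(latencies) / len(latencies)
p5 = percentile(latencies, 5)
p99 = percentile(latencies, 99)
```

With these inputs, `mean_time`, `p5`, and `p99` match the logged figures (2.1430..., 1.9154..., 3.0665...), confirming the summary is a straight linear-interpolation percentile over the 30 samples.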
Pipeline stage StressChecker completed in 84.01s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.75s
Shutdown handler de-registered
chaiml-pony-v1-q235b-lr_99625_v1 status is now deployed due to DeploymentManager action
chaiml-pony-v1-q235b-lr_99625_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v1-q235b-lr_99625_v1 status is now torndown due to DeploymentManager action