developer_uid: zonemercy
submission_id: chaiml-pony-d3a-mv1-plc-_5598_v1
model_name: chaiml-pony-d3a-mv1-plc-_5598_v1
model_group: ChaiML/pony-d3a-mv1-plc-
status: torndown
timestamp: 2026-03-31T16:41:59+00:00
num_battles: 4474
num_wins: 2337
celo_rating: 1307.11
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4
model_architecture: Qwen3_5ForConditionalGeneration
model_num_parameters: 23564784640.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-pony-d3a-mv1-plc-_5598_v1
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4
model_size: 24B
ranking_group: single
us_pacific_date: 2026-03-28
win_ratio: 0.5223513634331695
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['<|im_end|>', '<|user|>', '</s>', '####', '<|assistant|>'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-plc-5598-v1-uploader
Waiting for job on chaiml-pony-d3a-mv1-plc-5598-v1-uploader to finish
Failed to get response for submission chaiml-gspo-glm47-combi_10268_v1: ('http://chaiml-gspo-glm47-combi-10268-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Using quantization_mode: fp8
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Checking if ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4...
2026-03-28T14:35:22.884796+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Downloaded in 33.497s
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Loading /tmp/model_input...
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: The fast path is not available because one of the required library is not installed. Falling back to torch implementation. To install follow https://github.com/fla-org/flash-linear-attention#installation and https://github.com/Dao-AILab/causal-conv1d
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Applying quantization...
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: 2026-03-28T14:35:39.934988+0000 | __init__ | WARNING - Disabling tokenizer parallelism due to threading conflict between FastTokenizer and Datasets. Set TOKENIZERS_PARALLELISM=false to suppress this warning.
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: 2026-03-28T14:35:42.019383+0000 | reset | INFO - Compression lifecycle reset
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: 2026-03-28T14:35:42.022409+0000 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: 2026-03-28T14:35:42.069943+0000 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: 2026-03-28T14:35:42.070183+0000 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: 2026-03-28T14:35:42.083365+0000 | dispatch_model | WARNING - Forced to offload modules due to insufficient gpu resources
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: 2026-03-28T14:35:49.262332+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: 2026-03-28T14:35:49.262501+0000 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Saving to /dev/shm/model_output...
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py:3344: UserWarning: Attempting to save a model with offloaded modules. Ensure that unallocated cpu memory exceeds the `shard_size` (50GB default)
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: warnings.warn(
2026-03-28T14:36:23.035226+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Pushing to ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4-FP8
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Checking if ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Creating repo ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4-FP8 and uploading /dev/shm/model_output to it
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Found 1 files larger than 20GB (recommended limit):
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: - model.safetensors: 35.9GB
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Large files may slow down loading and processing.
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: ---------- 2026-03-28 14:36:37 (0:00:00) ----------
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Files: hashed 5/7 (34.1K/35.9G) | pre-uploaded: 0/0 (0.0/35.9G) (+7 unsure) | committed: 0/7 (0.0/35.9G) | ignored: 0
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Workers: hashing: 2 | get upload mode: 5 | pre-uploading: 0 | committing: 0 | waiting: 57
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: ---------------------------------------------------
2026-03-28T14:37:23.144669+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: ---------- 2026-03-28 14:37:37 (0:01:00) ----------
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Files: hashed 7/7 (35.9G/35.9G) | pre-uploaded: 1/2 (20.0M/35.9G) | committed: 0/7 (0.0/35.9G) | ignored: 0
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 1 | committing: 0 | waiting: 63
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: ---------------------------------------------------
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Processed model ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4 in 195.124s
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v1/default
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v1/default/chat_template.jinja
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v1/default/recipe.yaml
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v1/default/generation_config.json
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v1/default/tokenizer_config.json
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v1/default/config.json
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v1/default/tokenizer.json
2026-03-28T14:38:23.292290+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
chaiml-pony-d3a-mv1-plc-5598-v1-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v1/default/model.safetensors
2026-03-28T14:39:23.388521+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
Job chaiml-pony-d3a-mv1-plc-5598-v1-uploader completed after 308.9s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-plc-5598-v1-uploader
Pipeline stage VLLMUploader completed in 309.51s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.10s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.68s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-plc-5598-v1
Waiting for inference service chaiml-pony-d3a-mv1-plc-5598-v1 to be ready
2026-03-28T14:40:23.559552+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
2026-03-28T14:41:23.683634+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
Unable to record family friendly update due to error: Invalid JSON input: Expecting value: line 1 column 1 (char 0)
2026-03-28T14:42:23.891314+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
Inference service chaiml-pony-d3a-mv1-plc-5598-v1 ready after 180.86755347251892s
Pipeline stage VLLMDeployer completed in 181.40s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:43:24.069031+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 11.989588022232056s
Received healthy response to inference request in 4.336457014083862s
Received healthy response to inference request in 2.3914806842803955s
2026-03-28T14:44:24.223391+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.772698402404785s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.193748712539673s
Received healthy response to inference request in 1.9220898151397705s
Received healthy response to inference request in 1.9083852767944336s
2026-03-28T14:45:24.315913+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.9802887439727783s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.9750118255615234s
Received healthy response to inference request in 2.419912576675415s
Received healthy response to inference request in 2.208159923553467s
Received healthy response to inference request in 1.9278886318206787s
Received healthy response to inference request in 2.441502332687378s
2026-03-28T14:46:24.428626+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.9965620040893555s
Received healthy response to inference request in 2.034695863723755s
Received healthy response to inference request in 15.38066554069519s
Received healthy response to inference request in 2.0852210521698s
Received healthy response to inference request in 2.0943217277526855s
Received healthy response to inference request in 2.4197075366973877s
Received healthy response to inference request in 2.103870391845703s
Received healthy response to inference request in 2.160033941268921s
30 requests
9 failed requests
5th percentile: 1.9246992826461793
10th percentile: 1.9750487327575683
20th percentile: 2.075116014480591
30th percentile: 2.1431848764419557
40th percentile: 2.318152379989624
50th percentile: 2.4307074546813965
60th percentile: 4.510953569412231
70th percentile: 16.806868672370896
80th percentile: 20.170349884033204
90th percentile: 20.27227611541748
95th percentile: 20.523996114730835
99th percentile: 20.574102928638457
mean time: 8.548843081792196
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9767673015594482s
Received healthy response to inference request in 2.054100751876831s
Received healthy response to inference request in 2.0717337131500244s
Received healthy response to inference request in 1.921492338180542s
Received healthy response to inference request in 1.857604742050171s
Received healthy response to inference request in 2.0310215950012207s
Received healthy response to inference request in 3.133348226547241s
Received healthy response to inference request in 1.937939167022705s
Received healthy response to inference request in 2.385251522064209s
Received healthy response to inference request in 2.1039412021636963s
Received healthy response to inference request in 2.1337568759918213s
Received healthy response to inference request in 1.952939748764038s
2026-03-28T14:47:24.532401+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v1
Received healthy response to inference request in 1.9309477806091309s
Received healthy response to inference request in 1.8602688312530518s
Received healthy response to inference request in 1.9407052993774414s
Received healthy response to inference request in 2.690883159637451s
Received healthy response to inference request in 1.9015095233917236s
Received healthy response to inference request in 1.9563438892364502s
Received healthy response to inference request in 2.400973320007324s
Received healthy response to inference request in 2.2814066410064697s
Received healthy response to inference request in 1.978754997253418s
Received healthy response to inference request in 1.964498519897461s
Received healthy response to inference request in 1.98142409324646s
Received healthy response to inference request in 2.2167258262634277s
Received healthy response to inference request in 2.0701470375061035s
Received healthy response to inference request in 1.97892165184021s
Received healthy response to inference request in 2.3412983417510986s
Received healthy response to inference request in 2.1479856967926025s
Received healthy response to inference request in 2.037384033203125s
Received healthy response to inference request in 1.9885616302490234s
30 requests
0 failed requests
5th percentile: 1.8788271427154541
10th percentile: 1.91949405670166
20th percentile: 1.940152072906494
30th percentile: 1.9620521306991576
40th percentile: 1.9788549900054933
50th percentile: 2.009791612625122
60th percentile: 2.06051926612854
70th percentile: 2.1128859043121335
80th percentile: 2.2296619892120364
90th percentile: 2.3868237018585203
95th percentile: 2.560423731803893
99th percentile: 3.0050333571434025
mean time: 2.1076212485631305
Pipeline stage StressChecker completed in 328.06s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.34s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-plc-_5598_v1 status is now deployed due to DeploymentManager action
chaiml-pony-d3a-mv1-plc-_5598_v1 status is now inactive due to admin request
chaiml-pony-d3a-mv1-plc-_5598_v1 status is now torndown due to DeploymentManager action