developer_uid: zonemercy
submission_id: chaiml-pony-d3a-mv1-son_59529_v1
model_name: chaiml-pony-d3a-mv1-son_59529_v1
model_group: ChaiML/pony-d3a-mv1-sonn
status: torndown
timestamp: 2026-03-31T16:41:18+00:00
num_battles: 4435
num_wins: 2328
celo_rating: 1308.39
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/pony-d3a-mv1-sonnetwintop2-q27b-lr1e5ep1g4
model_architecture: Qwen3_5ForConditionalGeneration
model_num_parameters: 23564784640.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-pony-d3a-mv1-son_59529_v1
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/pony-d3a-mv1-sonnetwintop2-q27b-lr1e5ep1g4
model_size: 24B
ranking_group: single
us_pacific_date: 2026-03-28
win_ratio: 0.5249154453213077
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['</s>', '<|im_end|>', '<|user|>', '<|assistant|>', '####'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
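Note on the formatter entry above: the log does not show how the platform applies these templates, so the following is only a minimal sketch of one plausible assembly order (persona block, then chat turns, then the open response header). The helper name build_prompt, the (role, message) turn format, and the sample persona are assumptions for illustration; only the template strings are taken from the record. Truncation to max_input_tokens is not modeled.

```python
# Template strings copied from the `formatter` entry in this record.
FORMATTER = {
    "memory_template": "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n",
    "prompt_template": "",
    "bot_template": "<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n",
    "user_template": "<|im_start|>user\n{message}<|im_end|>\n",
    "response_template": "<|im_start|>assistant\n{bot_name}:",
}


def build_prompt(bot_name, memory, turns):
    """Hypothetical helper: join the templates into a single prompt string.

    `turns` is a list of (role, message) pairs where role is "user" or "bot".
    The ordering used here is an assumption, not the platform's confirmed logic.
    """
    parts = [FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory)]
    parts.append(FORMATTER["prompt_template"])  # empty string in this submission
    for role, message in turns:
        if role == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(FORMATTER["user_template"].format(message=message))
    # The prompt ends with an open assistant header so the model writes the reply.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("Ava", "A cheerful guide.", [("user", "Hi"), ("bot", "Hello!")]))
```

Because the response template leaves the assistant turn open, generation would be expected to stop at one of the configured stopping_words such as <|im_end|>.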
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-son-59529-v1-uploader
Waiting for job on chaiml-pony-d3a-mv1-son-59529-v1-uploader to finish
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Using quantization_mode: fp8
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Checking if ChaiML/pony-d3a-mv1-sonnetwintop2-q27b-lr1e5ep1g4-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-sonnetwintop2-q27b-lr1e5ep1g4...
2026-03-28T14:35:00.960298+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Downloaded in 25.011s
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Loading /tmp/model_input...
chaiml-pony-d3a-mv1-son-59529-v1-uploader: The fast path is not available because one of the required library is not installed. Falling back to torch implementation. To install follow https://github.com/fla-org/flash-linear-attention#installation and https://github.com/Dao-AILab/causal-conv1d
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Applying quantization...
chaiml-pony-d3a-mv1-son-59529-v1-uploader: 2026-03-28T14:35:09.278322+0000 | __init__ | WARNING - Disabling tokenizer parallelism due to threading conflict between FastTokenizer and Datasets. Set TOKENIZERS_PARALLELISM=false to suppress this warning.
chaiml-pony-d3a-mv1-son-59529-v1-uploader: 2026-03-28T14:35:11.390170+0000 | reset | INFO - Compression lifecycle reset
chaiml-pony-d3a-mv1-son-59529-v1-uploader: 2026-03-28T14:35:11.392707+0000 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-pony-d3a-mv1-son-59529-v1-uploader: 2026-03-28T14:35:11.437987+0000 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-pony-d3a-mv1-son-59529-v1-uploader: 2026-03-28T14:35:11.438232+0000 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-pony-d3a-mv1-son-59529-v1-uploader: 2026-03-28T14:35:11.450877+0000 | dispatch_model | WARNING - Forced to offload modules due to insufficient gpu resources
chaiml-pony-d3a-mv1-son-59529-v1-uploader: 2026-03-28T14:35:18.999225+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-pony-d3a-mv1-son-59529-v1-uploader: 2026-03-28T14:35:18.999373+0000 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Saving to /dev/shm/model_output...
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py:3344: UserWarning: Attempting to save a model with offloaded modules. Ensure that unallocated cpu memory exceeds the `shard_size` (50GB default)
chaiml-pony-d3a-mv1-son-59529-v1-uploader: warnings.warn(
2026-03-28T14:36:01.134445+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
Failed to get response for submission chaiml-glm-47-bobo-v1-s_16089_v2: ('http://chaiml-glm-47-bobo-v1-s-16089-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Pushing to ChaiML/pony-d3a-mv1-sonnetwintop2-q27b-lr1e5ep1g4-FP8
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Checking if ChaiML/pony-d3a-mv1-sonnetwintop2-q27b-lr1e5ep1g4-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Creating repo ChaiML/pony-d3a-mv1-sonnetwintop2-q27b-lr1e5ep1g4-FP8 and uploading /dev/shm/model_output to it
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Found 1 files larger than 20GB (recommended limit):
chaiml-pony-d3a-mv1-son-59529-v1-uploader: - model.safetensors: 35.9GB
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Large files may slow down loading and processing.
chaiml-pony-d3a-mv1-son-59529-v1-uploader: ---------- 2026-03-28 14:36:06 (0:00:00) ----------
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Files: hashed 5/7 (34.1K/35.9G) | pre-uploaded: 0/0 (0.0/35.9G) (+7 unsure) | committed: 0/7 (0.0/35.9G) | ignored: 0
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Workers: hashing: 2 | get upload mode: 5 | pre-uploading: 0 | committing: 0 | waiting: 57
chaiml-pony-d3a-mv1-son-59529-v1-uploader: ---------------------------------------------------
2026-03-28T14:37:01.223691+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
chaiml-pony-d3a-mv1-son-59529-v1-uploader: ---------- 2026-03-28 14:37:06 (0:01:00) ----------
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Files: hashed 7/7 (35.9G/35.9G) | pre-uploaded: 1/2 (20.0M/35.9G) | committed: 0/7 (0.0/35.9G) | ignored: 0
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 1 | committing: 0 | waiting: 63
chaiml-pony-d3a-mv1-son-59529-v1-uploader: ---------------------------------------------------
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Processed model ChaiML/pony-d3a-mv1-sonnetwintop2-q27b-lr1e5ep1g4 in 186.677s
chaiml-pony-d3a-mv1-son-59529-v1-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-59529-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-son-59529-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-59529-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-59529-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-59529-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-59529-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-son-59529-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-son-59529-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-son-59529-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-son-59529-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-son-59529-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-59529-v1/default
chaiml-pony-d3a-mv1-son-59529-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-59529-v1/default/tokenizer_config.json
chaiml-pony-d3a-mv1-son-59529-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-59529-v1/default/chat_template.jinja
chaiml-pony-d3a-mv1-son-59529-v1-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-59529-v1/default/recipe.yaml
chaiml-pony-d3a-mv1-son-59529-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-59529-v1/default/config.json
chaiml-pony-d3a-mv1-son-59529-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-59529-v1/default/generation_config.json
chaiml-pony-d3a-mv1-son-59529-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-59529-v1/default/tokenizer.json
2026-03-28T14:38:01.317698+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
chaiml-pony-d3a-mv1-son-59529-v1-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-59529-v1/default/model.safetensors
Job chaiml-pony-d3a-mv1-son-59529-v1-uploader completed after 297.11s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-son-59529-v1-uploader
Pipeline stage VLLMUploader completed in 297.61s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.09s
run pipeline stage %s
Running pipeline stage VLLMTemplater
2026-03-28T14:39:01.427243+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
Pipeline stage VLLMTemplater completed in 2.80s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-son-59529-v1
Waiting for inference service chaiml-pony-d3a-mv1-son-59529-v1 to be ready
2026-03-28T14:40:01.521522+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
2026-03-28T14:41:01.611578+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
2026-03-28T14:42:01.703051+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
Inference service chaiml-pony-d3a-mv1-son-59529-v1 ready after 180.18793964385986s
Pipeline stage VLLMDeployer completed in 180.61s
run pipeline stage %s
Running pipeline stage StressChecker
Failed to get response for submission chaiml-gspo-glm47-combi_10268_v1: ('http://chaiml-gspo-glm47-combi-10268-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:43:01.790512+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:44:01.893528+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.199792385101318s
2026-03-28T14:45:01.980243+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.282181024551392s
Received healthy response to inference request in 2.1986570358276367s
Received healthy response to inference request in 2.002103090286255s
Received healthy response to inference request in 4.133390426635742s
Received healthy response to inference request in 2.79046368598938s
Received healthy response to inference request in 1.9097938537597656s
Received healthy response to inference request in 1.9058313369750977s
Received healthy response to inference request in 8.84338927268982s
Received healthy response to inference request in 2.3131418228149414s
Received healthy response to inference request in 2.1787025928497314s
Received healthy response to inference request in 1.9326179027557373s
Received healthy response to inference request in 1.9727897644042969s
Received healthy response to inference request in 2.301077127456665s
Received healthy response to inference request in 2.0406136512756348s
Received healthy response to inference request in 1.9930715560913086s
Received healthy response to inference request in 1.9546613693237305s
Received healthy response to inference request in 2.1777303218841553s
Received healthy response to inference request in 1.9813389778137207s
Received healthy response to inference request in 2.164440155029297s
2026-03-28T14:46:02.079051+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
Received healthy response to inference request in 2.1859359741210938s
30 requests
9 failed requests
5th percentile: 1.920064675807953
10th percentile: 1.952457022666931
20th percentile: 1.990725040435791
30th percentile: 2.127292203903198
40th percentile: 2.183042621612549
50th percentile: 2.3071094751358032
60th percentile: 4.159951210021973
70th percentile: 12.224094438552825
80th percentile: 20.125858497619628
90th percentile: 20.130479335784912
95th percentile: 20.137688481807707
99th percentile: 20.156792612075805
mean time: 7.954816460609436
%s, retrying in %s seconds...
Received healthy response to inference request in 2.0013620853424072s
Received healthy response to inference request in 1.792414903640747s
Received healthy response to inference request in 1.845430612564087s
Received healthy response to inference request in 2.468658447265625s
Received healthy response to inference request in 1.796844482421875s
Received healthy response to inference request in 2.003641366958618s
Received healthy response to inference request in 1.7626841068267822s
Received healthy response to inference request in 1.936957836151123s
Received healthy response to inference request in 1.9131689071655273s
Received healthy response to inference request in 1.8729581832885742s
Received healthy response to inference request in 2.0659587383270264s
Received healthy response to inference request in 1.852116584777832s
Received healthy response to inference request in 1.9131202697753906s
Received healthy response to inference request in 2.0122549533843994s
Received healthy response to inference request in 2.0307815074920654s
Received healthy response to inference request in 1.9149551391601562s
Received healthy response to inference request in 1.9768757820129395s
Received healthy response to inference request in 2.003652334213257s
Received healthy response to inference request in 2.0129053592681885s
Received healthy response to inference request in 1.9823215007781982s
Received healthy response to inference request in 2.707267999649048s
Received healthy response to inference request in 2.1573522090911865s
Received healthy response to inference request in 2.0785093307495117s
Received healthy response to inference request in 2.6153087615966797s
Received healthy response to inference request in 2.454115867614746s
Received healthy response to inference request in 2.3559207916259766s
Received healthy response to inference request in 2.3088560104370117s
2026-03-28T14:47:02.179782+00:00 monitor updated for chaiml-pony-d3a-mv1-son_59529_v1
Received healthy response to inference request in 2.181800603866577s
Received healthy response to inference request in 1.9994397163391113s
Received healthy response to inference request in 2.18485164642334s
30 requests
0 failed requests
5th percentile: 1.7944082140922546
10th percentile: 1.8405719995498657
20th percentile: 1.9050878524780275
30th percentile: 1.930357027053833
40th percentile: 1.9925924301147462
50th percentile: 2.0036468505859375
60th percentile: 2.020055818557739
70th percentile: 2.102162194252014
80th percentile: 2.2096525192260748
90th percentile: 2.455570125579834
95th percentile: 2.549316120147705
99th percentile: 2.6805998206138613
mean time: 2.0734162012736004
Pipeline stage StressChecker completed in 306.29s
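Note on the StressChecker summary above: the log does not say how the percentiles are computed. The sketch below assumes linear interpolation between the two closest ranks (the default behaviour of NumPy's percentile function) and applies it to the 30 latencies from the second, fully healthy run. The function name percentile is illustrative. Under that assumption, the 5th, 50th and 99th percentiles and the mean agree with the logged values to within floating-point rounding.

```python
import math

# Latencies (seconds) of the 30 healthy responses in the second stress run,
# copied from the log lines above.
latencies = [
    2.0013620853424072, 1.792414903640747, 1.845430612564087,
    2.468658447265625, 1.796844482421875, 2.003641366958618,
    1.7626841068267822, 1.936957836151123, 1.9131689071655273,
    1.8729581832885742, 2.0659587383270264, 1.852116584777832,
    1.9131202697753906, 2.0122549533843994, 2.0307815074920654,
    1.9149551391601562, 1.9768757820129395, 2.003652334213257,
    2.0129053592681885, 1.9823215007781982, 2.707267999649048,
    2.1573522090911865, 2.0785093307495117, 2.6153087615966797,
    2.454115867614746, 2.3559207916259766, 2.3088560104370117,
    2.181800603866577, 1.9994397163391113, 2.18485164642334,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumed method: fractional rank (n - 1) * q / 100 over the sorted values,
    interpolating between the neighbouring samples.
    """
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


if __name__ == "__main__":
    for q in (5, 50, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {sum(latencies) / len(latencies)}")
```

In the first run the 9 failed requests appear to be counted at roughly the 20-second read timeout, which would explain why the 80th percentile and above sit near 20.1 s and the mean rises to about 7.95 s.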
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.91s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-son_59529_v1 status is now deployed due to DeploymentManager action
chaiml-pony-d3a-mv1-son_59529_v1 status is now inactive due to admin request
chaiml-pony-d3a-mv1-son_59529_v1 status is now torndown due to DeploymentManager action