developer_uid: zonemercy
submission_id: chaiml-pony-d3a-mv1-son_75599_v4
model_name: chaiml-pony-d3a-mv1-son_75599_v4
model_group: ChaiML/pony-d3a-mv1-sonn
status: inactive
timestamp: 2026-03-28T06:53:16+00:00
num_battles: 403
num_wins: 192
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/pony-d3a-mv1-sonnetwintop2-q35b-lr5e6ep2g8
model_architecture: Qwen3_5MoeForConditionalGeneration
model_num_parameters: 33753909248.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-pony-d3a-mv1-son_75599_v4
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/pony-d3a-mv1-sonnetwintop2-q35b-lr5e6ep2g8
model_size: 34B
ranking_group: single
us_pacific_date: 2026-03-27
win_ratio: 0.47642679900744417
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.8, 'frequency_penalty': 0.0, 'stopping_words': ['</s>', '<|assistant|>', '<|user|>', '<|im_end|>', '####'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
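The `formatter` field above defines ChatML-style templates. As a rough illustration of how such templates could be combined into a single prompt, here is a minimal sketch. The assembly order (memory, then chat turns, then the response cue), the `build_prompt` helper, and the example bot name and messages are all assumptions for illustration and are not taken from the pipeline itself. Truncation (`truncate_by_message`, `max_input_tokens`) is not modeled.

```python
# Template strings copied from the `formatter` field above.
FORMATTER = {
    "memory_template": "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n",
    "prompt_template": "",
    "bot_template": "<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n",
    "user_template": "<|im_start|>user\n{message}<|im_end|>\n",
    "response_template": "<|im_start|>assistant\n{bot_name}:",
}


def build_prompt(bot_name, memory, turns):
    """Hypothetical helper: join memory, chat turns, and the response cue.

    `turns` is a list of (speaker, message) pairs, with speaker "user" or "bot".
    The ordering used here is an assumption, not the documented behavior.
    """
    parts = [FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory)]
    for speaker, message in turns:
        key = "bot_template" if speaker == "bot" else "user_template"
        # str.format ignores unused keyword arguments, so bot_name is safe here.
        parts.append(FORMATTER[key].format(bot_name=bot_name, message=message))
    # The response cue is left open so the model continues as the bot.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Example values below are made up for illustration.
prompt = build_prompt(
    "Pony",
    "A cheerful guide.",
    [("user", "Hi!"), ("bot", "Hello there."), ("user", "How are you?")],
)
print(prompt)
```

Under these assumptions the prompt ends with the open `<|im_start|>assistant\n{bot_name}:` cue, and generation would stop at one of the configured `stopping_words` such as `<|im_end|>`.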
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-son-75599-v4-uploader
Waiting for job on chaiml-pony-d3a-mv1-son-75599-v4-uploader to finish
Failed to get request counts for guanaco-submitter. Falling back to default
chaiml-pony-d3a-mv1-son-75599-v4-uploader: Using quantization_mode: fp8
chaiml-pony-d3a-mv1-son-75599-v4-uploader: Checking if ChaiML/pony-d3a-mv1-sonnetwintop2-q35b-lr5e6ep2g8-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-son-75599-v4-uploader: Model already exists. Downloading to /dev/shm/model_output...
chaiml-pony-d3a-mv1-son-75599-v4-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-sonnetwintop2-q35b-lr5e6ep2g8-FP8...
2026-03-28T06:30:14.936485+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
chaiml-pony-d3a-mv1-son-75599-v4-uploader: Downloaded in 26.743s
chaiml-pony-d3a-mv1-son-75599-v4-uploader: Processed model ChaiML/pony-d3a-mv1-sonnetwintop2-q35b-lr5e6ep2g8 in 29.225s
chaiml-pony-d3a-mv1-son-75599-v4-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-son-75599-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v4-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-son-75599-v4-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-son-75599-v4-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-son-75599-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v4-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-son-75599-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v4-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-son-75599-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v4-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-son-75599-v4-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-son-75599-v4-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-son-75599-v4-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-son-75599-v4-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-son-75599-v4-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-son-75599-v4-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-son-75599-v4-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-son-75599-v4-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default
chaiml-pony-d3a-mv1-son-75599-v4-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default/chat_template.jinja
chaiml-pony-d3a-mv1-son-75599-v4-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default/config.json
chaiml-pony-d3a-mv1-son-75599-v4-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default/tokenizer_config.json
chaiml-pony-d3a-mv1-son-75599-v4-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default/generation_config.json
chaiml-pony-d3a-mv1-son-75599-v4-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default/recipe.yaml
chaiml-pony-d3a-mv1-son-75599-v4-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default/.gitattributes
chaiml-pony-d3a-mv1-son-75599-v4-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default/tokenizer.json
2026-03-28T06:31:15.046286+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
chaiml-pony-d3a-mv1-son-75599-v4-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-son-75599-v4/default/model.safetensors
Job chaiml-pony-d3a-mv1-son-75599-v4-uploader completed after 142.72s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-son-75599-v4-uploader
Pipeline stage VLLMUploader completed in 143.13s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.08s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.27s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-son-75599-v4
Waiting for inference service chaiml-pony-d3a-mv1-son-75599-v4 to be ready
2026-03-28T06:32:15.140932+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
2026-03-28T06:33:15.233406+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
2026-03-28T06:34:15.374190+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
Inference service chaiml-pony-d3a-mv1-son-75599-v4 ready after 210.95023250579834s
Pipeline stage VLLMDeployer completed in 211.36s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-28T06:35:15.460743+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T06:36:15.547289+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.3402674198150635s
Received healthy response to inference request in 8.374897003173828s
Received healthy response to inference request in 19.690934419631958s
Received healthy response to inference request in 3.9743776321411133s
2026-03-28T06:37:15.638500+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.368853807449341s
Received healthy response to inference request in 1.646470069885254s
Received healthy response to inference request in 2.1415350437164307s
Received healthy response to inference request in 1.7149996757507324s
Received healthy response to inference request in 1.657517910003662s
Received healthy response to inference request in 1.6575770378112793s
Received healthy response to inference request in 2.1528561115264893s
Received healthy response to inference request in 18.45615267753601s
Received healthy response to inference request in 1.7551603317260742s
Received healthy response to inference request in 1.8809230327606201s
Received healthy response to inference request in 1.6717300415039062s
Received healthy response to inference request in 1.926042079925537s
Received healthy response to inference request in 2.28259539604187s
Received healthy response to inference request in 3.1007895469665527s
2026-03-28T06:38:15.735298+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
Received healthy response to inference request in 1.8919386863708496s
Received healthy response to inference request in 1.9707310199737549s
Received healthy response to inference request in 1.785879135131836s
Received healthy response to inference request in 1.6685049533843994s
Received healthy response to inference request in 2.2108981609344482s
Received healthy response to inference request in 1.6601219177246094s
Received healthy response to inference request in 1.649057149887085s
30 requests
5 failed requests
5th percentile: 1.6528644919395448
10th percentile: 1.6575711250305176
20th percentile: 1.6710850238800048
30th percentile: 1.7766634941101074
40th percentile: 1.912400722503662
50th percentile: 2.14719557762146
60th percentile: 2.3170987606048583
70th percentile: 3.530500483512877
80th percentile: 18.703109025955204
90th percentile: 20.159186244010925
95th percentile: 20.23790571689606
99th percentile: 20.555332591533663
mean time: 6.469168225924174
%s, retrying in %s seconds...
Received healthy response to inference request in 2.018404722213745s
Received healthy response to inference request in 1.6717092990875244s
Received healthy response to inference request in 2.3319151401519775s
Received healthy response to inference request in 1.6388602256774902s
Received healthy response to inference request in 1.6641321182250977s
Received healthy response to inference request in 1.7552697658538818s
Received healthy response to inference request in 2.0358333587646484s
Received healthy response to inference request in 1.8697235584259033s
Received healthy response to inference request in 1.6490800380706787s
Received healthy response to inference request in 1.6361057758331299s
Received healthy response to inference request in 1.6260709762573242s
Received healthy response to inference request in 1.6419038772583008s
Received healthy response to inference request in 1.6757638454437256s
Received healthy response to inference request in 1.7452895641326904s
Received healthy response to inference request in 1.721341609954834s
Received healthy response to inference request in 1.7164888381958008s
Received healthy response to inference request in 1.8324997425079346s
Received healthy response to inference request in 1.7319118976593018s
Received healthy response to inference request in 1.784212350845337s
Received healthy response to inference request in 2.361624240875244s
Received healthy response to inference request in 1.6560678482055664s
Received healthy response to inference request in 1.6511516571044922s
Received healthy response to inference request in 2.313598394393921s
Received healthy response to inference request in 1.8617498874664307s
Received healthy response to inference request in 1.6731529235839844s
2026-03-28T06:39:15.898979+00:00 monitor updated for chaiml-pony-d3a-mv1-son_75599_v4
Received healthy response to inference request in 1.6856963634490967s
Received healthy response to inference request in 1.8242437839508057s
Received healthy response to inference request in 1.7694344520568848s
Received healthy response to inference request in 1.7402613162994385s
Failed to get response for submission chaiml-pony-d3b-mv1-top2_9386_v5: ('http://chaiml-pony-d3b-mv1-top2-9386-v5-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com/v1/completions', 'request timeout')
Received healthy response to inference request in 1.9371006488800049s
30 requests
0 failed requests
5th percentile: 1.637345278263092
10th percentile: 1.6415995121002198
20th percentile: 1.6550846099853516
30th percentile: 1.6727198362350464
40th percentile: 1.704171848297119
50th percentile: 1.7360866069793701
60th percentile: 1.760935640335083
70th percentile: 1.8267205715179442
80th percentile: 1.8831989765167239
90th percentile: 2.063609862327576
95th percentile: 2.323672604560852
99th percentile: 2.353008601665497
mean time: 1.8073532740275065
Pipeline stage StressChecker completed in 253.70s
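The percentile and mean figures in the second StressChecker summary above (30 requests, 0 failed) can be recomputed from the 30 logged response times. The sketch below uses linear interpolation between order statistics, which is the default behavior of NumPy's percentile function. That StressChecker computes its summary this way is an assumption, although the method does appear to match the logged values. The `percentile` helper is illustrative and not part of the pipeline.

```python
import math

# Response times (seconds) from the second StressChecker run, in log order.
latencies = [
    2.018404722213745, 1.6717092990875244, 2.3319151401519775,
    1.6388602256774902, 1.6641321182250977, 1.7552697658538818,
    2.0358333587646484, 1.8697235584259033, 1.6490800380706787,
    1.6361057758331299, 1.6260709762573242, 1.6419038772583008,
    1.6757638454437256, 1.7452895641326904, 1.721341609954834,
    1.7164888381958008, 1.8324997425079346, 1.7319118976593018,
    1.784212350845337, 2.361624240875244, 1.6560678482055664,
    1.6511516571044922, 2.313598394393921, 1.8617498874664307,
    1.6731529235839844, 1.6856963634490967, 1.8242437839508057,
    1.7694344520568848, 1.7402613162994385, 1.9371006488800049,
]


def percentile(values, q):
    """Return the q-th percentile using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 95):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

The first summary (30 requests, 5 failed) includes the timed-out requests at roughly the 20-second read timeout, which is why its upper percentiles and mean are much higher than in this run.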
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.67s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-son_75599_v4 status is now deployed due to DeploymentManager action
chaiml-pony-d3a-mv1-son_75599_v4 status is now inactive due to admin request