developer_uid: zonemercy
submission_id: chaiml-pony-d3a-mv1-plc-_5598_v2
model_name: chaiml-pony-d3a-mv1-plc-_5598_v2
model_group: ChaiML/pony-d3a-mv1-plc-
status: torndown
timestamp: 2026-03-31T17:51:08+00:00
num_battles: 10784
num_wins: 5745
celo_rating: 1318.07
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4
model_architecture: Qwen3_5ForConditionalGeneration
model_num_parameters: 23564784640.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-pony-d3a-mv1-plc-_5598_v2
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4
model_size: 24B
ranking_group: single
us_pacific_date: 2026-03-28
win_ratio: 0.5327336795252225
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.8, 'frequency_penalty': 0.0, 'stopping_words': ['<|user|>', '<|im_end|>', '<|assistant|>', '####', '</s>'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
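The derived fields in the metadata above can be cross-checked against the raw counts. The sketch below recomputes win_ratio from num_wins and num_battles. It also assumes, as an illustration only, that model_size is model_num_parameters rounded to the nearest billion; the log does not state how that label is produced.

```python
# Values copied from the metadata fields above.
num_battles = 10784
num_wins = 5745
model_num_parameters = 23564784640.0

# win_ratio is the fraction of battles won.
win_ratio = num_wins / num_battles

# Assumption: the size label is the parameter count rounded to whole billions.
size_label = f"{round(model_num_parameters / 1e9)}B"

print(f"win_ratio={win_ratio:.6f} size_label={size_label}")
```

Under that assumption both values agree with the reported win_ratio and model_size fields.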
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-d3a-mv1-plc-5598-v2-uploader
Waiting for job on chaiml-pony-d3a-mv1-plc-5598-v2-uploader to finish
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Using quantization_mode: fp8
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Checking if ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Downloading snapshot of ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4...
2026-03-28T14:35:33.753100+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Downloaded in 21.389s
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Loading /tmp/model_input...
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: The fast path is not available because one of the required library is not installed. Falling back to torch implementation. To install follow https://github.com/fla-org/flash-linear-attention#installation and https://github.com/Dao-AILab/causal-conv1d
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Applying quantization...
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: 2026-03-28T14:35:42.241107+0000 | __init__ | WARNING - Disabling tokenizer parallelism due to threading conflict between FastTokenizer and Datasets. Set TOKENIZERS_PARALLELISM=false to suppress this warning.
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: 2026-03-28T14:35:44.944136+0000 | reset | INFO - Compression lifecycle reset
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: 2026-03-28T14:35:44.947057+0000 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: 2026-03-28T14:35:44.993781+0000 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: 2026-03-28T14:35:44.994024+0000 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: 2026-03-28T14:35:45.009543+0000 | dispatch_model | WARNING - Forced to offload modules due to insufficient gpu resources
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: 2026-03-28T14:35:52.515955+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: 2026-03-28T14:35:52.516130+0000 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Saving to /dev/shm/model_output...
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py:3344: UserWarning: Attempting to save a model with offloaded modules. Ensure that unallocated cpu memory exceeds the `shard_size` (50GB default)
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: warnings.warn(
Failed to get response for submission chaiml-gspo-glm47-cas72_44260_v1: ('http://chaiml-gspo-glm47-cas72-44260-v1-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-28T14:36:33.854637+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Pushing to ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4-FP8
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Checking if ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4-FP8 already exists in ChaiML
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Processed model ChaiML/pony-d3a-mv1-plc-q27b-lr5e6ep1g4 in 82.129s
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: creating bucket guanaco-vllm-models
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v2/default
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v2/default/tokenizer_config.json
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v2/default/generation_config.json
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v2/default/config.json
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v2/default/recipe.yaml
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v2/default/chat_template.jinja
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v2/default/tokenizer.json
2026-03-28T14:37:34.119002+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
chaiml-pony-d3a-mv1-plc-5598-v2-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-d3a-mv1-plc-5598-v2/default/model.safetensors
Job chaiml-pony-d3a-mv1-plc-5598-v2-uploader completed after 195.32s with status: succeeded
Stopping job with name chaiml-pony-d3a-mv1-plc-5598-v2-uploader
Pipeline stage VLLMUploader completed in 195.75s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.10s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.63s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-d3a-mv1-plc-5598-v2
Waiting for inference service chaiml-pony-d3a-mv1-plc-5598-v2 to be ready
Failed to get response for submission chaiml-glm-47-bobo-v1-s_16089_v2: ('http://chaiml-glm-47-bobo-v1-s-16089-v2-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/completions', 'activator request timeout')
2026-03-28T14:38:34.210722+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
2026-03-28T14:39:34.302790+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
2026-03-28T14:40:34.422949+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
Inference service chaiml-pony-d3a-mv1-plc-5598-v2 ready after 180.40729236602783s
Pipeline stage VLLMDeployer completed in 181.38s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:41:34.557762+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:42:34.938726+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.493112802505493s
Received healthy response to inference request in 4.449168682098389s
Received healthy response to inference request in 1.8706510066986084s
Received healthy response to inference request in 1.9499855041503906s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-28T14:43:35.092860+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.0971789360046387s
Received healthy response to inference request in 1.8704500198364258s
Received healthy response to inference request in 4.317626237869263s
Received healthy response to inference request in 1.944586992263794s
Received healthy response to inference request in 1.9420979022979736s
Received healthy response to inference request in 2.217494010925293s
2026-03-28T14:44:36.109673+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
Received healthy response to inference request in 17.29521918296814s
Received healthy response to inference request in 1.946004867553711s
Received healthy response to inference request in 1.9803216457366943s
Received healthy response to inference request in 1.9871251583099365s
Received healthy response to inference request in 2.190309762954712s
Received healthy response to inference request in 2.114530324935913s
Received healthy response to inference request in 2.0702388286590576s
Received healthy response to inference request in 2.0372188091278076s
Received healthy response to inference request in 2.3815925121307373s
Received healthy response to inference request in 2.0008857250213623s
Received healthy response to inference request in 2.110753059387207s
30 requests
9 failed requests
5th percentile: 1.9028021097183228
10th percentile: 1.944338083267212
20th percentile: 1.9742544174194336
30th percentile: 2.026318883895874
40th percentile: 2.1053234100341798
50th percentile: 2.2039018869400024
60th percentile: 4.3702432155609126
70th percentile: 18.141651606559744
80th percentile: 20.12617931365967
90th percentile: 20.13749716281891
95th percentile: 20.14131667613983
99th percentile: 20.147916724681853
mean time: 8.215454260508219
%s, retrying in %s seconds...
Received healthy response to inference request in 1.970489263534546s
Received healthy response to inference request in 2.110823392868042s
Received healthy response to inference request in 1.9781932830810547s
Received healthy response to inference request in 1.7789721488952637s
Received healthy response to inference request in 1.8918817043304443s
Received healthy response to inference request in 1.8344833850860596s
Received healthy response to inference request in 1.8374218940734863s
Received healthy response to inference request in 1.8931775093078613s
Received healthy response to inference request in 2.022132158279419s
Received healthy response to inference request in 2.1446917057037354s
Received healthy response to inference request in 1.931652307510376s
Received healthy response to inference request in 1.9086577892303467s
Received healthy response to inference request in 2.192476272583008s
Received healthy response to inference request in 1.8946409225463867s
Received healthy response to inference request in 1.9921092987060547s
Received healthy response to inference request in 1.9110875129699707s
2026-03-28T14:45:36.280785+00:00 monitor updated for chaiml-pony-d3a-mv1-plc-_5598_v2
Received healthy response to inference request in 1.967822551727295s
Received healthy response to inference request in 2.449918270111084s
Received healthy response to inference request in 1.963348150253296s
Received healthy response to inference request in 2.074037790298462s
Received healthy response to inference request in 2.599529504776001s
Received healthy response to inference request in 1.9879205226898193s
Received healthy response to inference request in 2.0561177730560303s
Received healthy response to inference request in 1.9675605297088623s
Received healthy response to inference request in 2.7911953926086426s
Received healthy response to inference request in 1.9991850852966309s
Received healthy response to inference request in 2.0047316551208496s
Received healthy response to inference request in 2.0043587684631348s
Received healthy response to inference request in 2.190765619277954s
Received healthy response to inference request in 2.0602707862854004s
30 requests
0 failed requests
5th percentile: 1.8358057141304016
10th percentile: 1.8864357233047486
20th percentile: 1.9058544158935546
30th percentile: 1.95383939743042
40th percentile: 1.9694225788116455
50th percentile: 1.990014910697937
60th percentile: 2.0045079231262206
70th percentile: 2.057363677024841
80th percentile: 2.117597055435181
90th percentile: 2.218220472335816
95th percentile: 2.532204449176788
99th percentile: 2.7356122851371767
mean time: 2.0469884316126508
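The summary statistics of the second StressChecker run can be reproduced from the 30 latencies logged above. The sketch below assumes the checker uses linear interpolation between sorted samples (the default behavior of numpy.percentile); the `percentile` helper is illustrative and is not taken from the pipeline code.

```python
# Latencies (seconds) of the 30 healthy responses in the second run, in log order.
latencies = [
    1.970489263534546, 2.110823392868042, 1.9781932830810547,
    1.7789721488952637, 1.8918817043304443, 1.8344833850860596,
    1.8374218940734863, 1.8931775093078613, 2.022132158279419,
    2.1446917057037354, 1.931652307510376, 1.9086577892303467,
    2.192476272583008, 1.8946409225463867, 1.9921092987060547,
    1.9110875129699707, 1.967822551727295, 2.449918270111084,
    1.963348150253296, 2.074037790298462, 2.599529504776001,
    1.9879205226898193, 2.0561177730560303, 1.9675605297088623,
    2.7911953926086426, 1.9991850852966309, 2.0047316551208496,
    2.0043587684631348, 2.190765619277954, 2.0602707862854004,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

In the first run, the nine failed requests appear to be counted at roughly the 20 s read timeout, which would explain why its upper percentiles sit near 20.1 s.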
Pipeline stage StressChecker completed in 317.06s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.43s
Shutdown handler de-registered
chaiml-pony-d3a-mv1-plc-_5598_v2 status is now deployed due to DeploymentManager action
chaiml-pony-d3a-mv1-plc-_5598_v2 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-d3a-mv1-plc-_5598_v2 status is now torndown due to DeploymentManager action