developer_uid: zonemercy
submission_id: chaiml-pony-v3-q27b-lr5_64169_v1
model_name: chaiml-pony-v3-q27b-lr5_64169_v1
model_group: ChaiML/pony-v3-q27b-lr5e
status: torndown
timestamp: 2026-04-02T10:21:15+00:00
num_battles: 11669
num_wins: 6026
celo_rating: 1301.04
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle
model_architecture: Qwen3_5ForConditionalGeneration
model_num_parameters: 23564784640.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-pony-v3-q27b-lr5_64169_v1
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle
model_size: 24B
ranking_group: single
us_pacific_date: 2026-03-30
win_ratio: 0.516411003513583
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['####', '<|assistant|>', '<|im_end|>', '<|user|>', '</s>'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': "<|im_start|>system\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-pony-v3-q27b-lr5-64169-v1-uploader
Waiting for job on chaiml-pony-v3-q27b-lr5-64169-v1-uploader to finish
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Using quantization_mode: fp8
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Checking if ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle-FP8 already exists in ChaiML
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Downloading snapshot of ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle...
2026-03-30T07:34:16.176581+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Downloaded in 50.842s
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Loading /tmp/model_input...
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: The fast path is not available because one of the required library is not installed. Falling back to torch implementation. To install follow https://github.com/fla-org/flash-linear-attention#installation and https://github.com/Dao-AILab/causal-conv1d
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Applying quantization...
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:49.875652+0000 | __init__ | WARNING - Disabling tokenizer parallelism due to threading conflict between FastTokenizer and Datasets. Set TOKENIZERS_PARALLELISM=false to suppress this warning.
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:51.897369+0000 | reset | INFO - Compression lifecycle reset
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:51.901214+0000 | norm_calibration_context | INFO - Found 161 offset-norm modules to convert
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:51.910137+0000 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:51.957731+0000 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:51.957966+0000 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:51.970257+0000 | dispatch_model | WARNING - Forced to offload modules due to insufficient gpu resources
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:58.569076+0000 | norm_calibration_context | INFO - Restoring 161 norm modules to offset convention
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:58.575788+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: 2026-03-30T07:34:58.575880+0000 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Saving to /dev/shm/model_output...
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py:3344: UserWarning: Attempting to save a model with offloaded modules. Ensure that unallocated cpu memory exceeds the `shard_size` (50GB default)
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: warnings.warn(
2026-03-30T07:35:16.284009+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Updating config in /dev/shm/model_output
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Pushing to ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle-FP8
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Checking if ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle-FP8 already exists in ChaiML
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Creating repo ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle-FP8 and uploading /dev/shm/model_output to it
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Found 1 files larger than 20GB (recommended limit):
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: - model.safetensors: 35.9GB
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Large files may slow down loading and processing.
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: ---------- 2026-03-30 07:35:48 (0:00:00) ----------
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Files: hashed 5/7 (34.3K/35.9G) | pre-uploaded: 0/0 (0.0/35.9G) (+7 unsure) | committed: 0/7 (0.0/35.9G) | ignored: 0
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Workers: hashing: 2 | get upload mode: 5 | pre-uploading: 0 | committing: 0 | waiting: 57
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: ---------------------------------------------------
2026-03-30T07:36:16.400126+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: ---------- 2026-03-30 07:36:48 (0:01:00) ----------
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Files: hashed 7/7 (35.9G/35.9G) | pre-uploaded: 1/2 (20.0M/35.9G) | committed: 0/7 (0.0/35.9G) | ignored: 0
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 1 | committing: 0 | waiting: 63
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: ---------------------------------------------------
2026-03-30T07:37:16.515255+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Processed model ChaiML/pony-v3-q27b-lr5e6ep1g8-shuffle in 214.057s
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: creating bucket guanaco-vllm-models
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v1/default
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v1/default/chat_template.jinja
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v1/default/generation_config.json
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v1/default/recipe.yaml
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v1/default/tokenizer_config.json
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v1/default/config.json
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v1/default/tokenizer.json
2026-03-30T07:38:16.644037+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
chaiml-pony-v3-q27b-lr5-64169-v1-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-pony-v3-q27b-lr5-64169-v1/default/model.safetensors
Job chaiml-pony-v3-q27b-lr5-64169-v1-uploader completed after 327.59s with status: succeeded
Stopping job with name chaiml-pony-v3-q27b-lr5-64169-v1-uploader
Pipeline stage VLLMUploader completed in 328.24s
run pipeline stage %s
Running pipeline stage VLLMUploaderAMD
Pipeline stage vllm_upload_amd skipped, reason=not amd cluster
Pipeline stage VLLMUploaderAMD completed in 0.15s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 4.81s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-pony-v3-q27b-lr5-64169-v1
Waiting for inference service chaiml-pony-v3-q27b-lr5-64169-v1 to be ready
2026-03-30T07:39:16.745568+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
2026-03-30T07:40:16.967545+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
2026-03-30T07:41:17.083874+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
Inference service chaiml-pony-v3-q27b-lr5-64169-v1 ready after 181.6122043132782s
Pipeline stage VLLMDeployer completed in 182.11s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-30T07:42:17.178801+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-30T07:43:17.275160+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.0873184204101562s
2026-03-30T07:44:17.463650+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
Received healthy response to inference request in 4.126624584197998s
Received healthy response to inference request in 2.0017330646514893s
Received healthy response to inference request in 2.099916458129883s
Received healthy response to inference request in 1.8681252002716064s
Received healthy response to inference request in 1.9609003067016602s
Received healthy response to inference request in 4.244542837142944s
Received healthy response to inference request in 15.270749807357788s
Received healthy response to inference request in 1.9105925559997559s
Received healthy response to inference request in 1.9328808784484863s
Received healthy response to inference request in 1.9057819843292236s
Received healthy response to inference request in 2.0249061584472656s
Received healthy response to inference request in 1.8783581256866455s
Received healthy response to inference request in 1.894721508026123s
2026-03-30T07:45:17.569376+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.150639295578003s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.023019552230835s
Received healthy response to inference request in 7.6479527950286865s
Received healthy response to inference request in 2.2748196125030518s
2026-03-30T07:46:17.700476+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
Received healthy response to inference request in 2.235903263092041s
Received healthy response to inference request in 2.0388669967651367s
30 requests
10 failed requests
5th percentile: 1.8857216477394103
10th percentile: 1.9046759366989137
20th percentile: 1.9552964210510253
30th percentile: 2.0243401765823363
40th percentile: 2.130350160598755
50th percentile: 2.681069016456604
60th percentile: 5.605906820297236
70th percentile: 20.122228503227234
80th percentile: 20.15168151855469
90th percentile: 20.212839794158935
95th percentile: 20.230070316791533
99th percentile: 20.36440356731415
mean time: 8.885377907752991
%s, retrying in %s seconds...
Received healthy response to inference request in 1.8795418739318848s
Received healthy response to inference request in 1.8739168643951416s
Received healthy response to inference request in 2.309835195541382s
Received healthy response to inference request in 1.8203136920928955s
Received healthy response to inference request in 1.8626797199249268s
Received healthy response to inference request in 1.8516430854797363s
Received healthy response to inference request in 2.1685800552368164s
Received healthy response to inference request in 1.8268747329711914s
Received healthy response to inference request in 2.0657358169555664s
Received healthy response to inference request in 1.8311183452606201s
Received healthy response to inference request in 1.8665897846221924s
Received healthy response to inference request in 1.81730318069458s
Received healthy response to inference request in 1.9812829494476318s
Received healthy response to inference request in 1.9889862537384033s
Received healthy response to inference request in 1.908432960510254s
Received healthy response to inference request in 1.9793484210968018s
Received healthy response to inference request in 2.333632469177246s
Received healthy response to inference request in 1.9492177963256836s
Received healthy response to inference request in 1.9746525287628174s
Received healthy response to inference request in 2.4134387969970703s
Received healthy response to inference request in 1.993591547012329s
Received healthy response to inference request in 2.319836139678955s
Received healthy response to inference request in 2.016343116760254s
Received healthy response to inference request in 2.81020188331604s
Received healthy response to inference request in 2.232095241546631s
2026-03-30T07:47:17.810353+00:00 monitor updated for chaiml-pony-v3-q27b-lr5_64169_v1
Received healthy response to inference request in 1.9786272048950195s
Received healthy response to inference request in 2.0439529418945312s
Received healthy response to inference request in 2.095338821411133s
Received healthy response to inference request in 2.382643461227417s
Received healthy response to inference request in 2.5287880897521973s
30 requests
0 failed requests
5th percentile: 1.8232661604881286
10th percentile: 1.8306939840316772
20th percentile: 1.8658077716827393
30th percentile: 1.899765634536743
40th percentile: 1.9770373344421386
50th percentile: 1.9851346015930176
60th percentile: 2.0273870468139648
70th percentile: 2.1173111915588376
80th percentile: 2.3118353843688966
90th percentile: 2.3857229948043823
95th percentile: 2.4768809080123897
99th percentile: 2.728591883182526
mean time: 2.070151432355245
Pipeline stage StressChecker completed in 336.88s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.89s
Shutdown handler de-registered
chaiml-pony-v3-q27b-lr5_64169_v1 status is now deployed due to DeploymentManager action
chaiml-pony-v3-q27b-lr5_64169_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-pony-v3-q27b-lr5_64169_v1 status is now torndown due to DeploymentManager action