developer_uid: chai_backend_admin
submission_id: chaiml-merged-qwen-35-d_39140_v3
model_name: chaiml-merged-qwen-35-d_39140_v3
model_group: ChaiML/merged_qwen_35_dp
status: torndown
timestamp: 2026-03-28T23:01:56+00:00
num_battles: 10263
num_wins: 5421
celo_rating: 9999.0
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: ChaiML/merged_qwen_35_dpo_lower_lr_v
model_architecture: Qwen3_5ForConditionalGeneration
model_num_parameters: 23564784640.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: chaiml-merged-qwen-35-d_39140_v3
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: ChaiML/merged_qwen_35_dpo_lower_lr_v
model_size: 24B
ranking_group: single
us_pacific_date: 2026-03-25
win_ratio: 0.5282081262788658
generation_params: {'temperature': 1.0, 'top_p': 0.95, 'min_p': 0.05, 'top_k': 60, 'presence_penalty': 0.1, 'frequency_penalty': 0.0, 'stopping_words': ['You:', '</s>', '<|im_start|>', '<|im_end|>', '###'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': '<|im_start|>system\nRespond as a high quality storyteller.<|im_end|>\n<|im_start|>user\n', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '<|im_end|>\n<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': False}
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name chaiml-merged-qwen-35-d-39140-v3-uploader
Waiting for job on chaiml-merged-qwen-35-d-39140-v3-uploader to finish
chaiml-merged-qwen-35-d-39140-v3-uploader: Using quantization_mode: fp8
chaiml-merged-qwen-35-d-39140-v3-uploader: Checking if ChaiML/merged_qwen_35_dpo_lower_lr_v-FP8 already exists in ChaiML
chaiml-merged-qwen-35-d-39140-v3-uploader: Downloading snapshot of ChaiML/merged_qwen_35_dpo_lower_lr_v...
2026-03-25T21:11:53.182512+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
chaiml-merged-qwen-35-d-39140-v3-uploader: Downloaded in 50.556s
chaiml-merged-qwen-35-d-39140-v3-uploader: Loading /tmp/model_input...
chaiml-merged-qwen-35-d-39140-v3-uploader: The fast path is not available because one of the required library is not installed. Falling back to torch implementation. To install follow https://github.com/fla-org/flash-linear-attention#installation and https://github.com/Dao-AILab/causal-conv1d
chaiml-merged-qwen-35-d-39140-v3-uploader: Applying quantization...
chaiml-merged-qwen-35-d-39140-v3-uploader: 2026-03-25T21:12:15.352786+0000 | __init__ | WARNING - Disabling tokenizer parallelism due to threading conflict between FastTokenizer and Datasets. Set TOKENIZERS_PARALLELISM=false to suppress this warning.
chaiml-merged-qwen-35-d-39140-v3-uploader: 2026-03-25T21:12:15.358251+0000 | reset | INFO - Compression lifecycle reset
chaiml-merged-qwen-35-d-39140-v3-uploader: 2026-03-25T21:12:15.359661+0000 | from_modifiers | INFO - Creating recipe from modifiers
chaiml-merged-qwen-35-d-39140-v3-uploader: 2026-03-25T21:12:15.403663+0000 | initialize | INFO - Compression lifecycle initialized for 1 modifiers
chaiml-merged-qwen-35-d-39140-v3-uploader: 2026-03-25T21:12:15.403897+0000 | IndependentPipeline | INFO - Inferred `DataFreePipeline` for `QuantizationModifier`
chaiml-merged-qwen-35-d-39140-v3-uploader: 2026-03-25T21:12:15.418739+0000 | dispatch_model | WARNING - Forced to offload modules due to insufficient gpu resources
chaiml-merged-qwen-35-d-39140-v3-uploader: 2026-03-25T21:12:21.963903+0000 | finalize | INFO - Compression lifecycle finalized for 1 modifiers
chaiml-merged-qwen-35-d-39140-v3-uploader: 2026-03-25T21:12:21.964104+0000 | post_process | WARNING - Optimized model is not saved. To save, please provide`output_dir` as input arg.Ex. `oneshot(..., output_dir=...)`
chaiml-merged-qwen-35-d-39140-v3-uploader: Saving to /dev/shm/model_output...
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/local/lib/python3.12/dist-packages/transformers/modeling_utils.py:3344: UserWarning: Attempting to save a model with offloaded modules. Ensure that unallocated cpu memory exceeds the `shard_size` (50GB default)
chaiml-merged-qwen-35-d-39140-v3-uploader: warnings.warn(
2026-03-25T21:12:53.384463+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
chaiml-merged-qwen-35-d-39140-v3-uploader: Cleaning quantization config in /dev/shm/model_output
chaiml-merged-qwen-35-d-39140-v3-uploader: Pushing to ChaiML/merged_qwen_35_dpo_lower_lr_v-FP8
chaiml-merged-qwen-35-d-39140-v3-uploader: Checking if ChaiML/merged_qwen_35_dpo_lower_lr_v-FP8 already exists in ChaiML
chaiml-merged-qwen-35-d-39140-v3-uploader: Creating repo ChaiML/merged_qwen_35_dpo_lower_lr_v-FP8 and uploading /dev/shm/model_output to it
chaiml-merged-qwen-35-d-39140-v3-uploader: Found 1 files larger than 20GB (recommended limit):
chaiml-merged-qwen-35-d-39140-v3-uploader: - model.safetensors: 35.9GB
chaiml-merged-qwen-35-d-39140-v3-uploader: Large files may slow down loading and processing.
chaiml-merged-qwen-35-d-39140-v3-uploader: ---------- 2026-03-25 21:13:10 (0:00:00) ----------
chaiml-merged-qwen-35-d-39140-v3-uploader: Files: hashed 5/7 (34.0K/35.9G) | pre-uploaded: 0/0 (0.0/35.9G) (+7 unsure) | committed: 0/7 (0.0/35.9G) | ignored: 0
chaiml-merged-qwen-35-d-39140-v3-uploader: Workers: hashing: 2 | get upload mode: 5 | pre-uploading: 0 | committing: 0 | waiting: 57
chaiml-merged-qwen-35-d-39140-v3-uploader: ---------------------------------------------------
2026-03-25T21:13:53.556576+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
chaiml-merged-qwen-35-d-39140-v3-uploader: ---------- 2026-03-25 21:14:10 (0:01:00) ----------
chaiml-merged-qwen-35-d-39140-v3-uploader: Files: hashed 7/7 (35.9G/35.9G) | pre-uploaded: 1/2 (20.0M/35.9G) | committed: 0/7 (0.0/35.9G) | ignored: 0
chaiml-merged-qwen-35-d-39140-v3-uploader: Workers: hashing: 0 | get upload mode: 0 | pre-uploading: 1 | committing: 0 | waiting: 63
chaiml-merged-qwen-35-d-39140-v3-uploader: ---------------------------------------------------
2026-03-25T21:14:53.729385+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
chaiml-merged-qwen-35-d-39140-v3-uploader: Processed model ChaiML/merged_qwen_35_dpo_lower_lr_v in 210.851s
chaiml-merged-qwen-35-d-39140-v3-uploader: creating bucket guanaco-vllm-models
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
chaiml-merged-qwen-35-d-39140-v3-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
chaiml-merged-qwen-35-d-39140-v3-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
chaiml-merged-qwen-35-d-39140-v3-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
chaiml-merged-qwen-35-d-39140-v3-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
chaiml-merged-qwen-35-d-39140-v3-uploader: if re.search("-\.", bucket, re.UNICODE):
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
chaiml-merged-qwen-35-d-39140-v3-uploader: if re.search("\.\.", bucket, re.UNICODE):
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
chaiml-merged-qwen-35-d-39140-v3-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
chaiml-merged-qwen-35-d-39140-v3-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
chaiml-merged-qwen-35-d-39140-v3-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
chaiml-merged-qwen-35-d-39140-v3-uploader: Bucket 's3://guanaco-vllm-models/' created
chaiml-merged-qwen-35-d-39140-v3-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v3/default
chaiml-merged-qwen-35-d-39140-v3-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v3/default/chat_template.jinja
chaiml-merged-qwen-35-d-39140-v3-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v3/default/tokenizer_config.json
chaiml-merged-qwen-35-d-39140-v3-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v3/default/config.json
chaiml-merged-qwen-35-d-39140-v3-uploader: cp /dev/shm/model_output/recipe.yaml s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v3/default/recipe.yaml
chaiml-merged-qwen-35-d-39140-v3-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v3/default/generation_config.json
chaiml-merged-qwen-35-d-39140-v3-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v3/default/tokenizer.json
chaiml-merged-qwen-35-d-39140-v3-uploader: cp /dev/shm/model_output/model.safetensors s3://guanaco-vllm-models/chaiml-merged-qwen-35-d-39140-v3/default/model.safetensors
2026-03-25T21:15:53.910489+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
Job chaiml-merged-qwen-35-d-39140-v3-uploader completed after 308.98s with status: succeeded
Stopping job with name chaiml-merged-qwen-35-d-39140-v3-uploader
Pipeline stage VLLMUploader completed in 310.91s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.24s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-merged-qwen-35-d-39140-v3
Waiting for inference service chaiml-merged-qwen-35-d-39140-v3 to be ready
2026-03-25T21:16:54.091153+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
2026-03-25T21:17:54.328416+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
2026-03-25T21:18:54.535242+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
Inference service chaiml-merged-qwen-35-d-39140-v3 ready after 172.26041913032532s
Pipeline stage VLLMDeployer completed in 173.46s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='chaiml-merged-qwen-35-d-39140-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='chaiml-merged-qwen-35-d-39140-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-25T21:19:54.925723+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
HTTPConnectionPool(host='chaiml-merged-qwen-35-d-39140-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 11.345331907272339s
HTTPConnectionPool(host='chaiml-merged-qwen-35-d-39140-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.7207441329956055s
2026-03-25T21:20:55.140233+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
HTTPConnectionPool(host='chaiml-merged-qwen-35-d-39140-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='chaiml-merged-qwen-35-d-39140-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 11.164727449417114s
Received healthy response to inference request in 5.531982898712158s
Received healthy response to inference request in 10.854549646377563s
2026-03-25T21:21:55.348477+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
Received healthy response to inference request in 10.730952978134155s
Received healthy response to inference request in 5.749801158905029s
Received healthy response to inference request in 5.73622465133667s
Received healthy response to inference request in 5.639586687088013s
Received healthy response to inference request in 5.981947183609009s
Received healthy response to inference request in 5.475398778915405s
Received healthy response to inference request in 5.583833932876587s
Received healthy response to inference request in 5.598241806030273s
2026-03-25T21:22:55.549783+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
HTTPConnectionPool(host='chaiml-merged-qwen-35-d-39140-v3-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 5.731566905975342s
Received healthy response to inference request in 5.451979637145996s
Received healthy response to inference request in 5.603993654251099s
Received healthy response to inference request in 5.60810112953186s
Received healthy response to inference request in 5.69504714012146s
Received healthy response to inference request in 5.652104139328003s
Received healthy response to inference request in 5.503662347793579s
Received healthy response to inference request in 10.696894645690918s
2026-03-25T21:23:55.754603+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
Received healthy response to inference request in 6.074241399765015s
Received healthy response to inference request in 5.58165431022644s
30 requests
7 failed requests
5th percentile: 5.488117384910583
10th percentile: 5.5291508436203
20th percentile: 5.595360231399536
30th percentile: 5.630141019821167
40th percentile: 5.710465335845948
50th percentile: 5.74301290512085
60th percentile: 7.92330269813537
70th percentile: 10.947602987289429
80th percentile: 20.4331223487854
90th percentile: 20.541559195518495
95th percentile: 20.719315099716187
99th percentile: 20.849985718727112
mean time: 10.027796093622843
%s, retrying in %s seconds...
Received healthy response to inference request in 5.684507131576538s
Received healthy response to inference request in 5.533602952957153s
Received healthy response to inference request in 5.431418418884277s
Received healthy response to inference request in 5.573312282562256s
Received healthy response to inference request in 5.576819896697998s
Received healthy response to inference request in 5.230697393417358s
Received healthy response to inference request in 5.4248316287994385s
Received healthy response to inference request in 5.483814477920532s
2026-03-25T21:24:55.943512+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
Received healthy response to inference request in 5.7171525955200195s
Received healthy response to inference request in 5.59294056892395s
Received healthy response to inference request in 5.8363752365112305s
Received healthy response to inference request in 5.776278734207153s
Received healthy response to inference request in 5.935762405395508s
Received healthy response to inference request in 5.566188335418701s
Received healthy response to inference request in 5.577783823013306s
Received healthy response to inference request in 6.136255502700806s
Received healthy response to inference request in 5.6040167808532715s
2026-03-25T21:25:56.214145+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
Received healthy response to inference request in 5.402003526687622s
Received healthy response to inference request in 5.55898380279541s
Received healthy response to inference request in 5.676032781600952s
Received healthy response to inference request in 5.631608247756958s
Received healthy response to inference request in 5.616333246231079s
Received healthy response to inference request in 5.655226707458496s
Received healthy response to inference request in 5.5204126834869385s
Received healthy response to inference request in 5.523636102676392s
Received healthy response to inference request in 5.442488193511963s
Received healthy response to inference request in 5.478363752365112s
Received healthy response to inference request in 5.830968141555786s
2026-03-25T21:26:56.511115+00:00 monitor updated for chaiml-merged-qwen-35-d_39140_v3
Received healthy response to inference request in 5.567286252975464s
Received healthy response to inference request in 5.678037643432617s
30 requests
0 failed requests
5th percentile: 5.41227617263794
10th percentile: 5.430759739875794
20th percentile: 5.482724332809449
30th percentile: 5.530612897872925
40th percentile: 5.566847085952759
50th percentile: 5.577301859855652
60th percentile: 5.608943367004395
70th percentile: 5.661468529701233
80th percentile: 5.691036224365234
90th percentile: 5.83150885105133
95th percentile: 5.891038179397583
99th percentile: 6.078112504482269
mean time: 5.608771308263143
Pipeline stage StressChecker completed in 489.83s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.31s
Shutdown handler de-registered
chaiml-merged-qwen-35-d_39140_v3 status is now deployed due to DeploymentManager action
chaiml-merged-qwen-35-d_39140_v3 status is now inactive due to auto deactivation removed underperforming models
chaiml-merged-qwen-35-d_39140_v3 status is now torndown due to DeploymentManager action