developer_uid: chai_backend_admin
submission_id: qwen-qwen3-5-35b-a3b-fp8_v8
model_name: qwen-qwen3-5-35b-a3b-fp8_v8
model_group: Qwen/Qwen3.5-35B-A3B-FP8
status: torndown
timestamp: 2026-03-28T20:27:05+00:00
num_battles: 11340
num_wins: 4915
celo_rating: 8467.11
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: Qwen/Qwen3.5-35B-A3B-FP8
model_architecture: Qwen3_5MoeForConditionalGeneration
model_num_parameters: 33753909248.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: qwen-qwen3-5-35b-a3b-fp8_v8
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: Qwen/Qwen3.5-35B-A3B-FP8
model_size: 34B
ranking_group: single
us_pacific_date: 2026-03-25
win_ratio: 0.4334215167548501
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['<|endoftext|>', '<|im_end|>', 'You:', '####\n', '</s>', '####'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}</s>\n', 'user_template': 'You: {message}\n', 'response_template': '<think></think>{bot_name}:', 'truncate_by_message': True}
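
The summary fields above can be cross-checked against each other. The following is a minimal sketch, assuming that `win_ratio` is simply `num_wins / num_battles` and that the `generation_params` value is a Python dict literal (it uses single quotes, so it is not valid JSON). The variable names are illustrative and not part of the platform; reading `ineligible_reason` as a plain comparison is also an assumption.

```python
import ast

# Values copied from the metadata fields above.
num_battles = 11340
num_wins = 4915
reported_win_ratio = 0.4334215167548501

# Raw string so that the "\n" inside '####\n' is decoded by literal_eval,
# exactly as it would be when reading the line from a file.
generation_params_text = r"""{'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['<|endoftext|>', '<|im_end|>', 'You:', '####\n', '</s>', '####'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}"""

# literal_eval only accepts Python literals, so it is safe for untrusted text.
generation_params = ast.literal_eval(generation_params_text)

# Recompute the win ratio from the raw counts.
win_ratio = num_wins / num_battles

# Assumption: "max_output_tokens!=64" is interpreted as a plain comparison.
violates_token_rule = generation_params["max_output_tokens"] != 64

print(f"win_ratio = {win_ratio:.6f}")
print(f"max_output_tokens = {generation_params['max_output_tokens']}")
print(f"violates max_output_tokens rule: {violates_token_rule}")
```

Under these assumptions the recomputed ratio agrees with the reported `win_ratio`, and the parsed `max_output_tokens` of 80 is consistent with the stated `ineligible_reason`.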
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-fp8-v8-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-fp8-v8-uploader to finish
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: Using quantization_mode: fp8
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: Repo Qwen/Qwen3.5-35B-A3B-FP8 already ends in FP8. Skipping...
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: Checking if Qwen/Qwen3.5-35B-A3B-FP8 already exists in ChaiML
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: Model already exists. Downloading to /dev/shm/model_output...
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B-FP8...
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/config.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/README.md
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/generation_config.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/configuration.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/configuration.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/LICENSE
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/.gitattributes
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/vocab.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/merges.txt
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/tokenizer.json
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00014-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00014-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00001-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00003-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v8-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v8/default/model.safetensors-00013-of-00014.safetensors
2026-03-25T18:55:42.428452+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v8
Job qwen-qwen3-5-35b-a3b-fp8-v8-uploader completed after 60.27s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-fp8-v8-uploader
Pipeline stage VLLMUploader completed in 61.52s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.28s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-fp8-v8
Waiting for inference service qwen-qwen3-5-35b-a3b-fp8-v8 to be ready
2026-03-25T18:56:42.615973+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v8
2026-03-25T18:57:42.807731+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v8
Inference service qwen-qwen3-5-35b-a3b-fp8-v8 ready after 162.2300832271576s
Pipeline stage VLLMDeployer completed in 163.73s
run pipeline stage %s
Running pipeline stage StressChecker
2026-03-25T18:58:43.041866+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v8
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v8-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v8-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v8-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-25T18:59:43.232211+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v8
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v8-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.777357339859009s
Received healthy response to inference request in 2.1198618412017822s
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v8-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v8-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
2026-03-25T19:00:43.416904+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v8
HTTPConnectionPool(host='qwen-qwen3-5-35b-a3b-fp8-v8-predictor.tenant-chaiml-guanaco.kchai-coreweave-us-east-04a.chaiverse.com', port=80): Read timed out. (read timeout=20.0)
Received unhealthy response to inference request!
Received healthy response to inference request in 4.0662922859191895s
Received healthy response to inference request in 7.337435960769653s
Received healthy response to inference request in 1.986358642578125s
Received healthy response to inference request in 2.552730083465576s
Received healthy response to inference request in 2.867893934249878s
Received healthy response to inference request in 2.010570526123047s
Received healthy response to inference request in 2.1852030754089355s
Received healthy response to inference request in 1.992370843887329s
Received healthy response to inference request in 6.720621347427368s
Received healthy response to inference request in 3.7039055824279785s
Received healthy response to inference request in 2.8065521717071533s
Received healthy response to inference request in 2.0538883209228516s
2026-03-25T19:01:43.612638+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v8
Received healthy response to inference request in 3.04217267036438s
Received healthy response to inference request in 2.2645833492279053s
Received healthy response to inference request in 2.933668851852417s
Received healthy response to inference request in 2.1563198566436768s
Received healthy response to inference request in 2.142961025238037s
Received healthy response to inference request in 2.092928171157837s
Received healthy response to inference request in 2.098498582839966s
Received healthy response to inference request in 2.497096300125122s
Received healthy response to inference request in 3.2522802352905273s
30 requests
7 failed requests
5th percentile: 2.000560700893402
10th percentile: 2.049556541442871
20th percentile: 2.1155891895294188
30th percentile: 2.176538109779358
40th percentile: 2.5304765701293945
50th percentile: 2.9007813930511475
60th percentile: 3.4329303741455073
70th percentile: 4.862591004371636
80th percentile: 20.427984142303465
90th percentile: 20.501350855827333
95th percentile: 20.729447269439696
99th percentile: 20.948784534931182
mean time: 7.097915983200073
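
The upper percentiles of this first run sit just above the 20.0 s read timeout, which suggests that the 7 failed requests were counted with their timed-out durations. The sketch below estimates the average time attributed to a failed request. It assumes the reported mean averages over all 30 requests, failed ones included; that is an inference from the numbers, not something the log states.

```python
# Latencies (seconds) of the 23 healthy responses in the first stress run,
# copied from the log lines above.
healthy = [
    3.777357339859009, 2.1198618412017822, 4.0662922859191895,
    7.337435960769653, 1.986358642578125, 2.552730083465576,
    2.867893934249878, 2.010570526123047, 2.1852030754089355,
    1.992370843887329, 6.720621347427368, 3.7039055824279785,
    2.8065521717071533, 2.0538883209228516, 3.04217267036438,
    2.2645833492279053, 2.933668851852417, 2.1563198566436768,
    2.142961025238037, 2.092928171157837, 2.098498582839966,
    2.497096300125122, 3.2522802352905273,
]

total_requests = 30
failed_requests = 7
reported_mean = 7.097915983200073

# Assumption: reported_mean is taken over all requests, so the time not
# explained by healthy responses belongs to the failed ones.
failed_total = reported_mean * total_requests - sum(healthy)
avg_failed_latency = failed_total / failed_requests

print(f"healthy responses: {len(healthy)}")
print(f"estimated average time per failed request: {avg_failed_latency:.2f}s")
```

If the assumption holds, each failed request accounts for a little over 20 seconds, in line with the `read timeout=20.0` messages and the 80th to 99th percentile values.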
%s, retrying in %s seconds...
Received healthy response to inference request in 2.109525203704834s
Received healthy response to inference request in 2.2654035091400146s
Received healthy response to inference request in 1.9155547618865967s
Received healthy response to inference request in 2.1111953258514404s
Received healthy response to inference request in 2.133516788482666s
Received healthy response to inference request in 1.9461634159088135s
Received healthy response to inference request in 2.157848596572876s
Received healthy response to inference request in 2.220315456390381s
Received healthy response to inference request in 3.618072986602783s
Received healthy response to inference request in 1.9222958087921143s
Received healthy response to inference request in 2.068927049636841s
Received healthy response to inference request in 2.0049362182617188s
Received healthy response to inference request in 2.025395154953003s
Received healthy response to inference request in 1.9697990417480469s
Received healthy response to inference request in 2.112291097640991s
2026-03-25T19:02:43.816689+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v8
Received healthy response to inference request in 2.0809693336486816s
Received healthy response to inference request in 2.13810658454895s
Received healthy response to inference request in 2.268092393875122s
Received healthy response to inference request in 2.5427486896514893s
Received healthy response to inference request in 2.059312105178833s
Received healthy response to inference request in 1.9666717052459717s
Received healthy response to inference request in 2.142324447631836s
Received healthy response to inference request in 2.070594310760498s
Received healthy response to inference request in 2.1578848361968994s
Received healthy response to inference request in 2.0569255352020264s
Received healthy response to inference request in 2.087961196899414s
Received healthy response to inference request in 2.409456968307495s
Received healthy response to inference request in 2.0620944499969482s
Received healthy response to inference request in 2.132045030593872s
Received healthy response to inference request in 2.2252073287963867s
30 requests
0 failed requests
5th percentile: 1.9330362319946288
10th percentile: 1.964620876312256
20th percentile: 2.0213033676147463
30th percentile: 2.0612597465515137
40th percentile: 2.076819324493408
50th percentile: 2.110360264778137
60th percentile: 2.1326337337493895
70th percentile: 2.146981692314148
80th percentile: 2.221293830871582
90th percentile: 2.2822288513183597
95th percentile: 2.4827674150466916
99th percentile: 3.3062289404869087
mean time: 2.1660545110702514
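
The percentile and mean figures for this second run can be reproduced from the 30 latencies logged above. The following is a minimal sketch, assuming linear interpolation between the two nearest ranks (the default behaviour of common numerical libraries). The `percentile` helper is illustrative and is not the stress checker's own code.

```python
# Latencies (seconds) of the 30 healthy responses in the second stress run,
# copied from the log lines above.
latencies = [
    2.109525203704834, 2.2654035091400146, 1.9155547618865967,
    2.1111953258514404, 2.133516788482666, 1.9461634159088135,
    2.157848596572876, 2.220315456390381, 3.618072986602783,
    1.9222958087921143, 2.068927049636841, 2.0049362182617188,
    2.025395154953003, 1.9697990417480469, 2.112291097640991,
    2.0809693336486816, 2.13810658454895, 2.268092393875122,
    2.5427486896514893, 2.059312105178833, 1.9666717052459717,
    2.142324447631836, 2.070594310760498, 2.1578848361968994,
    2.0569255352020264, 2.087961196899414, 2.409456968307495,
    2.0620944499969482, 2.132045030593872, 2.2252073287963867,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    # Fractional rank into the sorted list, e.g. 1.45 for q=5 and n=30.
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With this interpolation rule the computed values agree with the reported figures up to floating-point rounding, for example the 5th percentile of about 1.9330 and the median of about 2.1104.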
Pipeline stage StressChecker completed in 288.40s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.27s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b-fp8_v8 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b-fp8_v8 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b-fp8_v8 status is now torndown due to DeploymentManager action