developer_uid: chai_backend_admin
submission_id: qwen-qwen3-5-35b-a3b-fp8_v7
model_name: qwen-qwen3-5-35b-a3b-fp8_v7
model_group: Qwen/Qwen3.5-35B-A3B-FP8
status: torndown
timestamp: 2026-03-28T20:21:42+00:00
num_battles: 11160
num_wins: 4951
celo_rating: 8467.11
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: Qwen/Qwen3.5-35B-A3B-FP8
model_architecture: Qwen3_5MoeForConditionalGeneration
model_num_parameters: 33753909248.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: qwen-qwen3-5-35b-a3b-fp8_v7
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: Qwen/Qwen3.5-35B-A3B-FP8
model_size: 34B
ranking_group: single
us_pacific_date: 2026-03-25
win_ratio: 0.4436379928315412
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['<|im_end|>', '####\n', 'You:', '\n', '<|endoftext|>', '</s>', '####'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}</s>\n', 'user_template': 'You: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': True}
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-fp8-v7-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-fp8-v7-uploader to finish
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: Using quantization_mode: none
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B-FP8...
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: Downloaded in 14.654s
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: Processed model Qwen/Qwen3.5-35B-A3B-FP8 in 29.350s
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/.gitattributes
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/config.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/README.md
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/LICENSE
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/generation_config.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/configuration.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/configuration.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/merges.txt
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/vocab.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/tokenizer.json
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00001-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00003-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v7-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v7/default/model.safetensors-00009-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-fp8-v7-uploader completed after 52.8s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-fp8-v7-uploader
Pipeline stage VLLMUploader completed in 53.25s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 2.02s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-fp8-v7
Waiting for inference service qwen-qwen3-5-35b-a3b-fp8-v7 to be ready
2026-03-25T18:52:07.233800+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v7
2026-03-25T18:53:07.322145+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v7
2026-03-25T18:54:07.431473+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v7
Inference service qwen-qwen3-5-35b-a3b-fp8-v7 ready after 170.65557169914246s
Pipeline stage VLLMDeployer completed in 171.21s
Running pipeline stage StressChecker
2026-03-25T18:55:07.524363+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v7
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
2026-03-25T18:56:07.619174+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v7
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.846122980117798s
Received healthy response to inference request in 1.9492988586425781s
Received healthy response to inference request in 2.8073530197143555s
Received healthy response to inference request in 4.322614908218384s
Received healthy response to inference request in 2.6217689514160156s
Received healthy response to inference request in 17.559486865997314s
2026-03-25T18:57:07.722882+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v7
Received healthy response to inference request in 1.5764284133911133s
Received healthy response to inference request in 1.8724427223205566s
Received healthy response to inference request in 0.9571645259857178s
Received healthy response to inference request in 1.2764127254486084s
Received healthy response to inference request in 1.1927516460418701s
Received healthy response to inference request in 1.4314196109771729s
Received healthy response to inference request in 2.1091983318328857s
Received healthy response to inference request in 1.750230312347412s
Received healthy response to inference request in 1.7086129188537598s
Received healthy response to inference request in 1.0529866218566895s
Received healthy response to inference request in 1.007601022720337s
Received healthy response to inference request in 1.436960220336914s
Received healthy response to inference request in 1.345949411392212s
Received healthy response to inference request in 0.8172860145568848s
Received healthy response to inference request in 1.3516736030578613s
Received healthy response to inference request in 1.2182683944702148s
Received healthy response to inference request in 1.435819387435913s
Received healthy response to inference request in 1.5394062995910645s
Received healthy response to inference request in 1.04945707321167s
30 requests
5 failed requests
5th percentile: 0.9798609495162964
10th percentile: 1.0452714681625366
20th percentile: 1.213165044784546
30th percentile: 1.3499563455581665
40th percentile: 1.4365038871765137
50th percentile: 1.6425206661224365
60th percentile: 1.903185176849365
70th percentile: 2.677444171905517
80th percentile: 6.969989299774207
90th percentile: 20.125386953353882
95th percentile: 20.13925141096115
99th percentile: 20.143375833034515
mean time: 5.296521313985189
%s, retrying in %s seconds...
Received healthy response to inference request in 1.2530896663665771s
Received healthy response to inference request in 1.37736177444458s
Received healthy response to inference request in 1.0921781063079834s
Received healthy response to inference request in 0.6911189556121826s
Received healthy response to inference request in 1.2154524326324463s
Received healthy response to inference request in 1.3283607959747314s
Received healthy response to inference request in 1.038508415222168s
Received healthy response to inference request in 0.8411884307861328s
Received healthy response to inference request in 1.3466718196868896s
Received healthy response to inference request in 1.7142229080200195s
Received healthy response to inference request in 1.3245654106140137s
Received healthy response to inference request in 1.4675524234771729s
Received healthy response to inference request in 1.151540994644165s
Received healthy response to inference request in 1.0873479843139648s
Received healthy response to inference request in 1.2198774814605713s
Received healthy response to inference request in 1.139646053314209s
Received healthy response to inference request in 1.4135444164276123s
Received healthy response to inference request in 1.394524097442627s
Received healthy response to inference request in 1.2767882347106934s
Received healthy response to inference request in 1.2325444221496582s
Received healthy response to inference request in 1.2035713195800781s
Received healthy response to inference request in 1.1644866466522217s
Received healthy response to inference request in 1.1496171951293945s
Received healthy response to inference request in 1.420804738998413s
Received healthy response to inference request in 1.047318458557129s
2026-03-25T18:58:07.820062+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v7
Received healthy response to inference request in 1.198822021484375s
Received healthy response to inference request in 0.7680082321166992s
Received healthy response to inference request in 1.608919620513916s
Received healthy response to inference request in 1.4643263816833496s
Received healthy response to inference request in 1.2588820457458496s
30 requests
0 failed requests
5th percentile: 0.8009393215179443
10th percentile: 1.0187764167785645
20th percentile: 1.0912120819091797
30th percentile: 1.1509638547897338
40th percentile: 1.2016716003417969
50th percentile: 1.2262109518051147
60th percentile: 1.266044521331787
70th percentile: 1.3338541030883788
80th percentile: 1.3983281612396241
90th percentile: 1.4646489858627318
95th percentile: 1.5453043818473813
99th percentile: 1.6836849546432495
mean time: 1.2296947161356608
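Note: the percentile and mean figures in the second StressChecker summary (30 requests, 0 failed) appear consistent with linear interpolation between order statistics over the 30 healthy latencies logged for that run. The sketch below recomputes them for cross-checking. The function name `percentile` and the list `latencies` are illustrative and are not taken from the pipeline's code; the interpolation rule is an assumption inferred from the logged numbers.

```python
import math

# Healthy-response latencies (seconds) copied from the second stress run.
latencies = [
    1.2530896663665771, 1.37736177444458, 1.0921781063079834,
    0.6911189556121826, 1.2154524326324463, 1.3283607959747314,
    1.038508415222168, 0.8411884307861328, 1.3466718196868896,
    1.7142229080200195, 1.3245654106140137, 1.4675524234771729,
    1.151540994644165, 1.0873479843139648, 1.2198774814605713,
    1.139646053314209, 1.4135444164276123, 1.394524097442627,
    1.2767882347106934, 1.2325444221496582, 1.2035713195800781,
    1.1644866466522217, 1.1496171951293945, 1.420804738998413,
    1.047318458557129, 1.198822021484375, 0.7680082321166992,
    1.608919620513916, 1.4643263816833496, 1.2588820457458496,
]


def percentile(values, p):
    """Return the p-th percentile using linear interpolation.

    Assumed rule: fractional rank = p/100 * (n - 1), interpolated
    between the two neighbouring sorted values.
    """
    ordered = sorted(values)
    rank = (p / 100.0) * (len(ordered) - 1)
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


mean_time = sum(latencies) / len(latencies)

# Print the same rows as the logged summary so they can be compared by eye.
for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{p}th percentile: {percentile(latencies, p)}")
print(f"mean time: {mean_time}")
```

The first run's summary (30 requests, 5 failed) cannot be recomputed the same way from the healthy lines alone, because its upper percentiles sit at about 20 s, which suggests the timed-out requests were counted at roughly the 20 s read timeout.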
Pipeline stage StressChecker completed in 201.03s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.83s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b-fp8_v7 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b-fp8_v7 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b-fp8_v7 status is now torndown due to DeploymentManager action