developer_uid: chai_backend_admin
submission_id: qwen-qwen3-5-35b-a3b-fp8_v1
model_name: qwen-qwen3-5-35b-a3b-fp8_v1
model_group: Qwen/Qwen3.5-35B-A3B-FP8
status: torndown
timestamp: 2026-03-27T22:52:11+00:00
num_battles: 11622
num_wins: 4519
celo_rating: 8477.39
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: Qwen/Qwen3.5-35B-A3B-FP8
model_architecture: Qwen3_5MoeForConditionalGeneration
model_num_parameters: 33753909248.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 80
reward_model: default
display_name: qwen-qwen3-5-35b-a3b-fp8_v1
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: Qwen/Qwen3.5-35B-A3B-FP8
model_size: 34B
ranking_group: single
us_pacific_date: 2026-03-24
win_ratio: 0.388831526415419
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['You:', '<|endoftext|>', '####\n', '<|im_end|>', '####', '</s>', '\n'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 80}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}</s>\n', 'user_template': 'You: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': True}
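Note on the formatter entry above: the following is a minimal sketch of how these templates might be assembled into a prompt. The template strings are copied from the formatter line; the function name `build_prompt`, the `(speaker, message)` turn representation, and the example bot name are hypothetical. It assumes the empty `memory_template` and `prompt_template` contribute nothing, and it does not model truncation (`truncate_by_message`, `max_input_tokens`).

```python
from typing import List, Tuple

# Template strings copied from the formatter entry in the metadata.
BOT_TEMPLATE = "{bot_name}: {message}</s>\n"
USER_TEMPLATE = "You: {message}\n"
RESPONSE_TEMPLATE = "{bot_name}:"


def build_prompt(bot_name: str, turns: List[Tuple[str, str]]) -> str:
    """Join conversation turns using the templates, then append the response cue.

    `turns` is a list of (speaker, message) pairs where speaker is "bot" or
    "user". This representation is an assumption made for illustration.
    """
    parts = []
    for speaker, message in turns:
        template = BOT_TEMPLATE if speaker == "bot" else USER_TEMPLATE
        # str.format ignores unused keyword arguments, so bot_name is safe
        # to pass even for the user template.
        parts.append(template.format(bot_name=bot_name, message=message))
    # The response template leaves the prompt open for the model to complete.
    parts.append(RESPONSE_TEMPLATE.format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt("Aria", [("bot", "Hi there."), ("user", "Hello!")])
print(repr(example))
```

Under these assumptions the prompt ends with the bot's name and a colon, which is consistent with `'\n'` and `'You:'` appearing among the stopping words: generation would halt at the end of a single bot line.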
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-5-35b-a3b-fp8-v1-uploader
Waiting for job on qwen-qwen3-5-35b-a3b-fp8-v1-uploader to finish
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: Using quantization_mode: fp8
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: Repo Qwen/Qwen3.5-35B-A3B-FP8 already ends in FP8. Skipping...
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: Checking if Qwen/Qwen3.5-35B-A3B-FP8 already exists in ChaiML
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: Model already exists. Downloading to /dev/shm/model_output...
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: Downloading snapshot of Qwen/Qwen3.5-35B-A3B-FP8...
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: Downloaded in 9.492s
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: Processed model Qwen/Qwen3.5-35B-A3B-FP8 in 11.989s
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/video_preprocessor_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/video_preprocessor_config.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/LICENSE
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/configuration.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/configuration.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/.gitattributes
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/config.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/README.md
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/tokenizer_config.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/generation_config.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/chat_template.jinja s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/chat_template.jinja
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/merges.txt
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/vocab.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors.index.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/tokenizer.json
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00001-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00001-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00007-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00007-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00012-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00012-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00003-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00003-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00010-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00010-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00004-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00004-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00008-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00008-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00005-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00005-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00011-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00011-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00006-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00006-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00002-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00002-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00013-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00013-of-00014.safetensors
qwen-qwen3-5-35b-a3b-fp8-v1-uploader: cp /dev/shm/model_output/model.safetensors-00009-of-00014.safetensors s3://guanaco-vllm-models/qwen-qwen3-5-35b-a3b-fp8-v1/default/model.safetensors-00009-of-00014.safetensors
Job qwen-qwen3-5-35b-a3b-fp8-v1-uploader completed after 47.79s with status: succeeded
Stopping job with name qwen-qwen3-5-35b-a3b-fp8-v1-uploader
Pipeline stage VLLMUploader completed in 48.94s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 1.05s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-5-35b-a3b-fp8-v1
Waiting for inference service qwen-qwen3-5-35b-a3b-fp8-v1 to be ready
2026-03-24T21:58:48.798339+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v1
2026-03-24T21:59:49.187626+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v1
2026-03-24T22:00:49.365829+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v1
Inference service qwen-qwen3-5-35b-a3b-fp8-v1 ready after 172.2458472251892s
Pipeline stage VLLMDeployer completed in 173.43s
Running pipeline stage StressChecker
2026-03-24T22:01:49.543484+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.642056941986084s
Received healthy response to inference request in 0.9865198135375977s
2026-03-24T22:02:49.748415+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 7.119216442108154s
Received healthy response to inference request in 1.2843024730682373s
Received healthy response to inference request in 1.182426929473877s
Received healthy response to inference request in 2.659660816192627s
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.4402105808258057s
Received healthy response to inference request in 1.1346807479858398s
2026-03-24T22:03:49.944019+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v1
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPConnectionPool(host='guanaco-submitter-v2.guanaco-backend.kchai-google-us-east4.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 1.403249740600586s
Received healthy response to inference request in 2.7996232509613037s
Received healthy response to inference request in 7.141952037811279s
Received healthy response to inference request in 1.734424352645874s
Received healthy response to inference request in 1.3121817111968994s
Received healthy response to inference request in 1.4527852535247803s
Received healthy response to inference request in 0.9290182590484619s
Received healthy response to inference request in 1.1178975105285645s
Received healthy response to inference request in 2.4080488681793213s
Received healthy response to inference request in 0.9865572452545166s
Received healthy response to inference request in 1.547229290008545s
Received healthy response to inference request in 0.8058164119720459s
Received healthy response to inference request in 1.9525015354156494s
Received healthy response to inference request in 1.2215065956115723s
Received healthy response to inference request in 1.6901187896728516s
30 requests
7 failed requests
5th percentile: 0.954893958568573
10th percentile: 0.9865535020828247
20th percentile: 1.1728776931762697
30th percentile: 1.3038179397583007
40th percentile: 1.4477553844451905
50th percentile: 1.7122715711593628
60th percentile: 2.501652097702026
70th percentile: 4.095501208305347
80th percentile: 20.211543607711793
90th percentile: 20.30621964931488
95th percentile: 20.316555047035216
99th percentile: 20.33509120464325
mean time: 6.296256415049235
%s, retrying in %s seconds...
Received healthy response to inference request in 1.049604892730713s
Received healthy response to inference request in 0.9558162689208984s
Received healthy response to inference request in 1.273728609085083s
2026-03-24T22:04:50.132468+00:00 monitor updated for qwen-qwen3-5-35b-a3b-fp8_v1
Received healthy response to inference request in 1.034703254699707s
Received healthy response to inference request in 1.0591001510620117s
Received healthy response to inference request in 1.1073217391967773s
Received healthy response to inference request in 1.3864092826843262s
Received healthy response to inference request in 1.121574878692627s
Received healthy response to inference request in 0.8349800109863281s
Received healthy response to inference request in 0.7363533973693848s
Received healthy response to inference request in 1.3922314643859863s
Received healthy response to inference request in 1.732581377029419s
Received healthy response to inference request in 1.2011454105377197s
Received healthy response to inference request in 0.6110725402832031s
Received healthy response to inference request in 1.4259073734283447s
Received healthy response to inference request in 1.0172481536865234s
Received healthy response to inference request in 0.8751332759857178s
Received healthy response to inference request in 1.0230152606964111s
Received healthy response to inference request in 1.8461637496948242s
Received healthy response to inference request in 1.1209361553192139s
Received healthy response to inference request in 1.0229504108428955s
Received healthy response to inference request in 1.3881492614746094s
Received healthy response to inference request in 1.262742042541504s
Received healthy response to inference request in 1.2383456230163574s
Received healthy response to inference request in 1.5109920501708984s
Received healthy response to inference request in 1.8141510486602783s
Received healthy response to inference request in 1.377734899520874s
Received healthy response to inference request in 1.1670260429382324s
Received healthy response to inference request in 1.6925256252288818s
Received healthy response to inference request in 2.0363166332244873s
30 requests
0 failed requests
5th percentile: 0.7807353734970093
10th percentile: 0.8711179494857788
20th percentile: 1.021809959411621
30th percentile: 1.0451344013214112
40th percentile: 1.1154903888702392
50th percentile: 1.184085726737976
60th percentile: 1.2671366691589356
70th percentile: 1.3869312763214112
80th percentile: 1.4429243087768557
90th percentile: 1.740738344192505
95th percentile: 1.8317580342292785
99th percentile: 1.9811722970008852
mean time: 1.2438653628031413
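Note on the summary above: the reported percentiles appear consistent with linear interpolation between order statistics over the 30 latencies of the second stress run. The sketch below recomputes a few of them from the logged values. The helper name `percentile` is hypothetical and the interpolation rule is an assumption inferred from the numbers, not confirmed from the StressChecker source.

```python
from typing import List

# Latencies (seconds) of the 30 healthy responses in the second stress run,
# copied from the log lines above in order of appearance.
latencies = [
    1.049604892730713, 0.9558162689208984, 1.273728609085083,
    1.034703254699707, 1.0591001510620117, 1.1073217391967773,
    1.3864092826843262, 1.121574878692627, 0.8349800109863281,
    0.7363533973693848, 1.3922314643859863, 1.732581377029419,
    1.2011454105377197, 0.6110725402832031, 1.4259073734283447,
    1.0172481536865234, 0.8751332759857178, 1.0230152606964111,
    1.8461637496948242, 1.1209361553192139, 1.0229504108428955,
    1.3881492614746094, 1.262742042541504, 1.2383456230163574,
    1.5109920501708984, 1.8141510486602783, 1.377734899520874,
    1.1670260429382324, 1.6925256252288818, 2.0363166332244873,
]


def percentile(samples: List[float], q: float) -> float:
    """q-th percentile using linear interpolation between sorted samples."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)
mean_time = sum(latencies) / len(latencies)
print(p5, p50, p90, mean_time)
```

In the first run the seven timed-out requests seem to be counted at roughly the 20 s read timeout, which would explain why the 80th percentile and above sit near 20.2 to 20.3 s and the mean rises to about 6.3 s.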
Pipeline stage StressChecker completed in 235.52s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.22s
Shutdown handler de-registered
qwen-qwen3-5-35b-a3b-fp8_v1 status is now deployed due to DeploymentManager action
qwen-qwen3-5-35b-a3b-fp8_v1 status is now inactive due to auto deactivation removed underperforming models
qwen-qwen3-5-35b-a3b-fp8_v1 status is now torndown due to DeploymentManager action