developer_uid: chai_evaluation_service
submission_id: qwen-qwen3-14b_v8
model_name: qwen-qwen3-14b_v8
model_group: Qwen/Qwen3-14B
status: inactive
timestamp: 2026-02-08T02:16:55+00:00
num_battles: 12801
num_wins: 4847
celo_rating: 1221.22
family_friendly_score: 0.0
family_friendly_standard_error: 0.0
submission_type: basic
model_repo: Qwen/Qwen3-14B
model_architecture: Qwen3ForCausalLM
model_num_parameters: 14768296960.0
best_of: 8
max_input_tokens: 2048
max_output_tokens: 64
reward_model: default
display_name: qwen-qwen3-14b_v8
is_internal_developer: True
language_model: Qwen/Qwen3-14B
model_size: 15B
ranking_group: single
us_pacific_date: 2026-02-07
win_ratio: 0.3786422935708148
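The win_ratio above is just num_wins / num_battles from the metadata; a one-line check reproduces it:

```python
# Values copied from the submission metadata above
num_battles = 12801
num_wins = 4847

win_ratio = num_wins / num_battles
print(win_ratio)  # 0.3786422935708148, matching the logged win_ratio
```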
generation_params: {'temperature': 0.85, 'top_p': 0.9, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.2, 'frequency_penalty': 0.3, 'stopping_words': ['\n'], 'max_input_tokens': 2048, 'best_of': 8, 'max_output_tokens': 64}
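With best_of: 8 and a reward_model configured, the service presumably samples several candidate completions and keeps the highest-scoring one. The exact selection logic is internal to the service; this is only an illustrative sketch, with hypothetical stand-ins for the generator and reward model:

```python
def best_of_n(generate, score, prompt, n=8):
    """Sample n candidate completions and keep the one the scorer ranks
    highest -- the usual shape of a best_of=n + reward-model setup."""
    candidates = [generate(prompt) for _ in range(n)]
    return max(candidates, key=score)

# Toy stand-ins (hypothetical; the real generator and reward model
# live inside the inference service):
replies = iter(["hi", "hello there", "hey", "greetings, traveler",
                "yo", "hm", "sup", "hiya"])
best = best_of_n(lambda p: next(replies), len, "Hello!", n=8)
print(best)  # "greetings, traveler" -- the longest of the 8 samples
```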
formatter: {'memory_template': '<|im_start|>system\n{memory}<|im_end|>\n', 'prompt_template': '<|im_start|>user\n{prompt}<|im_end|>\n', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{user_name}: {message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
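The formatter entry above defines ChatML-style templates for each message role. A minimal sketch of how such templates could be assembled into a prompt (the service's actual assembly order and the truncate_by_message logic are not shown in the log, so this is an assumption):

```python
# Templates copied from the formatter entry above
formatter = {
    'memory_template': '<|im_start|>system\n{memory}<|im_end|>\n',
    'prompt_template': '<|im_start|>user\n{prompt}<|im_end|>\n',
    'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n',
    'user_template': '<|im_start|>user\n{user_name}: {message}<|im_end|>\n',
    'response_template': '<|im_start|>assistant\n{bot_name}:',
}

def build_prompt(memory, turns, bot_name):
    """Assemble a ChatML-style prompt: system memory, then each turn
    rendered with the bot or user template, then the open-ended
    response template the model completes."""
    parts = [formatter['memory_template'].format(memory=memory)]
    for speaker, message in turns:
        if speaker == bot_name:
            parts.append(formatter['bot_template'].format(
                bot_name=speaker, message=message))
        else:
            parts.append(formatter['user_template'].format(
                user_name=speaker, message=message))
    parts.append(formatter['response_template'].format(bot_name=bot_name))
    return ''.join(parts)

# Hypothetical conversation, just to show the shape of the output:
prompt = build_prompt("Bot is a friendly assistant.",
                      [("Alice", "hi"), ("Bot", "hello!")],
                      "Bot")
print(prompt)
```

The prompt ends with the open `<|im_start|>assistant\nBot:` from response_template, and the `'\n'` entry in stopping_words above then cuts generation at the end of the bot's first line.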
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMUploader
Starting job with name qwen-qwen3-14b-v8-uploader
Waiting for job on qwen-qwen3-14b-v8-uploader to finish
qwen-qwen3-14b-v8-uploader: Using quantization_mode: none
qwen-qwen3-14b-v8-uploader: Downloading snapshot of Qwen/Qwen3-14B...
qwen-qwen3-14b-v8-uploader: Fetching 18 files: 100%|██████████| 18/18 [00:09<00:00, 1.82it/s]

qwen-qwen3-14b-v8-uploader: Downloaded in 10.039s
qwen-qwen3-14b-v8-uploader: Processed model Qwen/Qwen3-14B in 21.060s
qwen-qwen3-14b-v8-uploader: creating bucket guanaco-vllm-models
qwen-qwen3-14b-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:56: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-14b-v8-uploader: RE_S3_DATESTRING = re.compile('\.[0-9]*(?:[Z\\-\\+]*?)')
qwen-qwen3-14b-v8-uploader: /usr/lib/python3/dist-packages/S3/BaseUtils.py:57: SyntaxWarning: invalid escape sequence '\s'
qwen-qwen3-14b-v8-uploader: RE_XML_NAMESPACE = re.compile(b'^(<?[^>]+?>\s*|\s*)(<\w+) xmlns=[\'"](https?://[^\'"]+)[\'"]', re.MULTILINE)
qwen-qwen3-14b-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:240: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-14b-v8-uploader: invalid = re.search("([^a-z0-9\.-])", bucket, re.UNICODE)
qwen-qwen3-14b-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:244: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-14b-v8-uploader: invalid = re.search("([^A-Za-z0-9\._-])", bucket, re.UNICODE)
qwen-qwen3-14b-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:255: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-14b-v8-uploader: if re.search("-\.", bucket, re.UNICODE):
qwen-qwen3-14b-v8-uploader: /usr/lib/python3/dist-packages/S3/Utils.py:257: SyntaxWarning: invalid escape sequence '\.'
qwen-qwen3-14b-v8-uploader: if re.search("\.\.", bucket, re.UNICODE):
qwen-qwen3-14b-v8-uploader: /usr/lib/python3/dist-packages/S3/S3Uri.py:155: SyntaxWarning: invalid escape sequence '\w'
qwen-qwen3-14b-v8-uploader: _re = re.compile("^(\w+://)?(.*)", re.UNICODE)
qwen-qwen3-14b-v8-uploader: /usr/lib/python3/dist-packages/S3/FileLists.py:480: SyntaxWarning: invalid escape sequence '\*'
qwen-qwen3-14b-v8-uploader: wildcard_split_result = re.split("\*|\?", uri_str, maxsplit=1)
qwen-qwen3-14b-v8-uploader: Bucket 's3://guanaco-vllm-models/' created
qwen-qwen3-14b-v8-uploader: uploading /dev/shm/model_output to s3://guanaco-vllm-models/qwen-qwen3-14b-v8
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/.gitattributes s3://guanaco-vllm-models/qwen-qwen3-14b-v8/.gitattributes
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/tokenizer_config.json s3://guanaco-vllm-models/qwen-qwen3-14b-v8/tokenizer_config.json
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/config.json s3://guanaco-vllm-models/qwen-qwen3-14b-v8/config.json
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/LICENSE s3://guanaco-vllm-models/qwen-qwen3-14b-v8/LICENSE
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/generation_config.json s3://guanaco-vllm-models/qwen-qwen3-14b-v8/generation_config.json
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/README.md s3://guanaco-vllm-models/qwen-qwen3-14b-v8/README.md
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model.safetensors.index.json s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model.safetensors.index.json
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/merges.txt s3://guanaco-vllm-models/qwen-qwen3-14b-v8/merges.txt
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/vocab.json s3://guanaco-vllm-models/qwen-qwen3-14b-v8/vocab.json
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/tokenizer.json s3://guanaco-vllm-models/qwen-qwen3-14b-v8/tokenizer.json
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model-00008-of-00008.safetensors s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model-00008-of-00008.safetensors
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model-00002-of-00008.safetensors s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model-00002-of-00008.safetensors
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model-00007-of-00008.safetensors s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model-00007-of-00008.safetensors
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model-00006-of-00008.safetensors s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model-00006-of-00008.safetensors
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model-00003-of-00008.safetensors s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model-00003-of-00008.safetensors
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model-00001-of-00008.safetensors s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model-00001-of-00008.safetensors
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model-00004-of-00008.safetensors s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model-00004-of-00008.safetensors
qwen-qwen3-14b-v8-uploader: cp /dev/shm/model_output/model-00005-of-00008.safetensors s3://guanaco-vllm-models/qwen-qwen3-14b-v8/model-00005-of-00008.safetensors
Job qwen-qwen3-14b-v8-uploader completed after 113.67s with status: succeeded
Stopping job with name qwen-qwen3-14b-v8-uploader
Pipeline stage VLLMUploader completed in 114.13s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-14b-v8
Waiting for inference service qwen-qwen3-14b-v8 to be ready
HTTP Request: %s %s "%s %d %s"
HTTP Request: %s %s "%s %d %s"
Inference service qwen-qwen3-14b-v8 ready after 161.19430327415466s
Pipeline stage VLLMDeployer completed in 162.02s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.5829808712005615s
Received healthy response to inference request in 1.5854027271270752s
Received healthy response to inference request in 1.598313808441162s
Received healthy response to inference request in 1.6576051712036133s
Received healthy response to inference request in 1.6356019973754883s
Received healthy response to inference request in 1.6851365566253662s
Received healthy response to inference request in 1.5073001384735107s
Received healthy response to inference request in 1.478743553161621s
Received healthy response to inference request in 1.5970757007598877s
Received healthy response to inference request in 1.8981902599334717s
Received healthy response to inference request in 1.5816454887390137s
Received healthy response to inference request in 1.5820629596710205s
Received healthy response to inference request in 1.608506679534912s
Received healthy response to inference request in 1.6338233947753906s
Received healthy response to inference request in 1.6679120063781738s
Received healthy response to inference request in 1.6805226802825928s
Received healthy response to inference request in 1.5940539836883545s
Received healthy response to inference request in 1.756605863571167s
Received healthy response to inference request in 1.5842163562774658s
Received healthy response to inference request in 1.602038860321045s
Received healthy response to inference request in 1.7849998474121094s
Received healthy response to inference request in 1.6789085865020752s
Received healthy response to inference request in 1.7514588832855225s
Received healthy response to inference request in 1.9111452102661133s
Received healthy response to inference request in 3.6248421669006348s
Received healthy response to inference request in 1.5267541408538818s
Received healthy response to inference request in 1.7218883037567139s
Received healthy response to inference request in 1.9958555698394775s
Received healthy response to inference request in 1.4080231189727783s
Received healthy response to inference request in 1.7872803211212158s
30 requests
0 failed requests
5th percentile: 1.4915940165519714
10th percentile: 1.5248087406158448
20th percentile: 1.5827972888946533
30th percentile: 1.5914586067199707
40th percentile: 1.6005488395690919
50th percentile: 1.6347126960754395
60th percentile: 1.6723106384277344
70th percentile: 1.6961620807647704
80th percentile: 1.7622846603393556
90th percentile: 1.8994857549667359
95th percentile: 1.9577359080314634
99th percentile: 3.1524360537529006
mean time: 1.7236298402150472
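The non-integer percentile values above (e.g. a 99th percentile that sits between the two slowest of 30 samples) indicate interpolated percentiles. The StressChecker's exact method is not shown in the log; a minimal sketch of linear interpolation between order statistics (numpy.percentile's default method), which produces numbers of this shape:

```python
def percentile(samples, p):
    """Percentile by linear interpolation between sorted order
    statistics (the default 'linear' method of numpy.percentile)."""
    xs = sorted(samples)
    k = (len(xs) - 1) * p / 100  # fractional rank
    lo = int(k)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (k - lo)

# Hypothetical round latencies, not the log's 30 samples:
samples = [1.4, 1.5, 1.6, 1.7, 1.9]
print(percentile(samples, 50))  # 1.6 (the middle sample, rank exactly 2)
print(percentile(samples, 95))  # interpolated between 1.7 and 1.9
```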
Pipeline stage StressChecker completed in 54.17s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
qwen-qwen3-14b_v8 status is now deployed due to DeploymentManager action
qwen-qwen3-14b_v8 status is now inactive due to auto deactivation removed underperforming models