developer_uid: chai_backend_admin
submission_id: qwen-qwen3-235b-a22b-in_47730_v5
model_name: qwen-qwen3-235b-a22b-in_47730_v5
model_group: Qwen/Qwen3-235B-A22B-Ins
status: torndown
timestamp: 2025-12-09T02:46:07+00:00
num_battles: 9710
num_wins: 4883
celo_rating: 1296.78
family_friendly_score: 0.599
family_friendly_standard_error: 0.006931074952703946
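The standard error above is consistent with a binomial proportion estimate. A minimal sketch, assuming the score is a mean over roughly n = 5000 binary ratings (the sample size itself is not stated anywhere in this log):

```python
import math

def binomial_standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n binary samples."""
    return math.sqrt(p * (1 - p) / n)

# With the logged score 0.599 and an assumed n of 5000 ratings, this
# reproduces the logged standard error of ~0.006931.
print(binomial_standard_error(0.599, 5000))
```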
submission_type: basic
model_repo: Qwen/Qwen3-235B-A22B-Instruct-2507
model_architecture: Qwen3MoeForCausalLM
model_num_parameters: 18790207488.0
best_of: 8
max_input_tokens: 3000
max_output_tokens: 70
reward_model: default
display_name: qwen-qwen3-235b-a22b-in_47730_v5
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: Qwen/Qwen3-235B-A22B-Instruct-2507
model_size: 19B
ranking_group: single
us_pacific_date: 2025-12-08
win_ratio: 0.5028836251287333
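The win_ratio field is simply num_wins divided by num_battles, which is easy to verify from the metadata above:

```python
# Fields taken directly from the submission metadata above.
num_battles = 9710
num_wins = 4883

win_ratio = num_wins / num_battles
print(win_ratio)  # matches the logged 0.5028836251287333
```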
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['####', '<|im_end|>', '<|assistant|>', '</s>', '</think>', '<|user|>'], 'max_input_tokens': 3000, 'best_of': 8, 'max_output_tokens': 70}
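With best_of set to 8 and a reward_model configured, the serving setup presumably samples several completions per turn and keeps the one the reward model scores highest (best-of-n reranking). A toy sketch of that selection step only, with a stub length-based scorer standing in for the real reward model, which this log does not describe:

```python
def rerank_best_of_n(candidates, score):
    """Return the candidate that the scoring function ranks highest."""
    return max(candidates, key=score)

# Stub candidates and scorer for illustration only; the real system samples
# best_of=8 completions from the language model and scores them with the
# configured reward model.
samples = ["short reply", "a longer, more engaging reply", "ok"]
best = rerank_best_of_n(samples, score=len)
print(best)  # -> "a longer, more engaging reply"
```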
formatter: {'memory_template': "<|im_start|>system\nYou are {bot_name}. Engaged in a highly engaging roleplay with user. NSFW allowed. Be descriptive and creative.\n{bot_name}'s persona: {memory}<|im_end|>\n", 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
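The formatter assembles a ChatML prompt by filling those templates in order: the system/memory block, one block per conversation turn, then the open assistant turn the model must complete. A minimal sketch using the exact templates from this submission (the bot name, persona, and messages are illustrative; the empty prompt_template and the truncate_by_message truncation are omitted):

```python
# Templates copied from the formatter configuration above.
memory_template = ("<|im_start|>system\nYou are {bot_name}. Engaged in a highly engaging "
                   "roleplay with user. NSFW allowed. Be descriptive and creative.\n"
                   "{bot_name}'s persona: {memory}<|im_end|>\n")
user_template = "<|im_start|>user\n{message}<|im_end|>\n"
bot_template = "<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n"
response_template = "<|im_start|>assistant\n{bot_name}:"

def build_prompt(bot_name, memory, turns):
    """Render the system block, each conversation turn, then the open assistant turn."""
    prompt = memory_template.format(bot_name=bot_name, memory=memory)
    for role, message in turns:
        template = user_template if role == "user" else bot_template
        prompt += template.format(bot_name=bot_name, message=message)
    return prompt + response_template.format(bot_name=bot_name)

prompt = build_prompt("Ava", "a friendly guide",
                      [("user", "Hi!"), ("bot", "Hello there.")])
print(prompt)
```

The model then generates from the trailing `<|im_start|>assistant\n{bot_name}:` fragment until it emits one of the configured stopping words (e.g. `<|im_end|>`).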
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.26s
Running pipeline stage VLLMDeployer
Creating inference service qwen-qwen3-235b-a22b-in-47730-v5
Waiting for inference service qwen-qwen3-235b-a22b-in-47730-v5 to be ready
Inference service qwen-qwen3-235b-a22b-in-47730-v5 ready after 447.57s
Pipeline stage VLLMDeployer completed in 449.02s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.018477439880371s
Received healthy response to inference request in 2.9202146530151367s
Received healthy response to inference request in 2.094311475753784s
Received healthy response to inference request in 1.7364625930786133s
Received healthy response to inference request in 1.8249716758728027s
Received healthy response to inference request in 2.256934404373169s
Received healthy response to inference request in 2.6806578636169434s
Received healthy response to inference request in 1.7853403091430664s
Received healthy response to inference request in 2.061307430267334s
Received healthy response to inference request in 1.6513969898223877s
Received healthy response to inference request in 2.481142997741699s
Received healthy response to inference request in 2.9803309440612793s
Received healthy response to inference request in 2.627743721008301s
Received healthy response to inference request in 2.550788164138794s
Received healthy response to inference request in 2.51863169670105s
Received healthy response to inference request in 1.8213887214660645s
Received healthy response to inference request in 1.8275234699249268s
Received healthy response to inference request in 2.3915488719940186s
Received healthy response to inference request in 2.1378092765808105s
Received healthy response to inference request in 2.5219485759735107s
Received healthy response to inference request in 2.019998788833618s
Received healthy response to inference request in 2.646488666534424s
Received healthy response to inference request in 2.235523223876953s
Received healthy response to inference request in 1.7006323337554932s
Received healthy response to inference request in 3.527831554412842s
Received healthy response to inference request in 1.8690094947814941s
Received healthy response to inference request in 2.985544443130493s
Received healthy response to inference request in 2.186023712158203s
Received healthy response to inference request in 1.8105888366699219s
Received healthy response to inference request in 1.855496883392334s
30 requests
0 failed requests
5th percentile: 1.7167559504508971
10th percentile: 1.7804525375366211
20th percentile: 1.8242550849914552
30th percentile: 1.8649557113647461
40th percentile: 2.081109857559204
50th percentile: 2.210773468017578
60th percentile: 2.427386522293091
70th percentile: 2.5306004524230956
80th percentile: 2.653322505950928
90th percentile: 2.9808522939682005
95th percentile: 3.0036575913429258
99th percentile: 3.380118861198426
mean time: 2.2908689737319947
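The percentile figures above match linear-interpolation quantiles over the 30 per-request latencies; in the Python standard library, `statistics.quantiles` with `method="inclusive"` uses the same interpolation. A sketch reproducing the report, with the latencies from the stress check rounded to milliseconds:

```python
import statistics

# The 30 per-request latencies (seconds) logged above, rounded to ms.
latencies = [
    3.018, 2.920, 2.094, 1.736, 1.825, 2.257, 2.681, 1.785, 2.061, 1.651,
    2.481, 2.980, 2.628, 2.551, 2.519, 1.821, 1.828, 2.392, 2.138, 2.522,
    2.020, 2.646, 2.236, 1.701, 3.528, 1.869, 2.986, 2.186, 1.811, 1.855,
]

# n=20 cut points give the 5th, 10th, ..., 95th percentiles; "inclusive"
# interpolates linearly between order statistics, matching the report above.
cuts = statistics.quantiles(latencies, n=20, method="inclusive")
p5, p50, p95 = cuts[0], cuts[9], cuts[18]
mean = statistics.mean(latencies)
print(p5, p50, p95, mean)
```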
Pipeline stage StressChecker completed in 73.30s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 1.61s
Shutdown handler de-registered
qwen-qwen3-235b-a22b-in_47730_v5 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2117.31s
Shutdown handler de-registered
qwen-qwen3-235b-a22b-in_47730_v5 status is now inactive due to auto-deactivation of underperforming models
qwen-qwen3-235b-a22b-in_47730_v5 status is now torndown due to DeploymentManager action