chaiml-chaiwill-qwen3-2_84869

developer_uid: chai_backend_admin

submission_id: chaiml-chaiwill-qwen3-2_84869_v4

model_name: chaiml-chaiwill-qwen3-2_84869_v4

model_group: ChaiML/chaiwill-qwen3-23

status: torndown

timestamp: 2025-12-23T07:44:52+00:00

num_battles: 16541

num_wins: 9772

celo_rating: 1357.01

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: basic

model_repo: ChaiML/chaiwill-qwen3-235b-a22b-instruct-2507-opus-distil-500k-qwen235b-dpo-round2-20faf44c-int4-mixed

model_architecture: Qwen3MoeForCausalLM

model_num_parameters: 18790207488.0

best_of: 8

max_input_tokens: 1992

max_output_tokens: 80

reward_model: default

display_name: chaiml-chaiwill-qwen3-2_84869_v4

ineligible_reason: max_output_tokens!=64

is_internal_developer: True

language_model: ChaiML/chaiwill-qwen3-235b-a22b-instruct-2507-opus-distil-500k-qwen235b-dpo-round2-20faf44c-int4-mixed

model_size: 19B

ranking_group: single

us_pacific_date: 2025-12-22

win_ratio: 0.5907744392721117
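The win_ratio field is consistent with num_wins divided by num_battles. The source does not state that definition, so the sketch below treats it as an assumption and checks it against the values reported above. The function name `win_ratio` is illustrative only.

```python
# Values copied from the metadata fields above.
num_battles = 16541
num_wins = 9772


def win_ratio(wins: int, battles: int) -> float:
    """Fraction of battles won (assumed definition: wins / battles)."""
    if battles <= 0:
        raise ValueError("battles must be positive")
    return wins / battles


ratio = win_ratio(num_wins, num_battles)
print(f"win_ratio = {ratio:.6f}")
```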

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['<|im_end|>', '<|assistant|>', '</s>', '<|user|>', '####', '</think>'], 'max_input_tokens': 1992, 'best_of': 8, 'max_output_tokens': 80}

formatter: {'memory_template': '<|im_start|>system\nYou are {bot_name} engaged in a roleplay with user.<|im_end|>\n', 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n', 'truncate_by_message': True}
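A minimal sketch of how the formatter templates above might be assembled into a single prompt. The template strings are copied from the formatter field. The `build_prompt` function, the `(speaker, text)` turn format, and the character-based `max_chars` budget are hypothetical: the real pipeline presumably budgets by tokens (max_input_tokens: 1992), and its exact truncation logic is not shown in the source.

```python
# Template strings copied from the formatter field above.
FORMATTER = {
    "memory_template": "<|im_start|>system\nYou are {bot_name} engaged in a roleplay with user.<|im_end|>\n",
    "prompt_template": "",
    "bot_template": "<|im_start|>assistant\n{message}<|im_end|>\n",
    "user_template": "<|im_start|>user\n{message}<|im_end|>\n",
    "response_template": "<|im_start|>assistant\n",
    "truncate_by_message": True,
}


def build_prompt(bot_name, turns, max_chars=None):
    """Render a chat history into one prompt string.

    turns is a list of (speaker, text) pairs, speaker being "bot" or "user".
    max_chars is a stand-in (assumption) for the real token budget.
    """
    head = FORMATTER["memory_template"].format(bot_name=bot_name)
    head += FORMATTER["prompt_template"]
    tail = FORMATTER["response_template"]
    rendered = []
    for speaker, text in turns:
        key = "bot_template" if speaker == "bot" else "user_template"
        rendered.append(FORMATTER[key].format(message=text))
    # With truncate_by_message, drop whole messages (oldest first)
    # rather than cutting a message in the middle.
    if max_chars is not None and FORMATTER["truncate_by_message"]:
        while rendered and len(head) + sum(map(len, rendered)) + len(tail) > max_chars:
            rendered.pop(0)
    return head + "".join(rendered) + tail


example = build_prompt("Aria", [("user", "Hi"), ("bot", "Hello"), ("user", "How are you?")])
print(example)
```

The prompt ends with the open assistant header from response_template, so the model's completion becomes the next bot message. Generation then stops at one of the stopping_words, such as `<|im_end|>`.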

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-chaiwill-qwen3-2-84869-v4
Waiting for inference service chaiml-chaiwill-qwen3-2-84869-v4 to be ready
Inference service chaiml-chaiwill-qwen3-2-84869-v4 ready after 210.95209312438965s
Pipeline stage VLLMDeployer completed in 211.45s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9268882274627686s
Received healthy response to inference request in 1.9666941165924072s
Received healthy response to inference request in 1.9659838676452637s
Received healthy response to inference request in 2.56149959564209s
Received healthy response to inference request in 2.486198663711548s
Received healthy response to inference request in 2.315843105316162s
Received healthy response to inference request in 2.041867256164551s
Received healthy response to inference request in 1.8305964469909668s
Received healthy response to inference request in 1.8701245784759521s
Received healthy response to inference request in 2.311310052871704s
Received healthy response to inference request in 2.2262418270111084s
Received healthy response to inference request in 2.1110877990722656s
Received healthy response to inference request in 1.867809772491455s
Received healthy response to inference request in 1.8552005290985107s
Received healthy response to inference request in 1.8931658267974854s
Received healthy response to inference request in 1.8956305980682373s
Received healthy response to inference request in 1.9104394912719727s
Received healthy response to inference request in 2.528343677520752s
Received healthy response to inference request in 1.904381513595581s
Received healthy response to inference request in 1.876939296722412s
Received healthy response to inference request in 1.9607312679290771s
Received healthy response to inference request in 2.1923272609710693s
Received healthy response to inference request in 1.906418800354004s
Received healthy response to inference request in 2.114596366882324s
Received healthy response to inference request in 1.897125005722046s
Received healthy response to inference request in 1.8480520248413086s
Received healthy response to inference request in 2.4210686683654785s
Received healthy response to inference request in 1.9926097393035889s
Received healthy response to inference request in 1.923288106918335s
Received healthy response to inference request in 1.8712117671966553s
30 requests
0 failed requests
5th percentile: 1.8512688517570495
10th percentile: 1.8665488481521606
20th percentile: 1.8757937908172608
30th percentile: 1.8966766834259032
40th percentile: 1.9088312149047852
50th percentile: 1.9438097476959229
60th percentile: 1.97706036567688
70th percentile: 2.1121403694152834
80th percentile: 2.2432554721832276
90th percentile: 2.4275816679000854
95th percentile: 2.50937842130661
99th percentile: 2.5518843793869017
mean time: 2.0491225083669025
Pipeline stage StressChecker completed in 63.75s
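The StressChecker percentiles above can be reproduced from the 30 logged latencies. The sketch below assumes linear interpolation between order statistics (the convention NumPy uses by default). That convention is inferred from the reported numbers, not stated in the log, and the `percentile` helper is illustrative rather than the pipeline's own code.

```python
import math

# Latencies in seconds, copied from the "Received healthy response" lines above.
latencies = [
    1.9268882274627686, 1.9666941165924072, 1.9659838676452637,
    2.56149959564209, 2.486198663711548, 2.315843105316162,
    2.041867256164551, 1.8305964469909668, 1.8701245784759521,
    2.311310052871704, 2.2262418270111084, 2.1110877990722656,
    1.867809772491455, 1.8552005290985107, 1.8931658267974854,
    1.8956305980682373, 1.9104394912719727, 2.528343677520752,
    1.904381513595581, 1.876939296722412, 1.9607312679290771,
    2.1923272609710693, 1.906418800354004, 2.114596366882324,
    1.897125005722046, 1.8480520248413086, 2.4210686683654785,
    1.9926097393035889, 1.923288106918335, 1.8712117671966553,
]


def percentile(values, q):
    """q-th percentile (0-100) with linear interpolation between neighbors."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


for q in (5, 50, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```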
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.59s
Shutdown handler de-registered
chaiml-chaiwill-qwen3-2_84869_v4 status is now deployed due to DeploymentManager action
chaiml-chaiwill-qwen3-2_84869_v4 status is now inactive due to auto deactivation removed underperforming models
chaiml-chaiwill-qwen3-2_84869_v4 status is now torndown due to DeploymentManager action