chaiml-chaiwill-qwen3-2_84869

developer_uid: chai_backend_admin

submission_id: chaiml-chaiwill-qwen3-2_84869_v7

model_name: chaiml-chaiwill-qwen3-2_84869_v7

model_group: ChaiML/chaiwill-qwen3-23

status: torndown

timestamp: 2025-12-23T22:17:28+00:00

num_battles: 6873

num_wins: 3923

celo_rating: 1342.9

family_friendly_score: 0.0

family_friendly_standard_error: 0.0

submission_type: basic

model_repo: ChaiML/chaiwill-qwen3-235b-a22b-instruct-2507-opus-distil-500k-qwen235b-dpo-round2-20faf44c-int4-mixed

model_architecture: Qwen3MoeForCausalLM

model_num_parameters: 18790207488.0

best_of: 6

max_input_tokens: 1992

max_output_tokens: 80

reward_model: default

display_name: chaiml-chaiwill-qwen3-2_84869_v7

ineligible_reason: max_output_tokens!=64

is_internal_developer: True

language_model: ChaiML/chaiwill-qwen3-235b-a22b-instruct-2507-opus-distil-500k-qwen235b-dpo-round2-20faf44c-int4-mixed

model_size: 19B

ranking_group: single

us_pacific_date: 2025-12-23

win_ratio: 0.570784228139095
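
The reported win_ratio appears to be num_wins divided by num_battles. A minimal check, using only the two counts listed above (the variable names simply mirror the field names; how the leaderboard itself computes the value is an assumption):

```python
# Counts copied from the num_battles and num_wins fields above.
num_battles = 6873
num_wins = 3923

# Assumed definition: fraction of battles won.
win_ratio = num_wins / num_battles
print(round(win_ratio, 6))  # 0.570784
```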

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['<|assistant|>', '</s>', '<|user|>', '</think>', '####', '<|im_end|>'], 'max_input_tokens': 1992, 'best_of': 6, 'max_output_tokens': 80}

formatter: {'memory_template': '<|im_start|>system\nYou are {bot_name} engaged in a roleplay with user.<|im_end|>\n', 'prompt_template': '', 'bot_template': '<|im_start|>assistant\n{message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n', 'truncate_by_message': True}
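
The formatter templates follow a ChatML-style layout. The sketch below shows one plausible way such templates could be combined into a prompt; it is not the serving code. The function `build_prompt`, the bot name `Aria`, and the sample turns are hypothetical, and truncation (`truncate_by_message`, `max_input_tokens`) is deliberately not modeled because the source does not describe how it is applied.

```python
# Templates copied from the formatter field above.
formatter = {
    "memory_template": "<|im_start|>system\nYou are {bot_name} engaged in a roleplay with user.<|im_end|>\n",
    "prompt_template": "",
    "bot_template": "<|im_start|>assistant\n{message}<|im_end|>\n",
    "user_template": "<|im_start|>user\n{message}<|im_end|>\n",
    "response_template": "<|im_start|>assistant\n",
    "truncate_by_message": True,
}


def build_prompt(bot_name, turns):
    """Hypothetical helper: join the templates in conversation order.

    turns is a list of (role, message) pairs, where role is "bot" or "user".
    """
    parts = [formatter["memory_template"].format(bot_name=bot_name)]
    # prompt_template is empty for this submission, so it is skipped here.
    for role, message in turns:
        key = "bot_template" if role == "bot" else "user_template"
        parts.append(formatter[key].format(message=message))
    # The prompt ends with an open assistant turn for the model to complete.
    parts.append(formatter["response_template"])
    return "".join(parts)


prompt = build_prompt(
    "Aria",  # hypothetical bot name
    [("user", "Hi"), ("bot", "Hello"), ("user", "How are you?")],
)
print(prompt)
```

Under this reading, the `<|im_end|>` entry in `stopping_words` would end generation at the close of the assistant turn.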

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-chaiwill-qwen3-2-84869-v7
Waiting for inference service chaiml-chaiwill-qwen3-2-84869-v7 to be ready
Inference service chaiml-chaiwill-qwen3-2-84869-v7 ready after 280.34086418151855s
Pipeline stage VLLMDeployer completed in 280.69s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.429105281829834s
Received healthy response to inference request in 2.1489055156707764s
Received healthy response to inference request in 2.2772228717803955s
Received healthy response to inference request in 2.220627784729004s
Received healthy response to inference request in 2.2061214447021484s
Received healthy response to inference request in 2.201582193374634s
Received healthy response to inference request in 2.0587570667266846s
Received healthy response to inference request in 2.0771591663360596s
Received healthy response to inference request in 2.100740432739258s
Received healthy response to inference request in 2.0174596309661865s
Received healthy response to inference request in 2.5240907669067383s
Received healthy response to inference request in 2.2248308658599854s
Received healthy response to inference request in 2.0831186771392822s
Received healthy response to inference request in 2.0944952964782715s
Received healthy response to inference request in 2.08239483833313s
Received healthy response to inference request in 2.0782198905944824s
Received healthy response to inference request in 2.117701768875122s
Received healthy response to inference request in 2.1067256927490234s
Received healthy response to inference request in 2.323915719985962s
Received healthy response to inference request in 2.233671188354492s
Received healthy response to inference request in 2.3041627407073975s
Received healthy response to inference request in 2.0752675533294678s
Received healthy response to inference request in 2.256763219833374s
Received healthy response to inference request in 2.1430001258850098s
Received healthy response to inference request in 2.1019954681396484s
Received healthy response to inference request in 2.2568609714508057s
Received healthy response to inference request in 2.1273386478424072s
Received healthy response to inference request in 2.075596809387207s
Received healthy response to inference request in 2.3717827796936035s
Received healthy response to inference request in 2.0841310024261475s
30 requests
0 failed requests
5th percentile: 2.066186785697937
10th percentile: 2.075563883781433
20th percentile: 2.0815598487854006
30th percentile: 2.091386008262634
40th percentile: 2.1048336029052734
50th percentile: 2.1351693868637085
60th percentile: 2.2033978939056396
70th percentile: 2.2274829626083372
80th percentile: 2.260933351516724
90th percentile: 2.328702425956726
95th percentile: 2.40331015586853
99th percentile: 2.496544976234436
mean time: 2.180124847094218
Pipeline stage StressChecker completed in 68.20s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.84s
Shutdown handler de-registered
chaiml-chaiwill-qwen3-2_84869_v7 status is now deployed due to DeploymentManager action
chaiml-chaiwill-qwen3-2_84869_v7 status is now inactive due to auto deactivation removed underperforming models
chaiml-chaiwill-qwen3-2_84869_v7 status is now protected due to ABTestQueueItem
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMDeleter
%s, retrying in %s seconds...
%s, retrying in %s seconds...
clean up pipeline due to error=TeardownError('File does not exist: /var/folders/xd/58mw244n4_l1_l7jvj0f_wf40000gn/T/tmpmqr2ah9z')
Shutdown handler de-registered
chaiml-chaiwill-qwen3-2_84869_v7 status is now torndown due to DeploymentManager action
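
The StressChecker summary in the log above can be reproduced from the 30 logged response times. The sketch below assumes percentiles are computed with linear interpolation between order statistics (the NumPy default); the log does not say which method the pipeline uses, but this assumption matches the reported figures. The helper `percentile` is illustrative, not part of the pipeline.

```python
import math

# Response times in seconds, copied from the StressChecker log lines.
latencies = [
    2.429105281829834, 2.1489055156707764, 2.2772228717803955,
    2.220627784729004, 2.2061214447021484, 2.201582193374634,
    2.0587570667266846, 2.0771591663360596, 2.100740432739258,
    2.0174596309661865, 2.5240907669067383, 2.2248308658599854,
    2.0831186771392822, 2.0944952964782715, 2.08239483833313,
    2.0782198905944824, 2.117701768875122, 2.1067256927490234,
    2.323915719985962, 2.233671188354492, 2.3041627407073975,
    2.0752675533294678, 2.256763219833374, 2.1430001258850098,
    2.1019954681396484, 2.2568609714508057, 2.1273386478424072,
    2.075596809387207, 2.3717827796936035, 2.0841310024261475,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    # Interpolate between the two nearest order statistics.
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


mean_time = sum(latencies) / len(latencies)
p50 = percentile(latencies, 50)
p99 = percentile(latencies, 99)
print(f"{len(latencies)} requests, mean {mean_time:.3f}s, p50 {p50:.3f}s, p99 {p99:.3f}s")
```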