chaiml-2a6f-69d4-linear_82448

developer_uid: richhx

submission_id: chaiml-2a6f-69d4-linear_82448_v2

model_name: chaiml-2a6f-69d4-linear_82448_v2

model_group: ChaiML/2a6f-69d4-linear-

status: torndown

timestamp: 2026-02-07T20:58:44+00:00

num_battles: 6923

num_wins: 3555

celo_rating: 1324.13

family_friendly_score: 0.5342

family_friendly_standard_error: 0.007054507211705152
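
The reported standard error is consistent with the usual binomial estimate sqrt(p * (1 - p) / n) for a proportion. The sample count n is not stated in this record; n = 5000 is inferred here because it reproduces the reported value, so treat it as an assumption rather than a documented setting. A minimal sketch:

```python
import math


def binomial_standard_error(p: float, n: int) -> float:
    """Standard error of a proportion p estimated from n independent samples."""
    if n <= 0:
        raise ValueError("n must be positive")
    return math.sqrt(p * (1.0 - p) / n)


# p is family_friendly_score from this record.
# n = 5000 is an assumption inferred from the reported standard error.
se = binomial_standard_error(0.5342, 5000)
print(se)
```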

submission_type: basic

model_repo: ChaiML/2a6f-69d4-linear-w01-W4A16-G128-AutoRound

model_architecture: MistralForCausalLM

model_num_parameters: 24096691200.0

best_of: 7

max_input_tokens: 1280

max_output_tokens: 64

reward_model: default

display_name: chaiml-2a6f-69d4-linear_82448_v2

ineligible_reason: model is not deployable

is_internal_developer: True

language_model: ChaiML/2a6f-69d4-linear-w01-W4A16-G128-AutoRound

model_size: 24B

ranking_group: single

us_pacific_date: 2026-01-07

win_ratio: 0.5135057056189514
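
The win_ratio value matches num_wins divided by num_battles as listed above. The helper name below is illustrative and not part of any platform API:

```python
def win_ratio(num_wins: int, num_battles: int) -> float:
    """Fraction of battles won; both counts are taken from this record."""
    if num_battles <= 0:
        raise ValueError("num_battles must be positive")
    return num_wins / num_battles


ratio = win_ratio(3555, 6923)
print(ratio)
```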

generation_params: {'temperature': 0.9, 'top_p': 0.95, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.45, 'frequency_penalty': 0.45, 'stopping_words': ['\n'], 'max_input_tokens': 1280, 'best_of': 7, 'max_output_tokens': 64}

formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
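
As an illustration of how these templates could be combined into a prompt, the sketch below concatenates the memory and prompt templates, one formatted line per turn, and then the response template. The assembly order, the function name build_prompt, and the names "Ava" and "Sam" are assumptions for illustration; the record does not show the platform's actual formatting code.

```python
FORMATTER = {
    "memory_template": "",
    "prompt_template": "",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
    "truncate_by_message": False,
}


def build_prompt(formatter, bot_name, user_name, turns):
    """Join the templates into one prompt string.

    turns is a list of (speaker, message) pairs where speaker is
    either "bot" or "user". This ordering is an assumed convention.
    """
    parts = [formatter["memory_template"], formatter["prompt_template"]]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The prompt ends with the bot's name so the model continues as the bot.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


prompt = build_prompt(
    FORMATTER,
    bot_name="Ava",   # hypothetical name
    user_name="Sam",  # hypothetical name
    turns=[("user", "Hi"), ("bot", "Hello!"), ("user", "How are you?")],
)
print(prompt)
```

With stopping_words set to ['\n'] in generation_params, a reply generated from such a prompt would end at the first newline, which fits the one-line-per-message templates.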

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage VLLMTemplater
Pipeline stage VLLMTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage VLLMDeployer
Creating inference service chaiml-2a6f-69d4-linear-82448-v2
Waiting for inference service chaiml-2a6f-69d4-linear-82448-v2 to be ready
Inference service chaiml-2a6f-69d4-linear-82448-v2 ready after 151.03681540489197s
Pipeline stage VLLMDeployer completed in 151.95s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.3601205348968506s
Received healthy response to inference request in 0.997992992401123s
Received healthy response to inference request in 1.0269622802734375s
Received healthy response to inference request in 1.3255112171173096s
Received healthy response to inference request in 1.0353014469146729s
Received healthy response to inference request in 1.1709234714508057s
Received healthy response to inference request in 1.5243051052093506s
Received healthy response to inference request in 0.9719281196594238s
Received healthy response to inference request in 1.008401870727539s
Received healthy response to inference request in 0.9447660446166992s
Received healthy response to inference request in 1.1937289237976074s
Received healthy response to inference request in 1.6687159538269043s
Received healthy response to inference request in 0.9354424476623535s
Received healthy response to inference request in 0.915428876876831s
Received healthy response to inference request in 1.1256523132324219s
Received healthy response to inference request in 1.1783854961395264s
Received healthy response to inference request in 0.9585361480712891s
Received healthy response to inference request in 1.3262262344360352s
Received healthy response to inference request in 1.0221123695373535s
Received healthy response to inference request in 0.9218683242797852s
Received healthy response to inference request in 1.1557464599609375s
Received healthy response to inference request in 0.9930903911590576s
Received healthy response to inference request in 0.9702913761138916s
Received healthy response to inference request in 1.2135426998138428s
Received healthy response to inference request in 1.3506691455841064s
Received healthy response to inference request in 1.0783710479736328s
Received healthy response to inference request in 1.151092767715454s
Received healthy response to inference request in 0.9210467338562012s
Received healthy response to inference request in 0.9922094345092773s
Received healthy response to inference request in 1.2018563747406006s
30 requests
0 failed requests
5th percentile: 0.921416449546814
10th percentile: 0.9340850353240967
20th percentile: 0.9679403305053711
30th percentile: 0.9928261041641235
40th percentile: 1.0166281700134276
50th percentile: 1.0568362474441528
60th percentile: 1.1529542446136474
70th percentile: 1.1829885244369507
80th percentile: 1.2359364032745364
90th percentile: 1.351614284515381
95th percentile: 1.4504220485687251
99th percentile: 1.626836807727814
mean time: 1.1213408867518107
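
The percentile figures above agree with linear interpolation between the sorted response times, the same convention NumPy uses by default. The sketch below recomputes them from the 30 logged latencies using only the standard library; whether StressChecker itself computes them this way is an assumption.

```python
import math

# Response times in seconds, copied from the StressChecker log lines above.
LATENCIES = [
    1.3601205348968506, 0.997992992401123, 1.0269622802734375,
    1.3255112171173096, 1.0353014469146729, 1.1709234714508057,
    1.5243051052093506, 0.9719281196594238, 1.008401870727539,
    0.9447660446166992, 1.1937289237976074, 1.6687159538269043,
    0.9354424476623535, 0.915428876876831, 1.1256523132324219,
    1.1783854961395264, 0.9585361480712891, 1.3262262344360352,
    1.0221123695373535, 0.9218683242797852, 1.1557464599609375,
    0.9930903911590576, 0.9702913761138916, 1.2135426998138428,
    1.3506691455841064, 1.0783710479736328, 1.151092767715454,
    0.9210467338562012, 0.9922094345092773, 1.2018563747406006,
]


def percentile(values, q):
    """q-th percentile (0 to 100) with linear interpolation between ranks."""
    if not values:
        raise ValueError("values must not be empty")
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0
    lower = math.floor(position)
    upper = math.ceil(position)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(LATENCIES, q)}")
print(f"mean time: {sum(LATENCIES) / len(LATENCIES)}")
```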
Pipeline stage StressChecker completed in 36.53s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.83s
Shutdown handler de-registered
chaiml-2a6f-69d4-linear_82448_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Generating Leaderboard row for %s
Generated Leaderboard row for %s
Pipeline stage OfflineFamilyFriendlyScorer completed in 2142.91s
Shutdown handler de-registered
chaiml-2a6f-69d4-linear_82448_v2 status is now protected due to ABTestQueueItem
chaiml-2a6f-69d4-linear_82448_v2 status is now inactive due to ABTestQueueItem
chaiml-2a6f-69d4-linear_82448_v2 status is now torndown due to DeploymentManager action
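
To compare how long each stage took, the "Pipeline stage ... completed in ...s" lines can be parsed with a regular expression. This is a minimal sketch that assumes the line format seen in this log; the names STAGE_RE and stage_durations are illustrative.

```python
import re

# Matches lines such as "Pipeline stage VLLMDeployer completed in 151.95s".
STAGE_RE = re.compile(r"^Pipeline stage (\w+) completed in ([0-9.]+)s$")


def stage_durations(log_lines):
    """Map each pipeline stage name to its reported duration in seconds."""
    durations = {}
    for line in log_lines:
        match = STAGE_RE.match(line.strip())
        if match:
            durations[match.group(1)] = float(match.group(2))
    return durations


# Completion lines copied from the log above.
LOG = [
    "Pipeline stage VLLMTemplater completed in 0.14s",
    "Pipeline stage VLLMDeployer completed in 151.95s",
    "Pipeline stage StressChecker completed in 36.53s",
    "Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.83s",
    "Pipeline stage OfflineFamilyFriendlyScorer completed in 2142.91s",
]

durations = stage_durations(LOG)
for stage, seconds in sorted(durations.items(), key=lambda item: -item[1]):
    print(f"{stage}: {seconds:.2f}s")
print(f"total: {sum(durations.values()):.2f}s")
```

In this log, OfflineFamilyFriendlyScorer accounts for most of the recorded stage time.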