developer_uid: rirv938
submission_id: chaiml-mistral31-24b-s_69496_v16
model_name: chaiml-mistral31-24b-s_69496_v16
model_group: ChaiML/mistral31-24b-sim
status: torndown
timestamp: 2025-06-30T20:30:12+00:00
num_battles: 9087
num_wins: 4662
celo_rating: 1293.43
family_friendly_score: 0.493
family_friendly_standard_error: 0.007070374813261317
submission_type: basic
model_repo: ChaiML/mistral31-24b-simpoexp1-s1-new-sft-retryv2top20lex-2e
model_architecture: MistralForCausalLM
model_num_parameters: 24096691200.0
best_of: 8
max_input_tokens: 768
max_output_tokens: 64
reward_model: default
latencies: [{'batch_size': 1, 'throughput': 0.5517822922026744, 'latency_mean': 1.812209082841873, 'latency_p50': 1.8120911121368408, 'latency_p90': 2.009126162528992}, {'batch_size': 3, 'throughput': 1.1378819123077653, 'latency_mean': 2.628240548372269, 'latency_p50': 2.643193483352661, 'latency_p90': 2.8893970251083374}, {'batch_size': 5, 'throughput': 1.4772532623640078, 'latency_mean': 3.35999893784523, 'latency_p50': 3.359512448310852, 'latency_p90': 3.720719337463379}, {'batch_size': 6, 'throughput': 1.589083618579658, 'latency_mean': 3.7441518115997314, 'latency_p50': 3.7229156494140625, 'latency_p90': 4.20874662399292}, {'batch_size': 8, 'throughput': 1.7530409189361267, 'latency_mean': 4.509547457695008, 'latency_p50': 4.496449708938599, 'latency_p90': 5.072184228897095}, {'batch_size': 10, 'throughput': 1.872715778282665, 'latency_mean': 5.275798701047897, 'latency_p50': 5.2802876234054565, 'latency_p90': 5.924876713752747}]
gpu_counts: {'NVIDIA A100-SXM4-80GB': 1}
display_name: chaiml-mistral31-24b-s_69496_v16
is_internal_developer: True
language_model: ChaiML/mistral31-24b-simpoexp1-s1-new-sft-retryv2top20lex-2e
model_size: 24B
ranking_group: single
throughput_3p7s: 1.59
us_pacific_date: 2025-06-30
win_ratio: 0.5130406074612083
generation_params: {'temperature': 1.0, 'top_p': 0.95, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.45, 'frequency_penalty': 0.45, 'stopping_words': ['###', '</s>', '\n', 'You:'], 'max_input_tokens': 768, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
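The `formatter` templates above suggest a one-line-per-turn prompt layout. The following is a minimal sketch of how such templates could be rendered into a prompt. The serving code is not part of this record, so `build_prompt`, the `turns` structure, and the example names are hypothetical; only the template strings are copied from the `formatter` field.

```python
# Template strings copied from the `formatter` field of this record.
FORMATTER = {
    "memory_template": "",
    "prompt_template": "",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(turns, bot_name, user_name, formatter=FORMATTER):
    """Hypothetical helper: render (speaker, message) turns into one prompt.

    `speaker` is either "bot" or "user"; this convention is an assumption
    made for the sketch, not something the record specifies.
    """
    # Both leading templates are empty strings in this submission.
    parts = [formatter["memory_template"], formatter["prompt_template"]]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The prompt ends with the bot's name so the model completes its reply.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example_prompt = build_prompt(
    [("user", "Hi there"), ("bot", "Hello!"), ("user", "How are you?")],
    bot_name="Bot",
    user_name="You",
)
print(example_prompt)
```

Under this layout each turn ends with a newline, which is consistent with `'\n'` and `'You:'` appearing among the `stopping_words` in `generation_params`: generation would stop at the end of the bot's line or when the model starts writing the user's turn.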
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-mistral31-24b-s-69496-v16-mkmlizer
Waiting for job on chaiml-mistral31-24b-s-69496-v16-mkmlizer to finish
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ██████ ██████ █████ ████ ████ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ░░██████ ██████ ░░███ ███░ ░░███ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ░███░█████░███ ░███ ███ ░███ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ░███░░███ ░███ ░███████ ░███ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ░███ ░░░ ░███ ░███░░███ ░███ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ░███ ░███ ░███ ░░███ ░███ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ █████ █████ █████ ░░████ █████ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ░░░░░ ░░░░░ ░░░░░ ░░░░ ░░░░░ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ Version: 0.29.3 ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ Features: FLYWHEEL, CUDA ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ Copyright 2023-2025 MK ONE TECHNOLOGIES Inc. ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ https://mk1.ai ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ belonging to: ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ Chai Research Corp. ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ Expiration: 2028-03-31 23:59:59 ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ║ ║
chaiml-mistral31-24b-s-69496-v16-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-mistral31-24b-s-69496-v16-mkmlizer: Downloaded to shared memory in 48.869s
chaiml-mistral31-24b-s-69496-v16-mkmlizer: Checking if ChaiML/mistral31-24b-simpoexp1-s1-new-sft-retryv2top20lex-2e already exists in ChaiML
chaiml-mistral31-24b-s-69496-v16-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp1oa4f1nk, device:0
chaiml-mistral31-24b-s-69496-v16-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-mistral31-24b-s-69496-v16-mkmlizer: quantized model in 48.812s
chaiml-mistral31-24b-s-69496-v16-mkmlizer: Processed model ChaiML/mistral31-24b-simpoexp1-s1-new-sft-retryv2top20lex-2e in 97.681s
chaiml-mistral31-24b-s-69496-v16-mkmlizer: creating bucket guanaco-mkml-models
chaiml-mistral31-24b-s-69496-v16-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-mistral31-24b-s-69496-v16-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-mistral31-24b-s-69496-v16
chaiml-mistral31-24b-s-69496-v16-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-mistral31-24b-s-69496-v16/config.json
chaiml-mistral31-24b-s-69496-v16-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-mistral31-24b-s-69496-v16/special_tokens_map.json
chaiml-mistral31-24b-s-69496-v16-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-mistral31-24b-s-69496-v16/tokenizer_config.json
chaiml-mistral31-24b-s-69496-v16-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-mistral31-24b-s-69496-v16/tokenizer.json
chaiml-mistral31-24b-s-69496-v16-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/chaiml-mistral31-24b-s-69496-v16/flywheel_model.1.safetensors
chaiml-mistral31-24b-s-69496-v16-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-mistral31-24b-s-69496-v16/flywheel_model.0.safetensors
chaiml-mistral31-24b-s-69496-v16-mkmlizer: Loading 0: 0%| | 0/363 [00:00<?, ?it/s]
chaiml-mistral31-24b-s-69496-v16-mkmlizer: Loading 0: 55%|█████▌ | 201/363 [00:22<03:03, 1.13s/it]
chaiml-mistral31-24b-s-69496-v16-mkmlizer: Loading 0: 99%|█████████▊| 358/363 [00:29<00:00, 22.94it/s]
Job chaiml-mistral31-24b-s-69496-v16-mkmlizer completed after 126.35s with status: succeeded
Stopping job with name chaiml-mistral31-24b-s-69496-v16-mkmlizer
Pipeline stage MKMLizer completed in 126.94s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.17s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-mistral31-24b-s-69496-v16
Waiting for inference service chaiml-mistral31-24b-s-69496-v16 to be ready
Inference service chaiml-mistral31-24b-s-69496-v16 ready after 211.01552081108093s
Pipeline stage MKMLDeployer completed in 211.55s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.2990052700042725s
Received healthy response to inference request in 2.368696451187134s
Received healthy response to inference request in 2.190910577774048s
Received healthy response to inference request in 2.5616915225982666s
5 requests
1 failed requests
5th percentile: 2.226467752456665
10th percentile: 2.262024927139282
20th percentile: 2.3331392765045167
30th percentile: 2.4072954654693604
40th percentile: 2.4844934940338135
50th percentile: 2.5616915225982666
60th percentile: 2.8566170215606688
70th percentile: 3.151542520523071
80th percentile: 6.692668390274051
90th percentile: 13.4799946308136
95th percentile: 16.87365775108337
99th percentile: 19.588588247299192
mean time: 6.137524938583374
%s, retrying in %s seconds...
Received healthy response to inference request in 2.2261693477630615s
Received healthy response to inference request in 2.2522988319396973s
Received healthy response to inference request in 2.2098395824432373s
Received healthy response to inference request in 2.217709541320801s
Received healthy response to inference request in 2.2107253074645996s
5 requests
0 failed requests
5th percentile: 2.2100167274475098
10th percentile: 2.2101938724517822
20th percentile: 2.210548162460327
30th percentile: 2.21212215423584
40th percentile: 2.2149158477783204
50th percentile: 2.217709541320801
60th percentile: 2.221093463897705
70th percentile: 2.2244773864746095
80th percentile: 2.2313952445983887
90th percentile: 2.241847038269043
95th percentile: 2.24707293510437
99th percentile: 2.2512536525726317
mean time: 2.223348522186279
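The percentile and mean figures in the second StressChecker run can be reproduced from the five logged response times. The sketch below assumes linear interpolation between order statistics (the default behaviour of NumPy's `percentile`); the StressChecker source is not shown in this log, so this is a reconstruction that happens to match the reported values, not the pipeline's confirmed method. The function name `percentile` and the variable names are illustrative.

```python
import math

# Response times (seconds) copied from the five healthy responses
# logged in the second StressChecker run.
response_times = [
    2.2261693477630615,
    2.2522988319396973,
    2.2098395824432373,
    2.217709541320801,
    2.2107253074645996,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumption: the pipeline interpolates linearly between the two
    nearest order statistics; the log does not state this explicitly.
    """
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into `ordered`
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(response_times) / len(response_times)

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(response_times, q)}")
print(f"mean time: {mean_time}")
```

The same arithmetic applied to the first run suggests that the timed-out request was counted at roughly 20.27 s: the reported mean of about 6.14 s over five requests, minus the four healthy response times, leaves that remainder, which is in line with the 20 s read timeout in the error message.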
Pipeline stage StressChecker completed in 45.52s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.86s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.83s
Shutdown handler de-registered
chaiml-mistral31-24b-s_69496_v16 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 5304.13s
Shutdown handler de-registered
chaiml-mistral31-24b-s_69496_v16 status is now inactive due to auto deactivation removed underperforming models
chaiml-mistral31-24b-s_69496_v16 status is now torndown due to DeploymentManager action