developer_uid: NischayDnk
submission_id: arliai-mistral-small-24_72454_v1
model_name: mistral-small-3-24b-arlisft
model_group: ArliAI/Mistral-Small-24B
status: torndown
timestamp: 2025-02-21T06:09:28+00:00
num_battles: 7695
num_wins: 3561
celo_rating: 1251.33
family_friendly_score: 0.5688
family_friendly_standard_error: 0.00700380696478708
submission_type: basic
model_repo: ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4
model_architecture: MistralForCausalLM
model_num_parameters: 24096691200.0
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
reward_model: default
latencies: [{'batch_size': 1, 'throughput': 0.3690653768422302, 'latency_mean': 2.7094593262672424, 'latency_p50': 2.72083055973053, 'latency_p90': 3.0185778856277468}, {'batch_size': 2, 'throughput': 0.5685750555551543, 'latency_mean': 3.50814888715744, 'latency_p50': 3.5360794067382812, 'latency_p90': 3.871333885192871}, {'batch_size': 3, 'throughput': 0.7251695867446892, 'latency_mean': 4.125117843151092, 'latency_p50': 4.090817451477051, 'latency_p90': 4.513516569137574}, {'batch_size': 4, 'throughput': 0.851494805643532, 'latency_mean': 4.685441383123398, 'latency_p50': 4.6797837018966675, 'latency_p90': 5.2357570886611935}, {'batch_size': 5, 'throughput': 0.9319248374795278, 'latency_mean': 5.332217888832092, 'latency_p50': 5.332700133323669, 'latency_p90': 6.002668261528015}]
gpu_counts: {'NVIDIA RTX A6000': 1}
display_name: mistral-small-3-24b-arlisft
is_internal_developer: False
language_model: ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4
model_size: 24B
ranking_group: single
throughput_3p7s: 0.62
us_pacific_date: 2025-02-20
win_ratio: 0.46276803118908383
generation_params: {'temperature': 0.9, 'top_p': 0.95, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
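The formatter entry above is easier to read with a worked example. The following is a minimal sketch of how these templates might be combined into one prompt string. The assembly order (memory, prompt, conversation turns, response prefix), the `build_prompt` helper, and the sample names and messages are assumptions for illustration and are not taken from this record. Truncation to `max_input_tokens` is not modelled.

```python
# Template strings copied from the formatter entry in this record.
formatter = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    either "bot" or "user".
    """
    parts = [
        formatter["memory_template"].format(bot_name=bot_name, memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response prefix cues the model to answer as the bot.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up sample conversation, for illustration only.
example = build_prompt(
    bot_name="Ava",
    user_name="Sam",
    memory="A cheerful guide.",
    prompt="Ava greets Sam.",
    turns=[("user", "Hi!"), ("bot", "Hello, Sam!"), ("user", "How are you?")],
)
print(example)
```

Because `stopping_words` is `['\n']` and the prompt ends with the bot's name and a colon, a completion under this layout would be a single line of bot dialogue.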
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name arliai-mistral-small-24-72454-v1-mkmlizer
Waiting for job on arliai-mistral-small-24-72454-v1-mkmlizer to finish
arliai-mistral-small-24-72454-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
arliai-mistral-small-24-72454-v1-mkmlizer: ║ _____ __ __ ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ /___/ ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ Version: 0.12.8 ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ https://mk1.ai ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ The license key for the current software has been verified as ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ belonging to: ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ Chai Research Corp. ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ Expiration: 2025-04-15 23:59:59 ║
arliai-mistral-small-24-72454-v1-mkmlizer: ║ ║
arliai-mistral-small-24-72454-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
arliai-mistral-small-24-72454-v1-mkmlizer: Downloaded to shared memory in 88.357s
arliai-mistral-small-24-72454-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpqaqh_5nh, device:0
arliai-mistral-small-24-72454-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
arliai-mistral-small-24-72454-v1-mkmlizer: quantized model in 54.509s
arliai-mistral-small-24-72454-v1-mkmlizer: Processed model ArliAI/Mistral-Small-24B-ArliAI-RPMax-v1.4 in 142.867s
arliai-mistral-small-24-72454-v1-mkmlizer: creating bucket guanaco-mkml-models
arliai-mistral-small-24-72454-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
arliai-mistral-small-24-72454-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/arliai-mistral-small-24-72454-v1
arliai-mistral-small-24-72454-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/arliai-mistral-small-24-72454-v1/config.json
arliai-mistral-small-24-72454-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/arliai-mistral-small-24-72454-v1/special_tokens_map.json
arliai-mistral-small-24-72454-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/arliai-mistral-small-24-72454-v1/tokenizer_config.json
arliai-mistral-small-24-72454-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/arliai-mistral-small-24-72454-v1/tokenizer.json
arliai-mistral-small-24-72454-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/arliai-mistral-small-24-72454-v1/flywheel_model.1.safetensors
arliai-mistral-small-24-72454-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/arliai-mistral-small-24-72454-v1/flywheel_model.0.safetensors
arliai-mistral-small-24-72454-v1-mkmlizer: Loading 0: 99%|█████████▊| 358/363 [00:35<00:00, 6.14it/s]
Job arliai-mistral-small-24-72454-v1-mkmlizer completed after 175.61s with status: succeeded
Stopping job with name arliai-mistral-small-24-72454-v1-mkmlizer
Pipeline stage MKMLizer completed in 177.14s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service arliai-mistral-small-24-72454-v1
Waiting for inference service arliai-mistral-small-24-72454-v1 to be ready
Inference service arliai-mistral-small-24-72454-v1 ready after 220.83780980110168s
Pipeline stage MKMLDeployer completed in 221.35s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.280107021331787s
Received healthy response to inference request in 1.6039063930511475s
Received healthy response to inference request in 2.465754508972168s
Received healthy response to inference request in 2.4036600589752197s
Received healthy response to inference request in 2.5761122703552246s
5 requests
0 failed requests
5th percentile: 1.7391465187072754
10th percentile: 1.8743866443634034
20th percentile: 2.1448668956756594
30th percentile: 2.304817628860474
40th percentile: 2.3542388439178468
50th percentile: 2.4036600589752197
60th percentile: 2.428497838973999
70th percentile: 2.4533356189727784
80th percentile: 2.4878260612487795
90th percentile: 2.531969165802002
95th percentile: 2.554040718078613
99th percentile: 2.571697959899902
mean time: 2.265908050537109
Pipeline stage StressChecker completed in 12.56s
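The percentile and mean figures reported by the StressChecker stage can be reproduced from the five response times logged above. The sketch below assumes linear interpolation between order statistics, which matches the logged values. The `percentile` helper is hypothetical and is not the pipeline's own code.

```python
# Response times (seconds) copied from the five "healthy response" lines.
latencies = [
    2.280107021331787,
    1.6039063930511475,
    2.465754508972168,
    2.4036600589752197,
    2.5761122703552246,
]


def percentile(values, q):
    """Percentile q (0 to 100) using linear interpolation between sorted points."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted data
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only five samples, every percentile above the 75th is interpolated between the two slowest requests, so the tail figures say little about real tail latency.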
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.64s
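The "Pipeline stage ... completed in ...s" lines up to this point can be totalled to see where the deployment run spent its time. This is a minimal sketch. The regular expression and the `stage_durations` helper are illustrative assumptions about the line format seen in this log, not part of the pipeline.

```python
import re

# Matches lines such as "Pipeline stage MKMLizer completed in 177.14s".
STAGE_RE = re.compile(r"^Pipeline stage (\w+) completed in ([0-9.]+)s$")

# Completion lines copied from the first pipeline run in this log.
log_lines = [
    "Pipeline stage MKMLizer completed in 177.14s",
    "Pipeline stage MKMLTemplater completed in 0.14s",
    "Pipeline stage MKMLDeployer completed in 221.35s",
    "Pipeline stage StressChecker completed in 12.56s",
    "Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s",
    "Pipeline stage TriggerMKMLProfilingPipeline completed in 0.64s",
]


def stage_durations(lines):
    """Return {stage_name: seconds} for every matching completion line."""
    durations = {}
    for line in lines:
        match = STAGE_RE.match(line.strip())
        if match:
            durations[match.group(1)] = float(match.group(2))
    return durations


durations = stage_durations(log_lines)
total_seconds = round(sum(durations.values()), 2)
slowest_stage = max(durations, key=durations.get)
print(total_seconds, slowest_stage)
```

On these six lines, the quantization (MKMLizer) and deployment (MKMLDeployer) stages account for nearly all of the elapsed time.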
Shutdown handler de-registered
arliai-mistral-small-24_72454_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.10s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.09s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service arliai-mistral-small-24-72454-v1-profiler
Waiting for inference service arliai-mistral-small-24-72454-v1-profiler to be ready
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 5028.80s
Shutdown handler de-registered
arliai-mistral-small-24_72454_v1 status is now inactive due to auto deactivation removed underperforming models
arliai-mistral-small-24_72454_v1 status is now torndown due to DeploymentManager action