developer_uid: bogoconic1
submission_id: meta-llama-llama-3-1-8b_7331_v15
model_name: meta-llama-llama-3-1-8b_7331_v15
model_group: meta-llama/Llama-3.1-8B-
status: torndown
timestamp: 2025-06-24T06:51:04+00:00
num_battles: 10665
num_wins: 4281
celo_rating: 1210.16
family_friendly_score: 0.6462
family_friendly_standard_error: 0.006762034605057859
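A note on the two fields above: the reported standard error is consistent with treating the family-friendly score as a sample proportion, se = sqrt(p(1-p)/n). Solving for n from the logged values gives almost exactly 5,000 ratings; that sample size is an inference from the numbers, not something stated in this log.

```python
import math

p = 0.6462                     # family_friendly_score from the log
se = 0.006762034605057859      # family_friendly_standard_error from the log

# Implied sample size if se is a binomial standard error: n = p(1-p) / se^2
n = p * (1 - p) / se**2        # ~5000 (inferred, not stated in the log)

# Forward check: recompute the standard error assuming n = 5000
se_check = math.sqrt(p * (1 - p) / 5000)
```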
submission_type: basic
model_repo: meta-llama/Llama-3.1-8B-Instruct
model_architecture: LlamaForCausalLM
model_num_parameters: 8030261248
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
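best_of: 8 usually means eight candidate completions are sampled per request and one is returned, typically the highest-scoring under some selection function; the actual selection criterion is not shown in this log. A minimal hypothetical sketch, with sample_completion and score standing in for the real sampler and reward model:

```python
import random

def sample_completion(prompt: str, rng: random.Random) -> str:
    # Placeholder for a real sampled LLM completion.
    return f"{prompt} ... candidate #{rng.randint(0, 9999)}"

def score(completion: str) -> float:
    # Placeholder for a real reward / preference model.
    return float(len(completion) % 7)

def best_of_n(prompt: str, n: int = 8, seed: int = 0) -> str:
    # Sample n candidates and keep the one the scorer likes best.
    rng = random.Random(seed)
    candidates = [sample_completion(prompt, rng) for _ in range(n)]
    return max(candidates, key=score)

reply = best_of_n("Hello", n=8)
```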
latencies (per-batch-size benchmark, times in seconds, throughput in req/s, values rounded to 3 decimals):
  batch_size | throughput | latency_mean | latency_p50 | latency_p90
  1          | 0.852      | 1.173        | 1.172       | 1.294
  4          | 1.743      | 2.287        | 2.283       | 2.523
  5          | 1.870      | 2.663        | 2.656       | 3.001
  8          | 2.091      | 3.802        | 3.799       | 4.249
  10         | 2.126      | 4.664        | 4.662       | 5.187
  12         | 2.168      | 5.482        | 5.488       | 6.231
  15         | 2.160      | 6.843        | 6.853       | 7.610
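As a rough sanity check, the reported throughput tracks batch_size / latency_mean. The two are measured independently, so agreement is only approximate; in this log every row lands within about 2%.

```python
# (batch_size, throughput, latency_mean) rows copied from the latencies field above
latencies = [
    (1, 0.8524445456039822, 1.172925248146057),
    (4, 1.742591803429816, 2.286733229160309),
    (5, 1.8701622033654612, 2.663074880838394),
    (8, 2.0906217652135055, 3.8023793792724607),
    (10, 2.125671884594985, 4.663769690990448),
    (12, 2.1684616835659782, 5.4824209094047545),
    (15, 2.1595396370255644, 6.843355921506881),
]

# Relative gap between reported throughput and batch_size / latency_mean.
rel_errors = [abs(batch / mean - thr) / thr for batch, thr, mean in latencies]
max_rel_error = max(rel_errors)  # under 2% for every row in this log
```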
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: meta-llama-llama-3-1-8b_7331_v15
is_internal_developer: True
language_model: meta-llama/Llama-3.1-8B-Instruct
model_size: 8B
ranking_group: single
throughput_3p7s: 2.09
us_pacific_date: 2025-06-23
win_ratio: 0.40140646976090016
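win_ratio is simply num_wins / num_battles; the logged value reproduces exactly from the counts above:

```python
num_battles = 10665  # num_battles from the log
num_wins = 4281      # num_wins from the log

# Matches the logged win_ratio of 0.40140646976090016 to float precision.
win_ratio = num_wins / num_battles
```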
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
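Of the sampling settings above, only top_k=40 and the '\n' stopping word actively constrain decoding (temperature 1.0, top_p 1.0, min_p 0.0, and zero penalties are all pass-through values). A minimal pure-Python sketch of top-k filtering, not the serving stack's actual implementation:

```python
import math

def top_k_filter(logits, k):
    """Keep the k largest logits; set the rest to -inf so they can't be sampled."""
    if k >= len(logits):
        return list(logits)
    threshold = sorted(logits, reverse=True)[k - 1]
    kept = 0
    out = []
    for x in logits:
        # The kept counter breaks ties at the threshold deterministically.
        if x >= threshold and kept < k:
            out.append(x)
            kept += 1
        else:
            out.append(-math.inf)
    return out

filtered = top_k_filter([2.0, 0.5, 1.0, -1.0, 3.0], k=2)
# Only the two largest logits (3.0 and 2.0) remain finite.
```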
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
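The formatter entries are plain placeholder templates. Assuming they are rendered with Python-style str.format substitution (a reasonable reading, though the renderer itself is not shown here), a conversation prompt assembles like this; the names and messages are made up for illustration:

```python
# Templates copied from the formatter field above.
formatter = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}

def build_prompt(bot_name, memory, prompt, turns):
    # Persona block, then scenario prompt, then the chat turns,
    # ending with the response template that cues the bot's next line.
    parts = [formatter["memory_template"].format(bot_name=bot_name, memory=memory)]
    parts.append(formatter["prompt_template"].format(prompt=prompt))
    for speaker, message in turns:
        if speaker == bot_name:
            parts.append(formatter["bot_template"].format(bot_name=speaker, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=speaker, message=message))
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)

text = build_prompt("Ava", "a friendly guide", "Ava greets the user.", [("User", "Hi!")])
```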
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name meta-llama-llama-3-1-8b-7331-v15-mkmlizer
Waiting for job on meta-llama-llama-3-1-8b-7331-v15-mkmlizer to finish
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ██████ ██████ █████ ████ ████ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ░░██████ ██████ ░░███ ███░ ░░███ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ░███░█████░███ ░███ ███ ░███ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ░███░░███ ░███ ░███████ ░███ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ░███ ░░░ ░███ ░███░░███ ░███ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ░███ ░███ ░███ ░░███ ░███ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ █████ █████ █████ ░░████ █████ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ░░░░░ ░░░░░ ░░░░░ ░░░░ ░░░░░ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ Version: 0.29.3 ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ Features: FLYWHEEL, CUDA ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ Copyright 2023-2025 MK ONE TECHNOLOGIES Inc. ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ https://mk1.ai ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ The license key for the current software has been verified as ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ belonging to: ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ Chai Research Corp. ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ Expiration: 2028-03-31 23:59:59 ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: Downloaded to shared memory in 43.910s
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: Checking if meta-llama/Llama-3.1-8B-Instruct already exists in ChaiML
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: Creating repo ChaiML/Llama-3.1-8B-Instruct and uploading /tmp/tmpq6x9btb_ to it
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: 100%|██████████| 6/6 [00:31<00:00, 5.17s/it]
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpq6x9btb_, device:0
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: Saving flywheel model at /dev/shm/model_cache
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: quantized model in 22.299s
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: Processed model meta-llama/Llama-3.1-8B-Instruct in 138.664s
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: creating bucket guanaco-mkml-models
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-7331-v15
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-7331-v15/config.json
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-7331-v15/special_tokens_map.json
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-7331-v15/tokenizer_config.json
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-7331-v15/tokenizer.json
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-7331-v15/flywheel_model.0.safetensors
meta-llama-llama-3-1-8b-7331-v15-mkmlizer: Loading 0: 99%|█████████▊| 287/291 [00:07<00:00, 24.66it/s]
Job meta-llama-llama-3-1-8b-7331-v15-mkmlizer completed after 166.59s with status: succeeded
Stopping job with name meta-llama-llama-3-1-8b-7331-v15-mkmlizer
Pipeline stage MKMLizer completed in 167.42s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.17s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service meta-llama-llama-3-1-8b-7331-v15
Waiting for inference service meta-llama-llama-3-1-8b-7331-v15 to be ready
Failed to get response for submission chaiml-giyu-disabled-hu_13897_v2: ('http://chaiml-giyu-disabled-hu-13897-v2-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Inference service meta-llama-llama-3-1-8b-7331-v15 ready after 130.90097284317017s
Pipeline stage MKMLDeployer completed in 131.46s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.103200912475586s
Received healthy response to inference request in 1.2529664039611816s
Received healthy response to inference request in 1.242323875427246s
Received healthy response to inference request in 1.1505465507507324s
Received healthy response to inference request in 1.0377750396728516s
5 requests
0 failed requests
5th percentile: 1.0603293418884276
10th percentile: 1.082883644104004
20th percentile: 1.1279922485351563
30th percentile: 1.1689020156860352
40th percentile: 1.2056129455566407
50th percentile: 1.242323875427246
60th percentile: 1.2465808868408204
70th percentile: 1.2508378982543946
80th percentile: 1.4230133056640626
90th percentile: 1.7631071090698243
95th percentile: 1.933154010772705
99th percentile: 2.06919153213501
mean time: 1.3573625564575196
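The StressChecker percentiles above follow the standard linear-interpolation definition (numpy.percentile's default): with the five response times sorted, percentile p sits at fractional rank p/100 * (n-1). A pure-Python reimplementation reproduces the logged values:

```python
def percentile_linear(samples, p):
    """Linear-interpolation percentile (same rule as numpy.percentile's default)."""
    xs = sorted(samples)
    rank = p / 100 * (len(xs) - 1)
    lo = int(rank)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (rank - lo) * (xs[hi] - xs[lo])

# The five healthy response times from the log above.
times = [
    2.103200912475586,
    1.2529664039611816,
    1.242323875427246,
    1.1505465507507324,
    1.0377750396728516,
]

p50 = percentile_linear(times, 50)   # 1.2423... as logged
p90 = percentile_linear(times, 90)   # 1.7631... as logged
mean = sum(times) / len(times)       # 1.3574... as logged
```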
Pipeline stage StressChecker completed in 8.12s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.72s
Shutdown handler de-registered
meta-llama-llama-3-1-8b_7331_v15 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 2952.96s
Shutdown handler de-registered
meta-llama-llama-3-1-8b_7331_v15 status is now inactive due to auto-deactivation of underperforming models
meta-llama-llama-3-1-8b_7331_v15 status is now torndown due to DeploymentManager action