developer_uid: richhx
submission_id: meta-llama-llama-3-1-8b_v4
model_name: meta-llama-llama-3-1-8b_v4
model_group: meta-llama/Llama-3.1-8B
status: torndown
timestamp: 2025-07-24T18:27:27+00:00
num_battles: 6756
num_wins: 2451
celo_rating: 1184.85
family_friendly_score: 0.564
family_friendly_standard_error: 0.007012902394871898
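The score and its standard error are consistent with a binomial proportion estimated over roughly 5,000 samples. The sample count is an inference from the two reported fields, not something the page states; a minimal sketch of that back-calculation:

```python
p = 0.564                  # reported family_friendly_score
se = 0.007012902394871898  # reported family_friendly_standard_error

# If the score is a sample proportion with se = sqrt(p * (1 - p) / n),
# the implied number of scored samples is:
n = p * (1 - p) / se ** 2
print(round(n))  # 5000
```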
submission_type: basic
model_repo: meta-llama/Llama-3.1-8B
model_architecture: LlamaForCausalLM
model_num_parameters: 8030261248
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
reward_model: default
latencies: [{'batch_size': 1, 'throughput': 0.9363883635946357, 'latency_mean': 1.0677747464179992, 'latency_p50': 1.0684548616409302, 'latency_p90': 1.1674275875091553}, {'batch_size': 4, 'throughput': 1.7985515831053085, 'latency_mean': 2.2143308007717133, 'latency_p50': 2.1971276998519897, 'latency_p90': 2.506149697303772}, {'batch_size': 5, 'throughput': 1.887162651409839, 'latency_mean': 2.634226838350296, 'latency_p50': 2.62212598323822, 'latency_p90': 2.924300193786621}, {'batch_size': 8, 'throughput': 2.0505611639099808, 'latency_mean': 3.877354167699814, 'latency_p50': 3.850661516189575, 'latency_p90': 4.382985234260559}, {'batch_size': 10, 'throughput': 2.093940162767685, 'latency_mean': 4.736622945070267, 'latency_p50': 4.751409411430359, 'latency_p90': 5.296436333656311}, {'batch_size': 12, 'throughput': 2.117669180800859, 'latency_mean': 5.605080338716507, 'latency_p50': 5.59574031829834, 'latency_p90': 6.346574306488036}, {'batch_size': 15, 'throughput': 2.143373172664391, 'latency_mean': 6.909559860229492, 'latency_p50': 6.933189868927002, 'latency_p90': 7.7593690156936646}]
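The latencies table pairs each batch size with throughput and latency percentiles, and the derived `throughput_3p7s` field below (2.05) reads as throughput near a 3.7 s mean-latency operating point; 2.05 matches the batch-8 row, whose mean latency is 3.88 s. The exact definition is not documented here, so as one plausible reading, a linear interpolation between the bracketing batch-5 and batch-8 rows lands close to the reported value:

```python
# Interpolate throughput at a 3.7 s mean latency between the two
# bracketing rows of the latencies table (batch_size 5 and 8).
rows = [
    (2.634226838350296, 1.887162651409839),   # batch 5: (latency_mean, throughput)
    (3.877354167699814, 2.0505611639099808),  # batch 8
]
(l0, t0), (l1, t1) = rows
target = 3.7
throughput = t0 + (t1 - t0) * (target - l0) / (l1 - l0)
print(round(throughput, 2))  # 2.03, near the reported throughput_3p7s of 2.05
```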
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: meta-llama-llama-3-1-8b_v4
ineligible_reason: num_battles<10000
is_internal_developer: True
language_model: meta-llama/Llama-3.1-8B
model_size: 8B
ranking_group: single
throughput_3p7s: 2.05
us_pacific_date: 2025-07-24
win_ratio: 0.3627886323268206
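The win ratio is simply wins over battles, which the metadata above lets us verify directly:

```python
num_battles = 6756  # reported num_battles
num_wins = 2451     # reported num_wins

win_ratio = num_wins / num_battles
print(win_ratio)  # 0.3627886323268206, matching the reported win_ratio
```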
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
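The formatter dict defines how a conversation is flattened into a single prompt string. A minimal sketch of that assembly, where the concatenation order (memory, then prompt, then alternating messages, closed by the response template) is my assumption rather than something the log specifies:

```python
formatter = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}

def build_prompt(bot_name, user_name, memory, prompt, turns):
    """turns: list of (speaker, message) pairs, speaker in {'bot', 'user'}."""
    out = formatter["memory_template"].format(bot_name=bot_name, memory=memory)
    out += formatter["prompt_template"].format(prompt=prompt)
    for speaker, message in turns:
        tmpl = formatter["bot_template"] if speaker == "bot" else formatter["user_template"]
        out += tmpl.format(bot_name=bot_name, user_name=user_name, message=message)
    # End with the open response template so the model completes the bot's line.
    return out + formatter["response_template"].format(bot_name=bot_name)

text = build_prompt("Ava", "User", "a friendly guide", "Ava greets you.",
                    [("bot", "Hello!"), ("user", "Hi Ava.")])
print(text)
```

Note how the `'\n'` stopping word in generation_params pairs with this one-line-per-message layout: generation halts at the end of the bot's single response line.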
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name meta-llama-llama-3-1-8b-v4-mkmlizer
Waiting for job on meta-llama-llama-3-1-8b-v4-mkmlizer to finish
meta-llama-llama-3-1-8b-v4-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ██████ ██████ █████ ████ ████ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ░░██████ ██████ ░░███ ███░ ░░███ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ░███░█████░███ ░███ ███ ░███ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ░███░░███ ░███ ░███████ ░███ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ░███ ░░░ ░███ ░███░░███ ░███ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ░███ ░███ ░███ ░░███ ░███ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ █████ █████ █████ ░░████ █████ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ░░░░░ ░░░░░ ░░░░░ ░░░░ ░░░░░ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ Version: 0.29.15 ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ Features: FLYWHEEL, CUDA ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ Copyright 2023-2025 MK ONE TECHNOLOGIES Inc. ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ https://mk1.ai ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ The license key for the current software has been verified as ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ belonging to: ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ Chai Research Corp. ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ Expiration: 2028-03-31 23:59:59 ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ║ ║
meta-llama-llama-3-1-8b-v4-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
meta-llama-llama-3-1-8b-v4-mkmlizer: Xet Storage is enabled for this repo, but the 'hf_xet' package is not installed. Falling back to regular HTTP download. For better performance, install the package with: `pip install huggingface_hub[hf_xet]` or `pip install hf_xet`
meta-llama-llama-3-1-8b-v4-mkmlizer: Downloaded to shared memory in 38.466s
meta-llama-llama-3-1-8b-v4-mkmlizer: Checking if meta-llama/Llama-3.1-8B already exists in ChaiML
meta-llama-llama-3-1-8b-v4-mkmlizer: Creating repo ChaiML/Llama-3.1-8B and uploading /tmp/tmp4ycey17s to it
meta-llama-llama-3-1-8b-v4-mkmlizer: 100%|██████████| 6/6 [00:28<00:00, 4.76s/it]
meta-llama-llama-3-1-8b-v4-mkmlizer: quantizing model to /dev/shm/model_cache, profile:q4, folder:/tmp/tmp4ycey17s, device:0
meta-llama-llama-3-1-8b-v4-mkmlizer: Saving flywheel model at /dev/shm/model_cache
meta-llama-llama-3-1-8b-v4-mkmlizer: quantized model in 106.311s
meta-llama-llama-3-1-8b-v4-mkmlizer: Processed model meta-llama/Llama-3.1-8B in 209.652s
meta-llama-llama-3-1-8b-v4-mkmlizer: creating bucket guanaco-mkml-models
meta-llama-llama-3-1-8b-v4-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
meta-llama-llama-3-1-8b-v4-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-v4/nvidia
meta-llama-llama-3-1-8b-v4-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-v4/nvidia/config.json
meta-llama-llama-3-1-8b-v4-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-v4/nvidia/special_tokens_map.json
meta-llama-llama-3-1-8b-v4-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-v4/nvidia/tokenizer_config.json
meta-llama-llama-3-1-8b-v4-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-v4/nvidia/tokenizer.json
meta-llama-llama-3-1-8b-v4-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/meta-llama-llama-3-1-8b-v4/nvidia/flywheel_model.0.safetensors
meta-llama-llama-3-1-8b-v4-mkmlizer: Loading 0: 99%|█████████▉| 289/291 [01:37<00:00, 3.01it/s]
Job meta-llama-llama-3-1-8b-v4-mkmlizer completed after 232.02s with status: succeeded
Stopping job with name meta-llama-llama-3-1-8b-v4-mkmlizer
Pipeline stage MKMLizer completed in 232.73s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service meta-llama-llama-3-1-8b-v4
Waiting for inference service meta-llama-llama-3-1-8b-v4 to be ready
Inference service meta-llama-llama-3-1-8b-v4 ready after 191.0990173816681s
Pipeline stage MKMLDeployer completed in 191.58s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.1767518520355225s
Received healthy response to inference request in 1.5398852825164795s
Received healthy response to inference request in 1.379866361618042s
Received healthy response to inference request in 1.2108993530273438s
5 requests
1 failed request
5th percentile: 1.2446927547454834
10th percentile: 1.278486156463623
20th percentile: 1.3460729598999024
30th percentile: 1.4118701457977294
40th percentile: 1.4758777141571044
50th percentile: 1.5398852825164795
60th percentile: 1.7946319103240966
70th percentile: 2.049378538131714
80th percentile: 5.813892221450809
90th percentile: 13.088172960281373
95th percentile: 16.725313329696654
99th percentile: 19.63502562522888
mean time: 5.333971309661865
%s, retrying in %s seconds...
Received healthy response to inference request in 1.9720127582550049s
Received healthy response to inference request in 2.232377052307129s
Received healthy response to inference request in 1.215904951095581s
Received healthy response to inference request in 3.211947202682495s
Received healthy response to inference request in 1.347721815109253s
5 requests
0 failed requests
5th percentile: 1.2422683238983154
10th percentile: 1.2686316967010498
20th percentile: 1.3213584423065186
30th percentile: 1.4725800037384034
40th percentile: 1.7222963809967042
50th percentile: 1.9720127582550049
60th percentile: 2.0761584758758547
70th percentile: 2.180304193496704
80th percentile: 2.428291082382202
90th percentile: 2.8201191425323486
95th percentile: 3.016033172607422
99th percentile: 3.1727643966674806
mean time: 1.9959927558898927
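The StressChecker percentiles are consistent with linear-interpolation percentiles over the five response times. Reproducing the second (all-healthy) batch with a small pure-Python version of that method:

```python
# Five healthy response times from the second StressChecker batch, in log order.
times = [1.9720127582550049, 2.232377052307129, 1.215904951095581,
         3.211947202682495, 1.347721815109253]

def percentile(values, q):
    """Percentile with linear interpolation between order statistics."""
    s = sorted(values)
    pos = (len(s) - 1) * q / 100.0
    lo = int(pos)
    frac = pos - lo
    return s[lo] + frac * (s[lo + 1] - s[lo]) if lo + 1 < len(s) else s[lo]

print(percentile(times, 50))    # 1.9720127582550049, the logged 50th percentile
print(percentile(times, 95))    # ≈ 3.016033, matching the logged 95th percentile
print(percentile(times, 99))    # ≈ 3.172764, matching the logged 99th percentile
print(sum(times) / len(times))  # ≈ 1.995993, matching the logged mean time
```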
Pipeline stage StressChecker completed in 40.29s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.74s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.80s
Shutdown handler de-registered
meta-llama-llama-3-1-8b_v4 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3036.95s
Shutdown handler de-registered
meta-llama-llama-3-1-8b_v4 status is now inactive due to auto-deactivation of underperforming models
meta-llama-llama-3-1-8b_v4 status is now torndown due to DeploymentManager action