developer_uid: azuruce
submission_id: chaiml-espresso-llama-24_9292_v2
model_name: chaiml-espresso-llama-24_9292_v2
model_group: ChaiML/espresso_llama_24
status: inactive
timestamp: 2024-12-05T05:54:11+00:00
num_battles: 8670
num_wins: 4217
celo_rating: 1253.51
family_friendly_score: 0.5826
family_friendly_standard_error: 0.006973911958148023
submission_type: basic
model_repo: ChaiML/espresso_llama_241204_albert_v2_sft_2epoch_128alpha
model_architecture: MistralForCausalLM
model_num_parameters: 22247282688.0
best_of: 4
max_input_tokens: 1024
max_output_tokens: 64
latencies: [{'batch_size': 1, 'throughput': 0.38479831429306427, 'latency_mean': 2.598700462579727, 'latency_p50': 2.590915322303772, 'latency_p90': 2.8902148962020875}, {'batch_size': 3, 'throughput': 0.8062907248707989, 'latency_mean': 3.713191522359848, 'latency_p50': 3.7139694690704346, 'latency_p90': 4.051502084732055}, {'batch_size': 5, 'throughput': 1.067340066513242, 'latency_mean': 4.653591929674149, 'latency_p50': 4.683766961097717, 'latency_p90': 5.243802690505982}, {'batch_size': 6, 'throughput': 1.15537009111368, 'latency_mean': 5.174456565380097, 'latency_p50': 5.208216905593872, 'latency_p90': 5.780211472511291}, {'batch_size': 10, 'throughput': 1.3549117027113549, 'latency_mean': 7.31310973405838, 'latency_p50': 7.297738671302795, 'latency_p90': 8.222592163085938}]
gpu_counts: {'NVIDIA RTX A6000': 1}
display_name: chaiml-espresso-llama-24_9292_v2
is_internal_developer: True
language_model: ChaiML/espresso_llama_241204_albert_v2_sft_2epoch_128alpha
model_size: 22B
ranking_group: single
throughput_3p7s: 0.8
us_pacific_date: 2024-12-04
win_ratio: 0.4863898500576701
generation_params: {'temperature': 0.9, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 100, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '</s>', '####', 'Bot:', 'User:', 'You:', '<|im_end|>', '<|eot_id|>'], 'max_input_tokens': 1024, 'best_of': 4, 'max_output_tokens': 64}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
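Note on the metadata above: the `win_ratio` field appears to be `num_wins` divided by `num_battles`. The record does not state this definition, so the sketch below treats it as an assumption and checks it against the logged values. The helper name `win_ratio` is hypothetical and not part of any tooling referenced in this log.

```python
def win_ratio(num_wins: int, num_battles: int) -> float:
    """Fraction of battles won (assumed definition, not confirmed by the log)."""
    if num_battles <= 0:
        raise ValueError("num_battles must be positive")
    return num_wins / num_battles


# Values copied from the num_wins and num_battles fields above.
ratio = win_ratio(4217, 8670)
print(f"win_ratio = {ratio:.6f}")
```

Under this assumption the result agrees with the logged `win_ratio` to within floating-point tolerance.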
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-espresso-llama-24-9292-v2-mkmlizer
Waiting for job on chaiml-espresso-llama-24-9292-v2-mkmlizer to finish
chaiml-espresso-llama-24-9292-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Version: 0.11.12 ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ https://mk1.ai ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ belonging to: ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Chai Research Corp. ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ║ ║
chaiml-espresso-llama-24-9292-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-espresso-llama-24-9292-v2-mkmlizer: Downloaded to shared memory in 46.995s
chaiml-espresso-llama-24-9292-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpif7p5txa, device:0
chaiml-espresso-llama-24-9292-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-espresso-llama-24-9292-v2-mkmlizer: quantized model in 45.302s
chaiml-espresso-llama-24-9292-v2-mkmlizer: Processed model ChaiML/espresso_llama_241204_albert_v2_sft_2epoch_128alpha in 92.297s
chaiml-espresso-llama-24-9292-v2-mkmlizer: creating bucket guanaco-mkml-models
chaiml-espresso-llama-24-9292-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-espresso-llama-24-9292-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/special_tokens_map.json
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/config.json
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/tokenizer_config.json
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/tokenizer.json
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/flywheel_model.1.safetensors
chaiml-espresso-llama-24-9292-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-espresso-llama-24-9292-v2/flywheel_model.0.safetensors
chaiml-espresso-llama-24-9292-v2-mkmlizer: Loading 0: 100%|██████████| 507/507 [00:32<00:00, 32.07it/s]
Job chaiml-espresso-llama-24-9292-v2-mkmlizer completed after 122.89s with status: succeeded
Stopping job with name chaiml-espresso-llama-24-9292-v2-mkmlizer
Pipeline stage MKMLizer completed in 124.23s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-espresso-llama-24-9292-v2
Waiting for inference service chaiml-espresso-llama-24-9292-v2 to be ready
Inference service chaiml-espresso-llama-24-9292-v2 ready after 184.75290846824646s
Pipeline stage MKMLDeployer completed in 186.49s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.983168363571167s
Received healthy response to inference request in 2.445164442062378s
Received healthy response to inference request in 2.7090847492218018s
Retrying (%r) after connection broken by '%r': %s
Received healthy response to inference request in 2.8613054752349854s
Received healthy response to inference request in 2.352644443511963s
5 requests
0 failed requests
5th percentile: 2.371148443222046
10th percentile: 2.389652442932129
20th percentile: 2.426660442352295
30th percentile: 2.497948503494263
40th percentile: 2.603516626358032
50th percentile: 2.7090847492218018
60th percentile: 2.769973039627075
70th percentile: 2.8308613300323486
80th percentile: 2.8856780529022217
90th percentile: 2.9344232082366943
95th percentile: 2.9587957859039307
99th percentile: 2.97829384803772
mean time: 2.670273494720459
Pipeline stage StressChecker completed in 14.69s
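Note on the StressChecker summary above: the reported percentiles look consistent with linear interpolation between sorted samples, which is also the default behaviour of `numpy.percentile`. The log does not say which tool produced them, so the following is only a minimal standard-library sketch under that assumption. The `percentile` helper is hypothetical, and the sample values are the five response times logged above.

```python
import math
from statistics import fmean


def percentile(samples, q):
    """q-th percentile (0-100) by linear interpolation between order statistics."""
    xs = sorted(samples)
    if not xs:
        raise ValueError("samples must not be empty")
    pos = (len(xs) - 1) * q / 100.0   # fractional index into the sorted list
    lo = math.floor(pos)
    hi = min(lo + 1, len(xs) - 1)     # clamp so q=100 stays in range
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


# Response times (seconds) from the five healthy responses in the log.
latencies = [
    2.983168363571167,
    2.445164442062378,
    2.7090847492218018,
    2.8613054752349854,
    2.352644443511963,
]

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {fmean(latencies)}")
```

With only five samples the upper percentiles are interpolated between the two slowest requests, so they are a rough indication rather than a stable tail estimate.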
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.43s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.26s
Shutdown handler de-registered
chaiml-espresso-llama-24_9292_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3197.92s
Shutdown handler de-registered
chaiml-espresso-llama-24_9292_v2 status is now inactive due to auto deactivation removed underperforming models