developer_uid: rirv938
submission_id: rirv938-llama-8b-256-to_50255_v6
model_name: rirv938-llama-8b-256-to_50255_v6
model_group: rirv938/llama_8b_256_tok
status: torndown
timestamp: 2025-02-28T08:24:50+00:00
num_battles: 14721
num_wins: 8058
celo_rating: 1308.84
family_friendly_score: 0.535
family_friendly_standard_error: 0.007053722421530351
submission_type: basic
model_repo: rirv938/llama_8b_256_tokens_context_3m_step_11718
model_architecture: LlamaForSequenceClassification
model_num_parameters: 8030261248.0
best_of: 1
max_input_tokens: 512
max_output_tokens: 1
display_name: rirv938-llama-8b-256-to_50255_v6
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: rirv938/llama_8b_256_tokens_context_3m_step_11718
model_size: 8B
ranking_group: single
us_pacific_date: 2025-02-28
win_ratio: 0.5473812920317913
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 512, 'best_of': 1, 'max_output_tokens': 1}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '', 'truncate_by_message': False}
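The summary above contains both derived statistics and per-turn chat templates. A minimal sketch (the `render_turn` helper is hypothetical, not part of the platform) showing that `win_ratio` follows from `num_wins / num_battles` and how the `bot_template` / `user_template` format strings render a turn:

```python
# Values copied from the submission summary above.
num_battles = 14721
num_wins = 8058
win_ratio = num_wins / num_battles  # reported as 0.5473812920317913

# The formatter fields are plain str.format templates, one per speaker turn.
bot_template = "{bot_name}: {message}\n"
user_template = "{user_name}: {message}\n"

def render_turn(template: str, name: str, message: str) -> str:
    """Hypothetical helper: fill a per-turn template with a name and message."""
    key = "bot_name" if "{bot_name}" in template else "user_name"
    return template.format(**{key: name, "message": message})

transcript = (
    render_turn(user_template, "Alice", "hi there")
    + render_turn(bot_template, "Luna", "hello!")
)
```

With `memory_template`, `prompt_template`, and `response_template` all empty, the rendered prompt is just the concatenated turns, truncated to `max_input_tokens`.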
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-llama-8b-256-to-50255-v6-mkmlizer
Waiting for job on rirv938-llama-8b-256-to-50255-v6-mkmlizer to finish
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ _____ __ __ ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ /___/ ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ Version: 0.12.8 ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ https://mk1.ai ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ The license key for the current software has been verified as ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ belonging to: ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ Chai Research Corp. ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ Expiration: 2025-04-15 23:59:59 ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ║ ║
rirv938-llama-8b-256-to-50255-v6-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
rirv938-llama-8b-256-to-50255-v6-mkmlizer: Downloaded to shared memory in 22.840s
rirv938-llama-8b-256-to-50255-v6-mkmlizer: quantizing model to /dev/shm/model_cache, profile:t0, folder:/tmp/tmpxdurrame, device:0
rirv938-llama-8b-256-to-50255-v6-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Failed to get response for submission nischaydnk-exp26-mistra_55673_v5: HTTPConnectionPool(host='nischaydnk-exp26-mistra-55673-v5-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
rirv938-llama-8b-256-to-50255-v6-mkmlizer: quantized model in 84.100s
rirv938-llama-8b-256-to-50255-v6-mkmlizer: Processed model rirv938/llama_8b_256_tokens_context_3m_step_11718 in 106.941s
rirv938-llama-8b-256-to-50255-v6-mkmlizer: creating bucket guanaco-mkml-models
rirv938-llama-8b-256-to-50255-v6-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rirv938-llama-8b-256-to-50255-v6-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-llama-8b-256-to-50255-v6
rirv938-llama-8b-256-to-50255-v6-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rirv938-llama-8b-256-to-50255-v6/config.json
rirv938-llama-8b-256-to-50255-v6-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rirv938-llama-8b-256-to-50255-v6/special_tokens_map.json
rirv938-llama-8b-256-to-50255-v6-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rirv938-llama-8b-256-to-50255-v6/tokenizer_config.json
rirv938-llama-8b-256-to-50255-v6-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rirv938-llama-8b-256-to-50255-v6/tokenizer.json
rirv938-llama-8b-256-to-50255-v6-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rirv938-llama-8b-256-to-50255-v6/flywheel_model.0.safetensors
rirv938-llama-8b-256-to-50255-v6-mkmlizer: Loading 0: 0%| | 0/291 [00:00<?, ?it/s] … Loading 0: 99%|█████████▉| 288/291 [01:12<00:00, 3.47it/s]
Job rirv938-llama-8b-256-to-50255-v6-mkmlizer completed after 135.0s with status: succeeded
Stopping job with name rirv938-llama-8b-256-to-50255-v6-mkmlizer
Pipeline stage MKMLizer completed in 135.47s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-llama-8b-256-to-50255-v6
Waiting for inference service rirv938-llama-8b-256-to-50255-v6 to be ready
Failed to get response for submission nischaydnk-exp26-mistra_55673_v5: HTTPConnectionPool(host='nischaydnk-exp26-mistra-55673-v5-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service rirv938-llama-8b-256-to-50255-v6 ready after 160.56911087036133s
Pipeline stage MKMLDeployer completed in 161.00s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 5.114212512969971s
Received healthy response to inference request in 5.041090250015259s
Received healthy response to inference request in 4.337454319000244s
Received healthy response to inference request in 2.159355878829956s
Received healthy response to inference request in 4.323402166366577s
5 requests
0 failed requests
5th percentile: 2.59216513633728
10th percentile: 3.0249743938446043
20th percentile: 3.890592908859253
30th percentile: 4.326212596893311
40th percentile: 4.331833457946777
50th percentile: 4.337454319000244
60th percentile: 4.61890869140625
70th percentile: 4.9003630638122555
80th percentile: 5.055714702606201
90th percentile: 5.084963607788086
95th percentile: 5.099588060379029
99th percentile: 5.1112876224517825
mean time: 4.195103025436401
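The StressChecker summary above is consistent with linearly interpolated percentiles over the five response times (the same convention as `numpy.percentile`'s default). A sketch reproducing the first batch's numbers, with a hand-rolled `percentile` helper so no third-party library is needed:

```python
# Response times from the first StressChecker batch above, in seconds.
times = [
    5.114212512969971,
    5.041090250015259,
    4.337454319000244,
    2.159355878829956,
    4.323402166366577,
]

def percentile(data, p):
    """p-th percentile (p in [0, 100]) with linear interpolation
    between adjacent order statistics."""
    s = sorted(data)
    rank = (p / 100) * (len(s) - 1)
    lo = int(rank)
    frac = rank - lo
    if frac == 0:
        return s[lo]
    return s[lo] + frac * (s[lo + 1] - s[lo])

mean_time = sum(times) / len(times)   # reported mean time: 4.1951...
p5 = percentile(times, 5)             # reported 5th percentile: 2.5921...
p50 = percentile(times, 50)           # the median, 4.337454319000244
```

The 50th percentile equals the middle of the five sorted times, which matches the logged value exactly.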
%s, retrying in %s seconds...
Received healthy response to inference request in 3.3457369804382324s
Received healthy response to inference request in 4.204193592071533s
Received healthy response to inference request in 2.6865854263305664s
Received healthy response to inference request in 1.9175972938537598s
Received healthy response to inference request in 5.010962009429932s
5 requests
0 failed requests
5th percentile: 2.071394920349121
10th percentile: 2.2251925468444824
20th percentile: 2.532787799835205
30th percentile: 2.8184157371520997
40th percentile: 3.082076358795166
50th percentile: 3.3457369804382324
60th percentile: 3.6891196250915526
70th percentile: 4.032502269744873
80th percentile: 4.365547275543213
90th percentile: 4.688254642486572
95th percentile: 4.849608325958251
99th percentile: 4.978691272735595
mean time: 3.433015060424805
%s, retrying in %s seconds...
Received healthy response to inference request in 1.898087501525879s
Received healthy response to inference request in 2.677300214767456s
Received healthy response to inference request in 3.3381664752960205s
Received healthy response to inference request in 4.643239259719849s
Received healthy response to inference request in 3.200270652770996s
5 requests
0 failed requests
5th percentile: 2.0539300441741943
10th percentile: 2.2097725868225098
20th percentile: 2.5214576721191406
30th percentile: 2.781894302368164
40th percentile: 2.99108247756958
50th percentile: 3.200270652770996
60th percentile: 3.255428981781006
70th percentile: 3.3105873107910155
80th percentile: 3.599181032180786
90th percentile: 4.121210145950317
95th percentile: 4.3822247028350825
99th percentile: 4.591036348342896
mean time: 3.15141282081604
Pipeline stage StressChecker completed in 58.09s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.77s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.87s
Shutdown handler de-registered
rirv938-llama-8b-256-to_50255_v6 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-256-to-50255-v6-profiler
Waiting for inference service rirv938-llama-8b-256-to-50255-v6-profiler to be ready
Inference service rirv938-llama-8b-256-to-50255-v6-profiler ready after 160.6735680103302s
Pipeline stage MKMLProfilerDeployer completed in 161.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-256e81a1c506ef243ed4324c0bc09f8b82c-deplopgqsn:/code/chaiverse_profiler_1740731644 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-256e81a1c506ef243ed4324c0bc09f8b82c-deplopgqsn --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1740731644 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 512 --output_tokens 1 --summary /code/chaiverse_profiler_1740731644/summary.json'
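The `--batches` argument in the `kubectl exec` command above is a sweep of batch size 1 followed by every multiple of 5 up to 195. A one-liner (illustrative only; the platform's actual generator is not shown in this log) producing the same comma-separated list:

```python
# Batch-size sweep passed to profiles.py: 1, then 5, 10, ..., 195.
batches = [1] + list(range(5, 200, 5))
batches_arg = ",".join(str(b) for b in batches)
```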
Received signal 15, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-256-to-50255-v6-profiler is running
Tearing down inference service rirv938-llama-8b-256-to-50255-v6-profiler
Service rirv938-llama-8b-256-to-50255-v6-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.15s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-256-to-50255-v6-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 1.97s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-256-to-50255-v6-profiler
Waiting for inference service rirv938-llama-8b-256-to-50255-v6-profiler to be ready
Inference service rirv938-llama-8b-256-to-50255-v6-profiler ready after 50.234991788864136s
Pipeline stage MKMLProfilerDeployer completed in 50.63s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-256e81a1c506ef243ed4324c0bc09f8b82c-deplompc4c:/code/chaiverse_profiler_1740735169 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-256e81a1c506ef243ed4324c0bc09f8b82c-deplompc4c --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1740735169 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 512 --output_tokens 1 --summary /code/chaiverse_profiler_1740735169/summary.json'
Received signal 15, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-256-to-50255-v6-profiler is running
Tearing down inference service rirv938-llama-8b-256-to-50255-v6-profiler
Service rirv938-llama-8b-256-to-50255-v6-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.03s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-256-to-50255-v6-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.02s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-256-to-50255-v6-profiler
Waiting for inference service rirv938-llama-8b-256-to-50255-v6-profiler to be ready
Inference service rirv938-llama-8b-256-to-50255-v6-profiler ready after 80.32737016677856s
Pipeline stage MKMLProfilerDeployer completed in 80.72s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-256e81a1c506ef243ed4324c0bc09f8b82c-deplogm4hk:/code/chaiverse_profiler_1740738819 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-256e81a1c506ef243ed4324c0bc09f8b82c-deplogm4hk --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1740738819 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 512 --output_tokens 1 --summary /code/chaiverse_profiler_1740738819/summary.json'
Received signal 15, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-256-to-50255-v6-profiler is running
Tearing down inference service rirv938-llama-8b-256-to-50255-v6-profiler
Service rirv938-llama-8b-256-to-50255-v6-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.00s
Shutdown handler de-registered
rirv938-llama-8b-256-to_50255_v6 status is now inactive due to auto deactivation of underperforming models
rirv938-llama-8b-256-to_50255_v6 status is now torndown due to DeploymentManager action