developer_uid: rirv938
submission_id: rirv938-llama-8b-1024-t_67568_v2
model_name: rirv938-llama-8b-1024-t_67568_v2
model_group: rirv938/llama_8b_1024_to
status: torndown
timestamp: 2025-02-27T18:01:27+00:00
num_battles: 6192
num_wins: 3315
celo_rating: 1303.03
family_friendly_score: 0.5284
family_friendly_standard_error: 0.007059652116074842
submission_type: basic
model_repo: rirv938/llama_8b_1024_tokens_context_3m_step_11718
model_architecture: LlamaForSequenceClassification
model_num_parameters: 8030261248.0
best_of: 1
max_input_tokens: 256
max_output_tokens: 1
display_name: rirv938-llama-8b-1024-t_67568_v2
ineligible_reason: max_output_tokens!=64
is_internal_developer: True
language_model: rirv938/llama_8b_1024_tokens_context_3m_step_11718
model_size: 8B
ranking_group: single
us_pacific_date: 2025-02-27
win_ratio: 0.5353682170542635
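For reference, the win_ratio field above is just num_wins divided by num_battles; a quick sketch confirming that from the counts reported earlier in this block:

```python
# Recompute the reported win ratio from the battle counts above.
num_battles = 6192
num_wins = 3315

win_ratio = num_wins / num_battles
print(win_ratio)  # matches the win_ratio field: 0.5353682170542635
```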
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 256, 'best_of': 1, 'max_output_tokens': 1}
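Note that with max_output_tokens=1 and best_of=1 this is a single-token scoring call, consistent with the LlamaForSequenceClassification architecture above, rather than free-form generation. As a sketch only, the parameters might map onto a vLLM/OpenAI-style request payload like the following; the right-hand key names ("stop", "max_tokens") are assumptions, not taken from this log:

```python
# Hypothetical mapping of the submission's generation_params onto a
# vLLM/OpenAI-style request payload. Key names such as "stop" and
# "max_tokens" are assumptions for illustration, not from this log.
generation_params = {
    "temperature": 1.0,
    "top_p": 1.0,
    "min_p": 0.0,
    "top_k": 40,
    "presence_penalty": 0.0,
    "frequency_penalty": 0.0,
    "stop": ["\n"],      # stopping_words
    "max_tokens": 1,     # max_output_tokens: one scoring token, no free text
}

print(generation_params["max_tokens"])
```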
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '', 'truncate_by_message': False}
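The formatter leaves every template empty except bot_template and user_template. A minimal sketch of rendering a chat history with those two templates (the render_chat helper and the message tuples are hypothetical, for illustration only):

```python
# The two non-empty templates from the formatter above.
formatter = {
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
}

def render_chat(messages, formatter):
    """Render (role, name, message) tuples with the submission's templates.

    Hypothetical helper: str.format ignores unused keyword arguments, so we
    can pass both names to either template.
    """
    out = []
    for role, name, message in messages:
        template = formatter["bot_template"] if role == "bot" else formatter["user_template"]
        out.append(template.format(bot_name=name, user_name=name, message=message))
    return "".join(out)

print(render_chat([("user", "Anon", "hi"), ("bot", "Bot", "hello")], formatter))
# Anon: hi
# Bot: hello
```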
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-llama-8b-1024-t-67568-v2-mkmlizer
Waiting for job on rirv938-llama-8b-1024-t-67568-v2-mkmlizer to finish
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ [flywheel ASCII-art logo] ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ Version: 0.12.8 ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ https://mk1.ai ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ The license key for the current software has been verified as ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ belonging to: ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ Chai Research Corp. ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ Expiration: 2025-04-15 23:59:59 ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ║ ║
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: Downloaded to shared memory in 28.610s
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:t0, folder:/tmp/tmpi8o0b9i7, device:0
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Failed to get response for submission nischaydnk-exp27-ftexp1_72507_v1: HTTPConnectionPool(host='nischaydnk-exp27-ftexp1-72507-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: quantized model in 85.641s
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: Processed model rirv938/llama_8b_1024_tokens_context_3m_step_11718 in 114.253s
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: creating bucket guanaco-mkml-models
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-llama-8b-1024-t-67568-v2
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rirv938-llama-8b-1024-t-67568-v2/config.json
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rirv938-llama-8b-1024-t-67568-v2/special_tokens_map.json
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rirv938-llama-8b-1024-t-67568-v2/tokenizer_config.json
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rirv938-llama-8b-1024-t-67568-v2/tokenizer.json
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rirv938-llama-8b-1024-t-67568-v2/flywheel_model.0.safetensors
rirv938-llama-8b-1024-t-67568-v2-mkmlizer: Loading 0: 0%| | 0/291 [00:00<?, ?it/s] ... 99%|█████████▉| 288/291 [01:13<00:00, 3.27it/s]
Job rirv938-llama-8b-1024-t-67568-v2-mkmlizer completed after 156.51s with status: succeeded
Stopping job with name rirv938-llama-8b-1024-t-67568-v2-mkmlizer
Pipeline stage MKMLizer completed in 157.02s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-llama-8b-1024-t-67568-v2
Waiting for inference service rirv938-llama-8b-1024-t-67568-v2 to be ready
Failed to get response for submission nischaydnk-exp27-ftexp1_72507_v2: HTTPConnectionPool(host='nischaydnk-exp27-ftexp1-72507-v2-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission nischaydnk-exp26-mistra_55673_v1: HTTPConnectionPool(host='nischaydnk-exp26-mistra-55673-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission nischaydnk-exp27-ftexp1_72507_v1: HTTPConnectionPool(host='nischaydnk-exp27-ftexp1-72507-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission nischaydnk-exp27-ftexp1_72507_v2: HTTPConnectionPool(host='nischaydnk-exp27-ftexp1-72507-v2-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission nischaydnk-exp26-mistra_55673_v1: HTTPConnectionPool(host='nischaydnk-exp26-mistra-55673-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service rirv938-llama-8b-1024-t-67568-v2 ready after 291.12954664230347s
Pipeline stage MKMLDeployer completed in 291.64s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.876830816268921s
Received healthy response to inference request in 3.097120761871338s
Received healthy response to inference request in 3.798942804336548s
Received healthy response to inference request in 4.074981212615967s
Received healthy response to inference request in 4.003666162490845s
5 requests
0 failed requests
5th percentile: 2.920888805389404
10th percentile: 2.9649467945098875
20th percentile: 3.0530627727508546
30th percentile: 3.23748517036438
40th percentile: 3.518213987350464
50th percentile: 3.798942804336548
60th percentile: 3.880832147598267
70th percentile: 3.9627214908599853
80th percentile: 4.017929172515869
90th percentile: 4.046455192565918
95th percentile: 4.060718202590943
99th percentile: 4.072128610610962
mean time: 3.5703083515167235
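The percentile figures reported by the StressChecker are consistent with simple linear-interpolation percentiles over the five response times (numpy's default method); a sketch reimplementing that method in plain Python against the latencies from the round above:

```python
# The five latencies from the first StressChecker round above, sorted.
times = sorted([2.876830816268921, 3.097120761871338, 3.798942804336548,
                4.074981212615967, 4.003666162490845])

def percentile(sorted_data, q):
    """Linear-interpolation percentile over sorted data (the method the
    reported figures are consistent with)."""
    pos = (q / 100) * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    return sorted_data[lo] + (pos - lo) * (sorted_data[hi] - sorted_data[lo])

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(times, q)}")
print("mean time:", sum(times) / len(times))
```

For instance, the 5th percentile interpolates 20% of the way from the smallest to the second-smallest latency, reproducing the 2.9208... figure in the log.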
%s, retrying in %s seconds...
Received healthy response to inference request in 8.299967050552368s
Received healthy response to inference request in 3.8618812561035156s
Received healthy response to inference request in 3.1345465183258057s
Received healthy response to inference request in 3.5819756984710693s
Received healthy response to inference request in 3.903010845184326s
5 requests
0 failed requests
5th percentile: 3.2240323543548586
10th percentile: 3.313518190383911
20th percentile: 3.4924898624420164
30th percentile: 3.6379568099975588
40th percentile: 3.749919033050537
50th percentile: 3.8618812561035156
60th percentile: 3.87833309173584
70th percentile: 3.894784927368164
80th percentile: 4.782402086257935
90th percentile: 6.541184568405152
95th percentile: 7.420575809478759
99th percentile: 8.124088802337646
mean time: 4.556276273727417
%s, retrying in %s seconds...
Received healthy response to inference request in 3.75142765045166s
Received healthy response to inference request in 2.892049789428711s
Received healthy response to inference request in 3.451401710510254s
Received healthy response to inference request in 3.7166593074798584s
Received healthy response to inference request in 2.4195079803466797s
5 requests
0 failed requests
5th percentile: 2.514016342163086
10th percentile: 2.608524703979492
20th percentile: 2.7975414276123045
30th percentile: 3.0039201736450196
40th percentile: 3.2276609420776365
50th percentile: 3.451401710510254
60th percentile: 3.557504749298096
70th percentile: 3.6636077880859377
80th percentile: 3.723612976074219
90th percentile: 3.7375203132629395
95th percentile: 3.7444739818572996
99th percentile: 3.750036916732788
mean time: 3.2462092876434325
Pipeline stage StressChecker completed in 61.86s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.77s
Shutdown handler de-registered
rirv938-llama-8b-1024-t_67568_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-1024-t-67568-v2-profiler
Waiting for inference service rirv938-llama-8b-1024-t-67568-v2-profiler to be ready
Inference service rirv938-llama-8b-1024-t-67568-v2-profiler ready after 291.0672562122345s
Pipeline stage MKMLProfilerDeployer completed in 291.54s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deploljg8k:/code/chaiverse_profiler_1740680135 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deploljg8k --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1740680135 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1740680135/summary.json'
Received signal 15, running shutdown handler
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-1024-t-67568-v2-profiler is running
Tearing down inference service rirv938-llama-8b-1024-t-67568-v2-profiler
Service rirv938-llama-8b-1024-t-67568-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.93s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-1024-t-67568-v2-profiler is running
Tearing down inference service rirv938-llama-8b-1024-t-67568-v2-profiler
Service rirv938-llama-8b-1024-t-67568-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.46s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-1024-t-67568-v2-profiler
Ignoring service rirv938-llama-8b-1024-t-67568-v2-profiler already deployed
Waiting for inference service rirv938-llama-8b-1024-t-67568-v2-profiler to be ready
Inference service rirv938-llama-8b-1024-t-67568-v2-profiler ready after 10.08982801437378s
Pipeline stage MKMLProfilerDeployer completed in 10.51s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deploljg8k:/code/chaiverse_profiler_1740683486 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deploljg8k --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1740683486 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1740683486/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deploljg8k:/code/chaiverse_profiler_1740683642 --namespace tenant-chaiml-guanaco
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deploljg8k:/code/chaiverse_profiler_1740683643 --namespace tenant-chaiml-guanaco
clean up pipeline due to error=ISVCScriptError('Command failed with error: Defaulted container "kserve-container" out of: kserve-container, queue-proxy\nerror: unable to upgrade connection: container not found ("kserve-container")\n, output: ')
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-1024-t-67568-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.64s
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-1024-t-67568-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 3.52s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-1024-t-67568-v2-profiler
Waiting for inference service rirv938-llama-8b-1024-t-67568-v2-profiler to be ready
Inference service rirv938-llama-8b-1024-t-67568-v2-profiler ready after 291.14896750450134s
Pipeline stage MKMLProfilerDeployer completed in 291.51s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deplomxhtx:/code/chaiverse_profiler_1740683979 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deplomxhtx --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1740683979 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1740683979/summary.json'
kubectl exec -it rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deplomxhtx --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1740683979/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deplomxhtx:/code/chaiverse_profiler_1740685302 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deplomxhtx --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1740685302 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1740685302/summary.json'
%s, retrying in %s seconds...
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deplomxhtx:/code/chaiverse_profiler_1740685615 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-10235dc9903bb3cc214dc1434a0e61ed1e3-deplomxhtx --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1740685615 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1740685615/summary.json'
clean up pipeline due to error=ISVCScriptError('Command failed with error: Defaulted container "kserve-container" out of: kserve-container, queue-proxy\nUnable to use a TTY - input is not a terminal or the right kind of file\nTraceback (most recent call last):\n File "/code/chaiverse_profiler_1740685615/profiles.py", line 605, in <module>\n cli()\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__\n return self.main(*args, **kwargs)\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main\n rv = self.invoke(ctx)\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke\n return _process_result(sub_ctx.command.invoke(sub_ctx))\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke\n return ctx.invoke(self.callback, **ctx.params)\n File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke\n return __callback(*args, **kwargs)\n File "/code/chaiverse_profiler_1740685615/profiles.py", line 104, in profile_batches\n client.wait_for_server_startup(target, max_wait=300)\n File "/code/inference_analysis/client.py", line 149, in wait_for_server_startup\n raise RuntimeError(msg)\nRuntimeError: Timed out after 300s waiting for startup\ncommand terminated with exit code 1\n, output: waiting for startup of TargetModel(endpoint=\'localhost\', route=\'GPT-J-6B-lit-v2\', namespace=\'tenant-chaiml-guanaco\', reward=False, url_format=\'{endpoint}-predictor-default.{namespace}.knative.ord1.coreweave.cloud\')\n')
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-1024-t-67568-v2-profiler is running
Tearing down inference service rirv938-llama-8b-1024-t-67568-v2-profiler
Service rirv938-llama-8b-1024-t-67568-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.18s
Shutdown handler de-registered
rirv938-llama-8b-1024-t_67568_v2 status is now inactive due to auto-deactivation of underperforming models
rirv938-llama-8b-1024-t_67568_v2 status is now torndown due to DeploymentManager action