submission_id: nousresearch-meta-llama_4939_v32
developer_uid: end_to_end_test
best_of: 4
celo_rating: 1184.27
display_name: nousresearch-meta-llama_4939_v32
family_friendly_score: 0.0
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 1.0, 'top_p': 0.99, 'min_p': 0.1, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 512, 'best_of': 4, 'max_output_tokens': 64}
ineligible_reason: model is only for e2e test
is_internal_developer: True
language_model: NousResearch/Meta-Llama-3.1-8B-Instruct
max_input_tokens: 512
max_output_tokens: 64
model_architecture: LlamaForCausalLM
model_group: NousResearch/Meta-Llama-
model_name: nousresearch-meta-llama_4939_v32
model_num_parameters: 8030261248.0
model_repo: NousResearch/Meta-Llama-3.1-8B-Instruct
model_size: 8B
num_battles: 11247
num_wins: 4812
ranking_group: single
status: torndown
submission_type: basic
timestamp: 2024-08-30T06:05:47+00:00
us_pacific_date: 2024-08-29
win_ratio: 0.4278474259802614
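The `win_ratio` field above is consistent with `num_wins / num_battles`. A minimal sketch of that arithmetic follows; the function name `win_ratio` is hypothetical, and the leaderboard may compute the value differently.

```python
def win_ratio(num_wins: int, num_battles: int) -> float:
    """Fraction of battles won; returns 0.0 when no battles were played."""
    if num_battles == 0:
        return 0.0
    return num_wins / num_battles


# Values copied from the metadata fields above.
ratio = win_ratio(4812, 11247)
print(f"win_ratio: {ratio}")
```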
Running pipeline stage MKMLizer
Starting job with name nousresearch-meta-llama-4939-v32-mkmlizer
Waiting for job on nousresearch-meta-llama-4939-v32-mkmlizer to finish
nousresearch-meta-llama-4939-v32-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
nousresearch-meta-llama-4939-v32-mkmlizer: ║ flywheel ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ Version: 0.10.1 ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ https://mk1.ai ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ The license key for the current software has been verified as ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ belonging to: ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ Chai Research Corp. ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
nousresearch-meta-llama-4939-v32-mkmlizer: ║ ║
nousresearch-meta-llama-4939-v32-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
nousresearch-meta-llama-4939-v32-mkmlizer: Downloaded to shared memory in 52.417s
nousresearch-meta-llama-4939-v32-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpsmwq4a07, device:0
nousresearch-meta-llama-4939-v32-mkmlizer: Saving flywheel model at /dev/shm/model_cache
nousresearch-meta-llama-4939-v32-mkmlizer: quantized model in 25.922s
nousresearch-meta-llama-4939-v32-mkmlizer: Processed model NousResearch/Meta-Llama-3.1-8B-Instruct in 78.339s
nousresearch-meta-llama-4939-v32-mkmlizer: creating bucket guanaco-mkml-models
nousresearch-meta-llama-4939-v32-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
nousresearch-meta-llama-4939-v32-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v32
nousresearch-meta-llama-4939-v32-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v32/config.json
nousresearch-meta-llama-4939-v32-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v32/special_tokens_map.json
nousresearch-meta-llama-4939-v32-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v32/tokenizer_config.json
nousresearch-meta-llama-4939-v32-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v32/tokenizer.json
nousresearch-meta-llama-4939-v32-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v32/flywheel_model.0.safetensors
nousresearch-meta-llama-4939-v32-mkmlizer: Loading 0: 100%|█████████▉| 290/291 [00:11<00:00, 3.79it/s]
Job nousresearch-meta-llama-4939-v32-mkmlizer completed after 108.48s with status: succeeded
Stopping job with name nousresearch-meta-llama-4939-v32-mkmlizer
Pipeline stage MKMLizer completed in 110.08s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.29s
Running pipeline stage MKMLDeployer
Creating inference service nousresearch-meta-llama-4939-v32
Waiting for inference service nousresearch-meta-llama-4939-v32 to be ready
Inference service nousresearch-meta-llama-4939-v32 ready after 182.36884880065918s
Pipeline stage MKMLDeployer completed in 183.25s
Running pipeline stage StressChecker
Received healthy response to inference request in 8.175087213516235s
Received healthy response to inference request in 2.003225803375244s
Received healthy response to inference request in 1.7437529563903809s
Received healthy response to inference request in 1.6045589447021484s
Received healthy response to inference request in 1.2784578800201416s
5 requests
0 failed requests
5th percentile: 1.343678092956543
10th percentile: 1.4088983058929443
20th percentile: 1.539338731765747
30th percentile: 1.632397747039795
40th percentile: 1.688075351715088
50th percentile: 1.7437529563903809
60th percentile: 1.8475420951843262
70th percentile: 1.9513312339782714
80th percentile: 3.2375980854034436
90th percentile: 5.70634264945984
95th percentile: 6.940714931488036
99th percentile: 7.928212757110596
mean time: 2.96101655960083
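The percentile and mean figures above match linear interpolation between the sorted response times of the five requests. A minimal sketch, assuming that interpolation rule, is below; the helper name `percentile` is hypothetical and the StressChecker's actual implementation is not shown in the log.

```python
# Response times (seconds) copied from the five healthy responses above.
latencies = [
    8.175087213516235,
    2.003225803375244,
    1.7437529563903809,
    1.6045589447021484,
    1.2784578800201416,
]


def percentile(values, q):
    """q-th percentile (0-100) using linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + fraction * (ordered[upper] - ordered[lower])


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, the first slow request (about 8.18 s) dominates every percentile above the 70th and pulls the mean well above the median.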
Pipeline stage StressChecker completed in 17.01s
Running pipeline stage TriggerMKMLProfilingPipeline
starting trigger_guanaco_pipeline %s
triggered trigger_guanaco_pipeline %s
Pipeline stage TriggerMKMLProfilingPipeline completed in 5.76s
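The literal `%s` in the two `trigger_guanaco_pipeline` lines above (and in later `skipped, reason=%s` lines) is what Python's standard `logging` module produces when a format string is logged without its arguments. Whether the pipeline uses `logging` this way is an assumption; the sketch below only demonstrates the behavior, and the helper name `render` is hypothetical.

```python
import logging


def render(msg, *args):
    """Return the text logging would emit for msg with the given args."""
    record = logging.LogRecord(
        name="pipeline", level=logging.INFO, pathname="", lineno=0,
        msg=msg, args=args, exc_info=None,
    )
    return record.getMessage()


# No arguments supplied: the placeholder is left untouched.
without_args = render("starting trigger_guanaco_pipeline %s")

# Argument supplied: the placeholder is substituted.
with_args = render(
    "starting trigger_guanaco_pipeline %s", "nousresearch-meta-llama_4939_v32"
)

print(without_args)
print(with_args)
```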
nousresearch-meta-llama_4939_v32 status is now deployed due to DeploymentManager action
nousresearch-meta-llama_4939_v32 status is now inactive due to auto deactivation removed underperforming models
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.41s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 191.75274205207825s
Pipeline stage MKMLProfilerDeployer completed in 193.00s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.41s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.18s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.49s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.21s
Running pipeline stage MKMLProfilerRunner
script pods %s
Pipeline stage MKMLProfilerRunner completed in 0.83s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.44s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.19s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.19s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
Service nousresearch-meta-llama-4939-v32-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.01s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.45s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.21s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.22s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.82s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.46s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.21s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.21s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.56s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.48s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 191.69111108779907s
Pipeline stage MKMLProfilerDeployer completed in 192.72s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.25s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
Service nousresearch-meta-llama-4939-v32-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.01s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.50s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.22s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.21s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.59s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.58s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.24s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.25s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.99s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.56s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.24s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.24s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 2.82s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.55s
Running pipeline stage MKMLProfilerDeployer
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerDeployer completed in 0.24s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.24s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 3.01s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.53s
Running pipeline stage MKMLProfilerDeployer
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.63s
Running pipeline stage MKMLProfilerDeployer
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Exception when calling CustomObjectsApi->get_namespaced_custom_object: (404) Reason: Not Found HTTP response headers: HTTPHeaderDict({'Audit-Id': '845ee47d-b41d-4ab8-8cb2-07aafd144181, 095289fa-2293-41a5-a04f-1365530aa9fc', 'Cache-Control': 'no-cache, private, no-cache, private', 'Content-Length': '322', 'Content-Type': 'application/json', 'Date': 'Fri, 30 Aug 2024 19:02:54 GMT', 'X-Kubernetes-Pf-Flowschema-Uid': '514c121f-0f8a-452c-aa56-437270a02244', 'X-Kubernetes-Pf-Prioritylevel-Uid': '48ad322a-4034-4c03-9ea4-7745e1e2c31a'}) HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"inferenceservices.serving.kserve.io \"nousresearch-meta-llama-4939-v32-profiler\" not found","reason":"NotFound","details":{"name":"nousresearch-meta-llama-4939-v32-profiler","group":"serving.kserve.io","kind":"inferenceservices"},"code":404}
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
404 Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'b30847f3-931a-42e4-83c3-d7749ea1cb0e, c0ba3aee-a78b-4992-80ce-d4107cbbee2f', 'Cache-Control': 'no-cache, private, no-cache, private', 'Content-Length': '322', 'Content-Type': 'application/json', 'Date': 'Fri, 30 Aug 2024 19:02:55 GMT', 'X-Kubernetes-Pf-Flowschema-Uid': '514c121f-0f8a-452c-aa56-437270a02244', 'X-Kubernetes-Pf-Prioritylevel-Uid': '48ad322a-4034-4c03-9ea4-7745e1e2c31a'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"inferenceservices.serving.kserve.io \\"nousresearch-meta-llama-4939-v32-profiler\\" not found","reason":"NotFound","details":{"name":"nousresearch-meta-llama-4939-v32-profiler","group":"serving.kserve.io","kind":"inferenceservices"},"code":404}\n'
Original traceback:
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/dynamic/client.py", line 55, in inner
    resp = func(self, *args, **kwargs)
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/dynamic/client.py", line 273, in request
    api_response = self.client.call_api(
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 415, in request
    return self.rest_client.DELETE(url,
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/rest.py", line 270, in DELETE
    return self.request("DELETE", url,
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/rest.py", line 238, in request
    raise ApiException(http_resp=r)
%s, retrying in %s seconds...
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
404 Reason: Not Found
HTTP response headers: HTTPHeaderDict({'Audit-Id': '16482460-5fb6-4545-a038-47b9cbb7e7ac, 5fa300b8-bf13-4d71-b260-df7c63e4f197', 'Cache-Control': 'no-cache, private, no-cache, private', 'Content-Length': '322', 'Content-Type': 'application/json', 'Date': 'Fri, 30 Aug 2024 19:02:56 GMT', 'X-Kubernetes-Pf-Flowschema-Uid': '514c121f-0f8a-452c-aa56-437270a02244', 'X-Kubernetes-Pf-Prioritylevel-Uid': '48ad322a-4034-4c03-9ea4-7745e1e2c31a'})
HTTP response body: b'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"inferenceservices.serving.kserve.io \\"nousresearch-meta-llama-4939-v32-profiler\\" not found","reason":"NotFound","details":{"name":"nousresearch-meta-llama-4939-v32-profiler","group":"serving.kserve.io","kind":"inferenceservices"},"code":404}\n'
Original traceback:
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/dynamic/client.py", line 55, in inner
    resp = func(self, *args, **kwargs)
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/dynamic/client.py", line 273, in request
    api_response = self.client.call_api(
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/api_client.py", line 415, in request
    return self.rest_client.DELETE(url,
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/rest.py", line 270, in DELETE
    return self.request("DELETE", url,
  File "/Users/albert/.pyenv/versions/3.12.0/lib/python3.12/site-packages/kubernetes/client/rest.py", line 238, in request
    raise ApiException(http_resp=r)
%s, retrying in %s seconds...
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.57s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 181.6351399421692s
Pipeline stage MKMLProfilerDeployer completed in 182.85s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.24s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
Service nousresearch-meta-llama-4939-v32-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.25s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.54s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 40.46404790878296s
Pipeline stage MKMLProfilerDeployer completed in 41.49s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.25s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.71s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Ignoring service nousresearch-meta-llama-4939-v32-profiler already deployed
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 10.212621212005615s
Pipeline stage MKMLProfilerDeployer completed in 11.41s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.29s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.68s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Ignoring service nousresearch-meta-llama-4939-v32-profiler already deployed
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 10.214231014251709s
Pipeline stage MKMLProfilerDeployer completed in 11.41s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.22s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.58s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Ignoring service nousresearch-meta-llama-4939-v32-profiler already deployed
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 10.375891923904419s
Pipeline stage MKMLProfilerDeployer completed in 63.83s
Running pipeline stage MKMLProfilerRunner
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerRunner completed in 0.26s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.56s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Ignoring service nousresearch-meta-llama-4939-v32-profiler already deployed
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 10.227960109710693s
Pipeline stage MKMLProfilerDeployer completed in 14.21s
Running pipeline stage MKMLProfilerRunner
script pods %s
Pipeline stage MKMLProfilerRunner completed in 0.92s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
Service nousresearch-meta-llama-4939-v32-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.31s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.68s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.63s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Ignoring service nousresearch-meta-llama-4939-v32-profiler already deployed
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 171.56513214111328s
Pipeline stage MKMLProfilerDeployer completed in 173.05s
Running pipeline stage MKMLProfilerRunner
script pods %s
Pipeline stage MKMLProfilerRunner completed in 1.12s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
Service nousresearch-meta-llama-4939-v32-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.68s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.58s
Running pipeline stage MKMLProfilerDeployer
Creating inference service nousresearch-meta-llama-4939-v32-profiler
Waiting for inference service nousresearch-meta-llama-4939-v32-profiler to be ready
Inference service nousresearch-meta-llama-4939-v32-profiler ready after 181.60373306274414s
Pipeline stage MKMLProfilerDeployer completed in 182.74s
Running pipeline stage MKMLProfilerRunner
script pods %s
Pipeline stage MKMLProfilerRunner completed in 1.04s
Running pipeline stage MKMLProfilerDeleter
Checking if service nousresearch-meta-llama-4939-v32-profiler is running
Tearing down inference service nousresearch-meta-llama-4939-v32-profiler
Service nousresearch-meta-llama-4939-v32-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.04s
nousresearch-meta-llama_4939_v32 status is now torndown due to DeploymentManager action