developer_uid: rinen0721
submission_id: rinen0721-llama0914_v4
model_name: rinen0721-llama0914_v4
model_group: rinen0721/llama0914
status: torndown
timestamp: 2024-09-14T12:53:13+00:00
num_battles: 10803
num_wins: 5004
celo_rating: 1226.88
family_friendly_score: 0.0
submission_type: basic
model_repo: rinen0721/llama0914
model_architecture: LlamaForCausalLM
model_num_parameters: 8030261248.0
best_of: 16
max_input_tokens: 512
max_output_tokens: 64
latencies: [{'batch_size': 1, 'throughput': 0.9077734108567004, 'latency_mean': 1.1015375065803528, 'latency_p50': 1.0980055332183838, 'latency_p90': 1.2300389051437377}, {'batch_size': 4, 'throughput': 1.803881698288466, 'latency_mean': 2.201165987253189, 'latency_p50': 2.221821904182434, 'latency_p90': 2.4705490112304687}, {'batch_size': 5, 'throughput': 1.857735196844779, 'latency_mean': 2.6773219513893127, 'latency_p50': 2.6921589374542236, 'latency_p90': 3.0114025592803957}, {'batch_size': 8, 'throughput': 2.013628149151397, 'latency_mean': 3.935594834089279, 'latency_p50': 3.948670983314514, 'latency_p90': 4.420715427398681}, {'batch_size': 10, 'throughput': 2.034064690343759, 'latency_mean': 4.867819367647171, 'latency_p50': 4.866330027580261, 'latency_p90': 5.644235420227051}, {'batch_size': 12, 'throughput': 2.0502282116460937, 'latency_mean': 5.7668782496452335, 'latency_p50': 5.781975626945496, 'latency_p90': 6.812478184700012}, {'batch_size': 15, 'throughput': 2.0064563266551794, 'latency_mean': 7.3357511746883395, 'latency_p50': 7.478062987327576, 'latency_p90': 8.041895818710326}]
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: rinen0721-llama0914_v4
is_internal_developer: False
language_model: rinen0721/llama0914
model_size: 8B
ranking_group: single
throughput_3p7s: 2.01
us_pacific_date: 2024-09-14
win_ratio: 0.46320466537073035
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 512, 'best_of': 16, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
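A minimal sketch of how templates like those in the `formatter` entry might be combined into a single prompt. The template strings are copied from the entry above; the assembly order (memory, prompt, chat turns, response prefix), the helper name `build_prompt`, and the sample persona and messages are assumptions for illustration and are not taken from the pipeline.

```python
# Template strings copied from the formatter entry in the metadata.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: concatenate the templates in an assumed order."""
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template leaves the bot's name open for the model to complete.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Made-up conversation, for illustration only.
example = build_prompt(
    bot_name="Ava",
    user_name="Sam",
    memory="A friendly guide.",
    prompt="Ava greets Sam.",
    turns=[("user", "Hi!"), ("bot", "Hello, Sam."), ("user", "How are you?")],
)
print(example)
```

Because `stopping_words` in `generation_params` is `['\n']`, a completion that follows such a prompt would end at the first newline, which fits the one-message-per-line layout of the templates.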
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rinen0721-llama0914-v4-mkmlizer
Waiting for job on rinen0721-llama0914-v4-mkmlizer to finish
rinen0721-llama0914-v4-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rinen0721-llama0914-v4-mkmlizer: ║ _____ __ __ ║
rinen0721-llama0914-v4-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rinen0721-llama0914-v4-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rinen0721-llama0914-v4-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rinen0721-llama0914-v4-mkmlizer: ║ /___/ ║
rinen0721-llama0914-v4-mkmlizer: ║ ║
rinen0721-llama0914-v4-mkmlizer: ║ Version: 0.10.1 ║
rinen0721-llama0914-v4-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rinen0721-llama0914-v4-mkmlizer: ║ https://mk1.ai ║
rinen0721-llama0914-v4-mkmlizer: ║ ║
rinen0721-llama0914-v4-mkmlizer: ║ The license key for the current software has been verified as ║
rinen0721-llama0914-v4-mkmlizer: ║ belonging to: ║
rinen0721-llama0914-v4-mkmlizer: ║ ║
rinen0721-llama0914-v4-mkmlizer: ║ Chai Research Corp. ║
rinen0721-llama0914-v4-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rinen0721-llama0914-v4-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
rinen0721-llama0914-v4-mkmlizer: ║ ║
rinen0721-llama0914-v4-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
rinen0721-llama0914-v4-mkmlizer: Traceback (most recent call last):
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
rinen0721-llama0914-v4-mkmlizer: response.raise_for_status()
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
rinen0721-llama0914-v4-mkmlizer: raise HTTPError(http_error_msg, response=self)
rinen0721-llama0914-v4-mkmlizer: requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://huggingface.co/rinen0721/llama0914/resolve/401297a7ad6499a219929a75af524726f6158afb/generation_config.json
rinen0721-llama0914-v4-mkmlizer: The above exception was the direct cause of the following exception:
rinen0721-llama0914-v4-mkmlizer: Traceback (most recent call last):
rinen0721-llama0914-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 151, in <module>
rinen0721-llama0914-v4-mkmlizer: cli()
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
rinen0721-llama0914-v4-mkmlizer: return self.main(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
rinen0721-llama0914-v4-mkmlizer: rv = self.invoke(ctx)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
rinen0721-llama0914-v4-mkmlizer: return _process_result(sub_ctx.command.invoke(sub_ctx))
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
rinen0721-llama0914-v4-mkmlizer: return ctx.invoke(self.callback, **ctx.params)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
rinen0721-llama0914-v4-mkmlizer: return __callback(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 38, in quantize
rinen0721-llama0914-v4-mkmlizer: temp_folder = download_to_shared_memory(repo_id, revision, hf_auth_token)
rinen0721-llama0914-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 65, in download_to_shared_memory
rinen0721-llama0914-v4-mkmlizer: snapshot_download(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
rinen0721-llama0914-v4-mkmlizer: return fn(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/_snapshot_download.py", line 297, in snapshot_download
rinen0721-llama0914-v4-mkmlizer: _inner_hf_hub_download(file)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/_snapshot_download.py", line 273, in _inner_hf_hub_download
rinen0721-llama0914-v4-mkmlizer: return hf_hub_download(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
rinen0721-llama0914-v4-mkmlizer: return f(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
rinen0721-llama0914-v4-mkmlizer: return fn(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1220, in hf_hub_download
rinen0721-llama0914-v4-mkmlizer: return _hf_hub_download_to_local_dir(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1515, in _hf_hub_download_to_local_dir
rinen0721-llama0914-v4-mkmlizer: _download_to_tmp_and_move(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1915, in _download_to_tmp_and_move
rinen0721-llama0914-v4-mkmlizer: http_get(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 463, in http_get
rinen0721-llama0914-v4-mkmlizer: r = _request_wrapper(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 400, in _request_wrapper
rinen0721-llama0914-v4-mkmlizer: hf_raise_for_status(response)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 371, in hf_raise_for_status
rinen0721-llama0914-v4-mkmlizer: raise HfHubHTTPError(str(e), response=response) from e
rinen0721-llama0914-v4-mkmlizer: huggingface_hub.utils._errors.HfHubHTTPError: 500 Server Error: Internal Server Error for url: https://huggingface.co/rinen0721/llama0914/resolve/401297a7ad6499a219929a75af524726f6158afb/generation_config.json (Request ID: Root=1-66e58743-5fd23bf32bc1c72146c7eb64;8dba9ac8-0491-4906-b5e7-e09f926d80a3)
rinen0721-llama0914-v4-mkmlizer: Internal Error - We're working hard to fix this as soon as possible!
Job rinen0721-llama0914-v4-mkmlizer completed after 23.65s with status: failed
Stopping job with name rinen0721-llama0914-v4-mkmlizer
%s, retrying in %s seconds...
Starting job with name rinen0721-llama0914-v4-mkmlizer
Waiting for job on rinen0721-llama0914-v4-mkmlizer to finish
rinen0721-llama0914-v4-mkmlizer: Traceback (most recent call last):
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 304, in hf_raise_for_status
rinen0721-llama0914-v4-mkmlizer: response.raise_for_status()
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/requests/models.py", line 1024, in raise_for_status
rinen0721-llama0914-v4-mkmlizer: raise HTTPError(http_error_msg, response=self)
rinen0721-llama0914-v4-mkmlizer: requests.exceptions.HTTPError: 500 Server Error: Internal Server Error for url: https://huggingface.co/rinen0721/llama0914/resolve/401297a7ad6499a219929a75af524726f6158afb/special_tokens_map.json
rinen0721-llama0914-v4-mkmlizer: The above exception was the direct cause of the following exception:
rinen0721-llama0914-v4-mkmlizer: Traceback (most recent call last):
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1751, in _get_metadata_or_catch_error
rinen0721-llama0914-v4-mkmlizer: metadata = get_hf_file_metadata(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
rinen0721-llama0914-v4-mkmlizer: return fn(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1673, in get_hf_file_metadata
rinen0721-llama0914-v4-mkmlizer: r = _request_wrapper(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 376, in _request_wrapper
rinen0721-llama0914-v4-mkmlizer: response = _request_wrapper(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 400, in _request_wrapper
rinen0721-llama0914-v4-mkmlizer: hf_raise_for_status(response)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_errors.py", line 371, in hf_raise_for_status
rinen0721-llama0914-v4-mkmlizer: raise HfHubHTTPError(str(e), response=response) from e
rinen0721-llama0914-v4-mkmlizer: huggingface_hub.utils._errors.HfHubHTTPError: 500 Server Error: Internal Server Error for url: https://huggingface.co/rinen0721/llama0914/resolve/401297a7ad6499a219929a75af524726f6158afb/special_tokens_map.json (Request ID: Root=1-66e58771-0b235db02fa140a92df9b712;3b9a1c4c-821f-464c-8d56-dac0a2f6eb4d)
rinen0721-llama0914-v4-mkmlizer: Internal Error - We're working hard to fix this as soon as possible!
rinen0721-llama0914-v4-mkmlizer: The above exception was the direct cause of the following exception:
rinen0721-llama0914-v4-mkmlizer: Traceback (most recent call last):
rinen0721-llama0914-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 151, in <module>
rinen0721-llama0914-v4-mkmlizer: cli()
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
rinen0721-llama0914-v4-mkmlizer: return self.main(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
rinen0721-llama0914-v4-mkmlizer: rv = self.invoke(ctx)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
rinen0721-llama0914-v4-mkmlizer: return _process_result(sub_ctx.command.invoke(sub_ctx))
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
rinen0721-llama0914-v4-mkmlizer: return ctx.invoke(self.callback, **ctx.params)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
rinen0721-llama0914-v4-mkmlizer: return __callback(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 38, in quantize
rinen0721-llama0914-v4-mkmlizer: temp_folder = download_to_shared_memory(repo_id, revision, hf_auth_token)
rinen0721-llama0914-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 65, in download_to_shared_memory
rinen0721-llama0914-v4-mkmlizer: snapshot_download(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
rinen0721-llama0914-v4-mkmlizer: return fn(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/_snapshot_download.py", line 297, in snapshot_download
rinen0721-llama0914-v4-mkmlizer: _inner_hf_hub_download(file)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/_snapshot_download.py", line 273, in _inner_hf_hub_download
rinen0721-llama0914-v4-mkmlizer: return hf_hub_download(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_deprecation.py", line 101, in inner_f
rinen0721-llama0914-v4-mkmlizer: return f(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
rinen0721-llama0914-v4-mkmlizer: return fn(*args, **kwargs)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1220, in hf_hub_download
rinen0721-llama0914-v4-mkmlizer: return _hf_hub_download_to_local_dir(
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1468, in _hf_hub_download_to_local_dir
rinen0721-llama0914-v4-mkmlizer: _raise_on_head_call_error(head_call_error, force_download, local_files_only)
rinen0721-llama0914-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1857, in _raise_on_head_call_error
rinen0721-llama0914-v4-mkmlizer: raise LocalEntryNotFoundError(
rinen0721-llama0914-v4-mkmlizer: huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.
Job rinen0721-llama0914-v4-mkmlizer completed after 56.26s with status: failed
Stopping job with name rinen0721-llama0914-v4-mkmlizer
%s, retrying in %s seconds...
Starting job with name rinen0721-llama0914-v4-mkmlizer
Waiting for job on rinen0721-llama0914-v4-mkmlizer to finish
rinen0721-llama0914-v4-mkmlizer: Downloaded to shared memory in 20.626s
rinen0721-llama0914-v4-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpwjnu6nm4, device:0
rinen0721-llama0914-v4-mkmlizer: Saving flywheel model at /dev/shm/model_cache
rinen0721-llama0914-v4-mkmlizer: Loading 0: 98%|█████████▊| 286/291 [00:06<00:00, 59.43it/s]
Failed to get response for submission blend_puheb_2024-09-09: ('http://chaiml-llama-8b-pairwis-8189-v19-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:49800->127.0.0.1:8080: read: connection reset by peer\n')
rinen0721-llama0914-v4-mkmlizer: quantized model in 26.462s
rinen0721-llama0914-v4-mkmlizer: Processed model rinen0721/llama0914 in 46.225s
rinen0721-llama0914-v4-mkmlizer: creating bucket guanaco-mkml-models
rinen0721-llama0914-v4-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rinen0721-llama0914-v4-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rinen0721-llama0914-v4
rinen0721-llama0914-v4-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rinen0721-llama0914-v4/config.json
rinen0721-llama0914-v4-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rinen0721-llama0914-v4/special_tokens_map.json
rinen0721-llama0914-v4-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rinen0721-llama0914-v4/tokenizer_config.json
rinen0721-llama0914-v4-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rinen0721-llama0914-v4/tokenizer.json
rinen0721-llama0914-v4-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rinen0721-llama0914-v4/flywheel_model.0.safetensors
rinen0721-llama0914-v4-mkmlizer: Loading 0: 100%|██████████| 291/291 [00:11<00:00, 5.73it/s]
Job rinen0721-llama0914-v4-mkmlizer completed after 69.66s with status: succeeded
Stopping job with name rinen0721-llama0914-v4-mkmlizer
Pipeline stage MKMLizer completed in 153.05s
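The MKMLizer stage failed twice on HTTP 500 responses from the Hub and succeeded on the third attempt. A minimal sketch of that retry-on-failure pattern follows. The helper `run_with_retries`, its parameters, the fixed delay, and the simulated job are hypothetical; the pipeline's actual retry policy is not shown in the log.

```python
import time


def run_with_retries(job, max_attempts=3, delay_seconds=0.0):
    """Call job() until it succeeds or max_attempts is reached.

    Hypothetical helper: returns (attempts_used, result) on success and
    re-raises the last error if every attempt fails.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        try:
            return attempt, job()
        except RuntimeError as error:  # stand-in for a transient download error
            last_error = error
            if attempt < max_attempts:
                time.sleep(delay_seconds)
    raise last_error


# Simulated job that fails twice and then succeeds, mirroring the three attempts.
outcomes = iter(
    [RuntimeError("500 Server Error"), RuntimeError("500 Server Error"), "succeeded"]
)


def flaky_job():
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


attempts_used, status = run_with_retries(flaky_job)
print(attempts_used, status)
```

A real implementation would probably catch a narrower exception type and wait longer between attempts; both choices are left open here because the log does not record them.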
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rinen0721-llama0914-v4
Waiting for inference service rinen0721-llama0914-v4 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Failed to get response for submission blend_hokok_2024-09-09: ('http://neversleep-noromaid-v0-8068-v150-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Inference service rinen0721-llama0914-v4 ready after 161.2427110671997s
Pipeline stage MKMLDeployer completed in 163.87s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.278333902359009s
Received healthy response to inference request in 2.200289487838745s
Received healthy response to inference request in 1.7183713912963867s
Received healthy response to inference request in 1.546417474746704s
Received healthy response to inference request in 2.0489349365234375s
5 requests
0 failed requests
5th percentile: 1.5808082580566407
10th percentile: 1.615199041366577
20th percentile: 1.68398060798645
30th percentile: 1.7844841003417968
40th percentile: 1.9167095184326173
50th percentile: 2.0489349365234375
60th percentile: 2.1094767570495607
70th percentile: 2.1700185775756835
80th percentile: 2.215898370742798
90th percentile: 2.247116136550903
95th percentile: 2.262725019454956
99th percentile: 2.2752121257781983
mean time: 1.9584694385528565
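The percentile figures reported by the StressChecker stage are consistent with linear interpolation between the sorted latencies of the five requests. A minimal sketch that recomputes them is shown here; the `percentile` helper is written for illustration only, and the stage's own implementation is not shown in the log.

```python
# Latencies (seconds) of the five healthy responses reported by StressChecker.
latencies = [
    2.278333902359009,
    2.200289487838745,
    1.7183713912963867,
    1.546417474746704,
    2.0489349365234375,
]


def percentile(values, q):
    """Percentile with linear interpolation between closest ranks (illustrative)."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, the high percentiles are interpolated almost entirely from the two slowest requests, so they say little about tail latency under real load.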
Pipeline stage StressChecker completed in 13.11s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 8.22s
Shutdown handler de-registered
rinen0721-llama0914_v4 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.23s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.18s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rinen0721-llama0914-v4-profiler
Waiting for inference service rinen0721-llama0914-v4-profiler to be ready
Inference service rinen0721-llama0914-v4-profiler ready after 170.3920373916626s
Pipeline stage MKMLProfilerDeployer completed in 171.03s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rinen0721-llama0914-v4-profiler-predictor-00001-deploymentkv6pt:/code/chaiverse_profiler_1726318948 --namespace tenant-chaiml-guanaco
kubectl exec -it rinen0721-llama0914-v4-profiler-predictor-00001-deploymentkv6pt --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1726318948 && python profiles.py profile --best_of_n 16 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 512 --output_tokens 64 --summary /code/chaiverse_profiler_1726318948/summary.json'
kubectl exec -it rinen0721-llama0914-v4-profiler-predictor-00001-deploymentkv6pt --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1726318948/summary.json'
Pipeline stage MKMLProfilerRunner completed in 842.14s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rinen0721-llama0914-v4-profiler is running
Tearing down inference service rinen0721-llama0914-v4-profiler
Service rinen0721-llama0914-v4-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 5.70s
Shutdown handler de-registered
rinen0721-llama0914_v4 status is now inactive due to auto deactivation removed underperforming models
rinen0721-llama0914_v4 status is now torndown due to DeploymentManager action