developer_uid: Bbbrun0
submission_id: bbchicago-test-120k-pref-dpo_v1
model_name: bbchicago-test_120k_pref_dpo
model_group: BBChicago/test_120k_pref
status: torndown
timestamp: 2024-07-16T06:27:53+00:00
num_battles: 141994
num_wins: 71404
celo_rating: 1213.55
family_friendly_score: 0.0
submission_type: basic
model_repo: BBChicago/test_120k_pref_dpo
model_architecture: LlamaForCausalLM
reward_repo: Jellywibble/gpt2_xl_pairwise_89m_step_347634
model_num_parameters: 8030261248.0
best_of: 16
max_input_tokens: 512
max_output_tokens: 64
display_name: bbchicago-test_120k_pref_dpo
is_internal_developer: False
language_model: BBChicago/test_120k_pref_dpo
model_size: 8B
ranking_group: single
us_pacific_date: 2024-07-15
win_ratio: 0.5028663182951393
generation_params: {'temperature': 0.95, 'top_p': 0.95, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '<|end_header_id|>', '<|eot_id|>', '\n\n{user_name}'], 'max_input_tokens': 512, 'best_of': 16, 'max_output_tokens': 64}
formatter: {'memory_template': "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{bot_name}'s Persona: {memory}\n\n", 'prompt_template': '{prompt}<|eot_id|>', 'bot_template': '<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}: {message}<|eot_id|>', 'user_template': '<|start_header_id|>user<|end_header_id|>\n\n{user_name}: {message}<|eot_id|>', 'response_template': '<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}:', 'truncate_by_message': False}
reward_formatter: {'bot_template': '{bot_name}: {message}\n', 'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'response_template': '{bot_name}:', 'truncate_by_message': False, 'user_template': '{user_name}: {message}\n'}
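The `formatter` field above is a set of format-string templates. A minimal sketch of how they could be combined into one model prompt follows. The assembly order (memory, prompt, conversation turns, response template) and the `build_prompt` helper are assumptions for illustration and do not come from this log; truncation to `max_input_tokens` is not modelled.

```python
# Templates copied verbatim from the `formatter` field of this submission.
FORMATTER = {
    "memory_template": "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{bot_name}'s Persona: {memory}\n\n",
    "prompt_template": "{prompt}<|eot_id|>",
    "bot_template": "<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}: {message}<|eot_id|>",
    "user_template": "<|start_header_id|>user<|end_header_id|>\n\n{user_name}: {message}<|eot_id|>",
    "response_template": "<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, speaker being "user" or "bot".
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(FORMATTER["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(FORMATTER["user_template"].format(user_name=user_name, message=message))
    # The response template leaves the assistant turn open for generation.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("Bot", "User", "a friendly guide", "Scenario text", [("user", "Hi there")]))
```

The stopping words in `generation_params` (`'\n'`, `'<|eot_id|>'`, and so on) would then end the generated assistant turn.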
Resubmit model
Running pipeline stage MKMLizer
Starting job with name bbchicago-test-120k-pref-dpo-v1-mkmlizer
Waiting for job on bbchicago-test-120k-pref-dpo-v1-mkmlizer to finish
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ _____ __ __ ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ /___/ ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ Version: 0.9.5.post2 ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ https://mk1.ai ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ The license key for the current software has been verified as ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ belonging to: ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ Chai Research Corp. ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ║ ║
bbchicago-test-120k-pref-dpo-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Connection pool is full, discarding connection: %s. Connection pool size: %s
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Traceback (most recent call last):
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 203, in _new_conn
bbchicago-test-120k-pref-dpo-v1-mkmlizer: sock = connection.create_connection(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
bbchicago-test-120k-pref-dpo-v1-mkmlizer: raise err
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/util/connection.py", line 73, in create_connection
bbchicago-test-120k-pref-dpo-v1-mkmlizer: sock.connect(sa)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: OSError: [Errno 101] Network is unreachable
bbchicago-test-120k-pref-dpo-v1-mkmlizer: The above exception was the direct cause of the following exception:
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Traceback (most recent call last):
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 790, in urlopen
bbchicago-test-120k-pref-dpo-v1-mkmlizer: response = self._make_request(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 491, in _make_request
bbchicago-test-120k-pref-dpo-v1-mkmlizer: raise new_e
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 467, in _make_request
bbchicago-test-120k-pref-dpo-v1-mkmlizer: self._validate_conn(conn)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1096, in _validate_conn
bbchicago-test-120k-pref-dpo-v1-mkmlizer: conn.connect()
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 611, in connect
bbchicago-test-120k-pref-dpo-v1-mkmlizer: self.sock = sock = self._new_conn()
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/connection.py", line 218, in _new_conn
bbchicago-test-120k-pref-dpo-v1-mkmlizer: raise NewConnectionError(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7e45a2883dc0>: Failed to establish a new connection: [Errno 101] Network is unreachable
bbchicago-test-120k-pref-dpo-v1-mkmlizer: The above exception was the direct cause of the following exception:
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Traceback (most recent call last):
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 667, in send
bbchicago-test-120k-pref-dpo-v1-mkmlizer: resp = conn.urlopen(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/connectionpool.py", line 844, in urlopen
bbchicago-test-120k-pref-dpo-v1-mkmlizer: retries = retries.increment(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/urllib3/util/retry.py", line 515, in increment
bbchicago-test-120k-pref-dpo-v1-mkmlizer: raise MaxRetryError(_pool, url, reason) from reason # type: ignore[arg-type]
bbchicago-test-120k-pref-dpo-v1-mkmlizer: urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/BBChicago/test_120k_pref_dpo/paths-info/main (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7e45a2883dc0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))
bbchicago-test-120k-pref-dpo-v1-mkmlizer: During handling of the above exception, another exception occurred:
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Traceback (most recent call last):
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/code/uploading/mkmlize.py", line 151, in <module>
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cli()
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
bbchicago-test-120k-pref-dpo-v1-mkmlizer: return self.main(*args, **kwargs)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
bbchicago-test-120k-pref-dpo-v1-mkmlizer: rv = self.invoke(ctx)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
bbchicago-test-120k-pref-dpo-v1-mkmlizer: return _process_result(sub_ctx.command.invoke(sub_ctx))
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
bbchicago-test-120k-pref-dpo-v1-mkmlizer: return ctx.invoke(self.callback, **ctx.params)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
bbchicago-test-120k-pref-dpo-v1-mkmlizer: return __callback(*args, **kwargs)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/code/uploading/mkmlize.py", line 38, in quantize
bbchicago-test-120k-pref-dpo-v1-mkmlizer: temp_folder = download_to_shared_memory(repo_id, revision, hf_auth_token)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/code/uploading/mkmlize.py", line 60, in download_to_shared_memory
bbchicago-test-120k-pref-dpo-v1-mkmlizer: if repo_has_model_safetensors(repo_id, revision, token):
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/code/uploading/mkmlize.py", line 83, in repo_has_model_safetensors
bbchicago-test-120k-pref-dpo-v1-mkmlizer: files = [f.path for f in get_paths_info(repo_id, revision=revision, paths=["/"], token=token)]
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
bbchicago-test-120k-pref-dpo-v1-mkmlizer: return fn(*args, **kwargs)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/hf_api.py", line 3037, in get_paths_info
bbchicago-test-120k-pref-dpo-v1-mkmlizer: response = get_session().post(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 637, in post
bbchicago-test-120k-pref-dpo-v1-mkmlizer: return self.request("POST", url, data=data, json=json, **kwargs)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
bbchicago-test-120k-pref-dpo-v1-mkmlizer: resp = self.send(prep, **send_kwargs)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
bbchicago-test-120k-pref-dpo-v1-mkmlizer: r = adapter.send(request, **kwargs)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 66, in send
bbchicago-test-120k-pref-dpo-v1-mkmlizer: return super().send(request, *args, **kwargs)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/requests/adapters.py", line 700, in send
bbchicago-test-120k-pref-dpo-v1-mkmlizer: raise ConnectionError(e, request=request)
bbchicago-test-120k-pref-dpo-v1-mkmlizer: requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /api/models/BBChicago/test_120k_pref_dpo/paths-info/main (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7e45a2883dc0>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 1367b137-1232-4ccc-bca1-52281fcad937)')
Job bbchicago-test-120k-pref-dpo-v1-mkmlizer completed after 555.37s with status: failed
Stopping job with name bbchicago-test-120k-pref-dpo-v1-mkmlizer
%s, retrying in %s seconds...
Starting job with name bbchicago-test-120k-pref-dpo-v1-mkmlizer
Waiting for job on bbchicago-test-120k-pref-dpo-v1-mkmlizer to finish
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Downloaded to shared memory in 40.359s
bbchicago-test-120k-pref-dpo-v1-mkmlizer: quantizing model to /dev/shm/model_cache
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.embed_tokens.weight torch.Size([139542528])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.0.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.0.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.0.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.0.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.0.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.0.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.1.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.1.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.1.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.1.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.1.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.1.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.2.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.2.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.2.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.2.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.2.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.2.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.3.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.3.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.3.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.3.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.3.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.3.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.4.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.4.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.4.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.4.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.4.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.4.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.5.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.5.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.5.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.5.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.5.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.5.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.6.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.6.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.6.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.6.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.6.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.6.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.7.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.7.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.7.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.7.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.7.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.7.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.8.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.8.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.8.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.8.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.8.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.8.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.10.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.10.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.10.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.10.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.10.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.10.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.11.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.11.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.11.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.11.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.11.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.11.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.12.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.12.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.12.mlp.up_gate_proj.weight torch.Size([23855104])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.12.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.12.self_attn.o_proj.weight torch.Size([3407872])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.12.self_attn.qkv_proj.weight torch.Size([5111808])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.13.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.13.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: lm_head.weight torch.Size([139542528])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.31.input_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.31.mlp.down_proj.weight torch.Size([11927552])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.layers.31.post_attention_layernorm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: model.norm.weight torch.Size([4096])
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Loading 0: 100%|█████████▉| 290/291 [00:10<00:00, 112.34it/s]
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
bbchicago-test-120k-pref-dpo-v1-mkmlizer: quantized model in 29.751s
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Processed model BBChicago/test_120k_pref_dpo in 70.110s
bbchicago-test-120k-pref-dpo-v1-mkmlizer: creating bucket guanaco-mkml-models
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
bbchicago-test-120k-pref-dpo-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/bbchicago-test-120k-pref-dpo-v1
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/bbchicago-test-120k-pref-dpo-v1/config.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/bbchicago-test-120k-pref-dpo-v1/tokenizer_config.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/bbchicago-test-120k-pref-dpo-v1/special_tokens_map.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/bbchicago-test-120k-pref-dpo-v1/tokenizer.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/bbchicago-test-120k-pref-dpo-v1/flywheel_model.0.safetensors
bbchicago-test-120k-pref-dpo-v1-mkmlizer: loading reward model from Jellywibble/gpt2_xl_pairwise_89m_step_347634
bbchicago-test-120k-pref-dpo-v1-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:950: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
bbchicago-test-120k-pref-dpo-v1-mkmlizer: warnings.warn(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:778: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
bbchicago-test-120k-pref-dpo-v1-mkmlizer: warnings.warn(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:469: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
bbchicago-test-120k-pref-dpo-v1-mkmlizer: warnings.warn(
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Saving model to /tmp/reward_cache/reward.tensors
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Saving duration: 2.283s
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Processed model Jellywibble/gpt2_xl_pairwise_89m_step_347634 in 13.934s
bbchicago-test-120k-pref-dpo-v1-mkmlizer: creating bucket guanaco-reward-models
bbchicago-test-120k-pref-dpo-v1-mkmlizer: Bucket 's3://guanaco-reward-models/' created
bbchicago-test-120k-pref-dpo-v1-mkmlizer: uploading /tmp/reward_cache to s3://guanaco-reward-models/bbchicago-test-120k-pref-dpo-v1_reward
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /tmp/reward_cache/config.json s3://guanaco-reward-models/bbchicago-test-120k-pref-dpo-v1_reward/config.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /tmp/reward_cache/special_tokens_map.json s3://guanaco-reward-models/bbchicago-test-120k-pref-dpo-v1_reward/special_tokens_map.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /tmp/reward_cache/merges.txt s3://guanaco-reward-models/bbchicago-test-120k-pref-dpo-v1_reward/merges.txt
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /tmp/reward_cache/tokenizer_config.json s3://guanaco-reward-models/bbchicago-test-120k-pref-dpo-v1_reward/tokenizer_config.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /tmp/reward_cache/vocab.json s3://guanaco-reward-models/bbchicago-test-120k-pref-dpo-v1_reward/vocab.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /tmp/reward_cache/tokenizer.json s3://guanaco-reward-models/bbchicago-test-120k-pref-dpo-v1_reward/tokenizer.json
bbchicago-test-120k-pref-dpo-v1-mkmlizer: cp /tmp/reward_cache/reward.tensors s3://guanaco-reward-models/bbchicago-test-120k-pref-dpo-v1_reward/reward.tensors
Job bbchicago-test-120k-pref-dpo-v1-mkmlizer completed after 142.33s with status: succeeded
Stopping job with name bbchicago-test-120k-pref-dpo-v1-mkmlizer
Pipeline stage MKMLizer completed in 699.40s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.12s
Running pipeline stage ISVCDeployer
Creating inference service bbchicago-test-120k-pref-dpo-v1
Waiting for inference service bbchicago-test-120k-pref-dpo-v1 to be ready
Inference service bbchicago-test-120k-pref-dpo-v1 ready after 40.20529055595398s
Pipeline stage ISVCDeployer completed in 47.13s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.2271108627319336s
Received healthy response to inference request in 1.4690263271331787s
Received healthy response to inference request in 1.4636867046356201s
Received healthy response to inference request in 1.4765117168426514s
Received healthy response to inference request in 1.4906196594238281s
5 requests
0 failed requests
5th percentile: 1.4647546291351319
10th percentile: 1.4658225536346436
20th percentile: 1.467958402633667
30th percentile: 1.4705234050750733
40th percentile: 1.4735175609588622
50th percentile: 1.4765117168426514
60th percentile: 1.4821548938751221
70th percentile: 1.4877980709075929
80th percentile: 1.6379179000854494
90th percentile: 1.9325143814086916
95th percentile: 2.079812622070312
99th percentile: 2.197651214599609
mean time: 1.6253910541534424
Pipeline stage StressChecker completed in 10.24s
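The StressChecker summary above can be reproduced from the five logged latencies. The sketch below assumes linearly interpolated percentiles over the sorted samples; the log does not say how the figures were computed, but that method matches the reported values. The `percentile` and `mean` helpers are illustrative names, not part of the pipeline.

```python
import math

# Latencies in seconds, copied from the five "healthy response" lines above.
LATENCIES = [
    2.2271108627319336,
    1.4690263271331787,
    1.4636867046356201,
    1.4765117168426514,
    1.4906196594238281,
]


def percentile(values, q):
    """Percentile q (0..100) with linear interpolation between sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (rank - lower)


def mean(values):
    """Arithmetic mean of the samples."""
    return sum(values) / len(values)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {mean(LATENCIES)}")
```

With only five samples, every percentile above the 75th is an interpolation between the two slowest requests, so the single 2.23 s first response dominates the tail figures.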
bbchicago-test-120k-pref-dpo_v1 status is now deployed due to DeploymentManager action
bbchicago-test-120k-pref-dpo_v1 status is now inactive due to auto deactivation removed underperforming models
admin requested tearing down of bbchicago-test-120k-pref-dpo_v1
Running pipeline stage ISVCDeleter
Checking if service bbchicago-test-120k-pref-dpo-v1 is running
Skipping teardown as no inference service was found
Pipeline stage ISVCDeleter completed in 5.32s
Running pipeline stage MKMLModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key bbchicago-test-120k-pref-dpo-v1/config.json from bucket guanaco-mkml-models
Deleting key bbchicago-test-120k-pref-dpo-v1/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key bbchicago-test-120k-pref-dpo-v1/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key bbchicago-test-120k-pref-dpo-v1/tokenizer.json from bucket guanaco-mkml-models
Deleting key bbchicago-test-120k-pref-dpo-v1/tokenizer_config.json from bucket guanaco-mkml-models
Cleaning model data from model cache
Deleting key bbchicago-test-120k-pref-dpo-v1_reward/config.json from bucket guanaco-reward-models
Deleting key bbchicago-test-120k-pref-dpo-v1_reward/merges.txt from bucket guanaco-reward-models
Deleting key bbchicago-test-120k-pref-dpo-v1_reward/reward.tensors from bucket guanaco-reward-models
Deleting key bbchicago-test-120k-pref-dpo-v1_reward/special_tokens_map.json from bucket guanaco-reward-models
Deleting key bbchicago-test-120k-pref-dpo-v1_reward/tokenizer.json from bucket guanaco-reward-models
Deleting key bbchicago-test-120k-pref-dpo-v1_reward/tokenizer_config.json from bucket guanaco-reward-models
Deleting key bbchicago-test-120k-pref-dpo-v1_reward/vocab.json from bucket guanaco-reward-models
Pipeline stage MKMLModelDeleter completed in 6.90s
bbchicago-test-120k-pref-dpo_v1 status is now torndown due to DeploymentManager action