developer_uid: Nitral-AI
submission_id: nitral-ai-hathor-tahsin_6217_v10
model_name: nitral-ai-hathor-l3-8b-v-01_v12
model_group: Nitral-AI/Hathor_Tahsin-
status: torndown
timestamp: 2024-07-14T15:55:56+00:00
num_battles: 64654
num_wins: 36180
celo_rating: 1241.68
family_friendly_score: 0.0
submission_type: basic
model_repo: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
model_architecture: LlamaForCausalLM
reward_repo: ChaiML/gpt2_xl_pairwise_89m_step_347634
model_num_parameters: 8030261248.0
best_of: 16
max_input_tokens: 512
max_output_tokens: 64
display_name: nitral-ai-hathor-l3-8b-v-01_v12
is_internal_developer: False
language_model: Nitral-AI/Hathor_Tahsin-L3-8B-v0.85
model_size: 8B
ranking_group: single
us_pacific_date: 2024-07-14
win_ratio: 0.5595941473072045
generation_params: {'temperature': 1.25, 'top_p': 1.0, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '<|eot_id|>'], 'max_input_tokens': 512, 'best_of': 16, 'max_output_tokens': 64}
formatter: {'memory_template': "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{bot_name}'s Persona: {memory}\n\n", 'prompt_template': '{prompt}<|eot_id|>', 'bot_template': '<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}: {message}<|eot_id|>', 'user_template': '<|start_header_id|>user<|end_header_id|>\n\n{user_name}: {message}<|eot_id|>', 'response_template': '<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}:', 'truncate_by_message': False}
reward_formatter: {'bot_template': '{bot_name}: {message}\n', 'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'response_template': '{bot_name}:', 'truncate_by_message': False, 'user_template': '{user_name}: {message}\n'}
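The `formatter` templates above can be read as a recipe for building the Llama 3 style prompt sent to the model. The sketch below is a minimal illustration, not the pipeline's actual code: the helper `build_prompt`, its argument names, and the assembly order (memory, prompt, conversation turns, response prefix) are assumptions, and truncation to `max_input_tokens` is left out.

```python
# Template strings copied from the `formatter` field in the metadata above.
MEMORY_TEMPLATE = (
    "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n"
    "{bot_name}'s Persona: {memory}\n\n"
)
PROMPT_TEMPLATE = "{prompt}<|eot_id|>"
BOT_TEMPLATE = (
    "<|start_header_id|>assistant<|end_header_id|>\n\n"
    "{bot_name}: {message}<|eot_id|>"
)
USER_TEMPLATE = (
    "<|start_header_id|>user<|end_header_id|>\n\n"
    "{user_name}: {message}<|eot_id|>"
)
RESPONSE_TEMPLATE = "<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}:"


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates into one prompt string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "user" or "bot". The ordering used here is an assumption.
    """
    parts = [
        MEMORY_TEMPLATE.format(bot_name=bot_name, memory=memory),
        PROMPT_TEMPLATE.format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(BOT_TEMPLATE.format(bot_name=bot_name, message=message))
        else:
            parts.append(USER_TEMPLATE.format(user_name=user_name, message=message))
    # The response prefix leaves the assistant turn open for generation.
    parts.append(RESPONSE_TEMPLATE.format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    # Illustrative values only; none of these come from the log.
    print(build_prompt("Ava", "Sam", "cheerful", "A chat.", [("user", "Hi")]))
```

Because the response template ends with `{bot_name}:` and no `<|eot_id|>`, generation continues the bot's line until one of the `stopping_words` (`'\n'` or `'<|eot_id|>'`) is produced.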
Resubmit model
Running pipeline stage MKMLizer
Starting job with name nitral-ai-hathor-tahsin-6217-v10-mkmlizer
Waiting for job on nitral-ai-hathor-tahsin-6217-v10-mkmlizer to finish
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ _____ __ __ ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ /___/ ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ Version: 0.9.5.post2 ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ https://mk1.ai ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ The license key for the current software has been verified as ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ belonging to: ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ Chai Research Corp. ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ║ ║
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Downloaded to shared memory in 42.419s
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: quantizing model to /dev/shm/model_cache
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Saving flywheel model at /dev/shm/model_cache
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: lm_head.weight torch.Size([139542528])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.embed_tokens.weight torch.Size([139542528])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.0.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.0.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.0.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.0.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.0.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.0.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.1.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.1.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.1.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.1.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.1.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.1.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.10.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.10.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.10.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.10.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.10.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.10.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.11.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.11.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.11.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.11.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.11.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.11.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.12.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.12.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.12.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.12.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.12.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.12.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.13.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.13.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.13.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.13.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.13.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.13.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.14.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.14.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.14.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.14.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.14.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.14.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.15.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.15.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.15.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.15.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.15.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.15.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.16.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.16.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.16.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.16.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.16.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.16.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.17.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.17.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.17.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.17.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.17.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.17.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.18.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.18.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.18.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.18.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.18.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.18.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.19.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.19.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.19.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.19.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.19.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.19.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.2.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.2.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.2.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.2.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.2.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.2.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.20.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.20.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.20.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.20.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.20.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.20.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.21.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.21.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.21.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.21.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.21.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.21.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.22.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.22.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.22.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.22.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.22.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.22.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.23.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.23.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.23.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.23.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.23.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.23.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.24.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.24.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.24.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.24.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.24.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.24.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.25.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.25.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.25.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.25.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.25.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.25.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.26.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.26.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.26.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.26.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.26.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.26.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.27.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.27.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.27.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.27.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.27.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.27.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.28.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.28.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.28.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.28.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.28.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.28.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.29.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.29.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.29.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.29.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.29.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.29.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.3.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.3.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.3.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.3.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.3.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.3.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.30.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.30.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.30.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Loading 0: 61%|██████ | 177/291 [00:08<00:02, 45.56it/s]
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.30.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.30.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.30.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.31.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.31.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.31.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.31.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.31.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.31.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.4.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.4.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.4.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.4.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.4.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.4.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.5.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.5.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.5.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.5.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.5.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.5.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.6.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.6.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.6.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.6.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.6.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.6.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.7.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.7.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.7.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.7.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.7.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.7.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.8.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.8.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.8.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.8.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.8.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.8.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.9.input_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.9.mlp.down_proj.weight torch.Size([11927552])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.9.mlp.up_gate_proj.weight torch.Size([23855104])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.9.post_attention_layernorm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.9.self_attn.o_proj.weight torch.Size([3407872])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.layers.9.self_attn.qkv_proj.weight torch.Size([5111808])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: model.norm.weight torch.Size([4096])
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Loading 0: 100%|█████████▉| 290/291 [00:09<00:00, 64.46it/s] Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: quantized model in 30.404s
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Processed model Nitral-AI/Hathor_Tahsin-L3-8B-v0.85 in 72.824s
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: creating bucket guanaco-mkml-models
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/nitral-ai-hathor-tahsin-6217-v10
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/nitral-ai-hathor-tahsin-6217-v10/special_tokens_map.json
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/nitral-ai-hathor-tahsin-6217-v10/tokenizer_config.json
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/nitral-ai-hathor-tahsin-6217-v10/config.json
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/nitral-ai-hathor-tahsin-6217-v10/tokenizer.json
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/nitral-ai-hathor-tahsin-6217-v10/flywheel_model.0.safetensors
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: loading reward model from ChaiML/gpt2_xl_pairwise_89m_step_347634
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:950: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: warnings.warn(
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:778: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: warnings.warn(
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:469: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: warnings.warn(
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Downloading shards: 100%|██████████| 2/2 [00:09<00:00, 4.97s/it]
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Loading checkpoint shards: 100%|██████████| 2/2 [00:01<00:00, 1.51it/s]
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Saving model to /tmp/reward_cache/reward.tensors
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Saving duration: 2.320s
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Processed model ChaiML/gpt2_xl_pairwise_89m_step_347634 in 17.975s
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: creating bucket guanaco-reward-models
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: Bucket 's3://guanaco-reward-models/' created
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: uploading /tmp/reward_cache to s3://guanaco-reward-models/nitral-ai-hathor-tahsin-6217-v10_reward
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: cp /tmp/reward_cache/config.json s3://guanaco-reward-models/nitral-ai-hathor-tahsin-6217-v10_reward/config.json
nitral-ai-hathor-tahsin-6217-v10-mkmlizer: cp /tmp/reward_cache/reward.tensors s3://guanaco-reward-models/nitral-ai-hathor-tahsin-6217-v10_reward/reward.tensors
Job nitral-ai-hathor-tahsin-6217-v10-mkmlizer completed after 123.15s with status: succeeded
Stopping job with name nitral-ai-hathor-tahsin-6217-v10-mkmlizer
Pipeline stage MKMLizer completed in 124.32s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.12s
Running pipeline stage ISVCDeployer
Creating inference service nitral-ai-hathor-tahsin-6217-v10
Waiting for inference service nitral-ai-hathor-tahsin-6217-v10 to be ready
Inference service nitral-ai-hathor-tahsin-6217-v10 ready after 50.376763105392456s
Pipeline stage ISVCDeployer completed in 57.62s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.3348677158355713s
Received healthy response to inference request in 1.4829280376434326s
Received healthy response to inference request in 1.4665472507476807s
Received healthy response to inference request in 1.4122285842895508s
Received healthy response to inference request in 1.4870731830596924s
5 requests
0 failed requests
5th percentile: 1.4230923175811767
10th percentile: 1.4339560508728026
20th percentile: 1.4556835174560547
30th percentile: 1.469823408126831
40th percentile: 1.4763757228851317
50th percentile: 1.4829280376434326
60th percentile: 1.4845860958099366
70th percentile: 1.4862441539764404
80th percentile: 1.6566320896148683
90th percentile: 1.99574990272522
95th percentile: 2.165308809280395
99th percentile: 2.300955934524536
mean time: 1.6367289543151855
Pipeline stage StressChecker completed in 9.57s
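Note on the StressChecker figures: the log does not say how the percentiles are computed. The reported values are consistent with linear interpolation between the sorted samples (the same rule NumPy's `percentile` uses by default), so the sketch below assumes that method. The function names `percentile` and `mean` are illustrative and are not taken from the pipeline.

```python
import math

# The five latencies (seconds) reported by StressChecker above.
latencies = [
    2.3348677158355713,
    1.4829280376434326,
    1.4665472507476807,
    1.4122285842895508,
    1.4870731830596924,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation.

    Assumption: this mirrors the interpolation rule that reproduces the
    logged numbers; the pipeline's actual implementation is not shown.
    """
    ordered = sorted(samples)
    pos = (q / 100.0) * (len(ordered) - 1)   # fractional index into the sorted list
    lo = math.floor(pos)
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] + frac * (ordered[hi] - ordered[lo])


def mean(samples):
    """Arithmetic mean of the samples."""
    return sum(samples) / len(samples)


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(latencies, q)}")
    print(f"mean time: {mean(latencies)}")
```

With only five samples, the upper percentiles are dominated by the single 2.33 s request, so the 90th to 99th percentile values say little about steady-state latency.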
nitral-ai-hathor-tahsin_6217_v10 status is now deployed due to DeploymentManager action
nitral-ai-hathor-tahsin_6217_v10 status is now inactive due to auto deactivation removed underperforming models
admin requested tearing down of nitral-ai-hathor-tahsin_6217_v10
Running pipeline stage ISVCDeleter
Checking if service nitral-ai-hathor-tahsin-6217-v10 is running
Skipping teardown as no inference service was found
Pipeline stage ISVCDeleter completed in 5.01s
Running pipeline stage MKMLModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key nitral-ai-hathor-tahsin-6217-v10/config.json from bucket guanaco-mkml-models
Deleting key nitral-ai-hathor-tahsin-6217-v10/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key nitral-ai-hathor-tahsin-6217-v10/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key nitral-ai-hathor-tahsin-6217-v10/tokenizer.json from bucket guanaco-mkml-models
Deleting key nitral-ai-hathor-tahsin-6217-v10/tokenizer_config.json from bucket guanaco-mkml-models
Cleaning model data from model cache
Deleting key nitral-ai-hathor-tahsin-6217-v10_reward/config.json from bucket guanaco-reward-models
Deleting key nitral-ai-hathor-tahsin-6217-v10_reward/merges.txt from bucket guanaco-reward-models
Deleting key nitral-ai-hathor-tahsin-6217-v10_reward/reward.tensors from bucket guanaco-reward-models
Deleting key nitral-ai-hathor-tahsin-6217-v10_reward/special_tokens_map.json from bucket guanaco-reward-models
Deleting key nitral-ai-hathor-tahsin-6217-v10_reward/tokenizer.json from bucket guanaco-reward-models
Deleting key nitral-ai-hathor-tahsin-6217-v10_reward/tokenizer_config.json from bucket guanaco-reward-models
Deleting key nitral-ai-hathor-tahsin-6217-v10_reward/vocab.json from bucket guanaco-reward-models
Pipeline stage MKMLModelDeleter completed in 7.12s
nitral-ai-hathor-tahsin_6217_v10 status is now torndown due to DeploymentManager action