submission_id: nousresearch-meta-llama_4941_v82
developer_uid: end_to_end_test
alignment_samples: 0
best_of: 4
celo_rating: 1124.28
display_name: nousresearch-meta-llama_4941_v82
formatter: {'memory_template': 'character: {bot_name} {memory}\n', 'prompt_template': '{prompt}', 'bot_template': '{bot_name}: {message}', 'user_template': '{user_name}: {message}', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 1.0, 'top_p': 0.99, 'min_p': 0.1, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 512, 'best_of': 4, 'max_output_tokens': 64}
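For reference, a minimal sketch of how the formatter templates above could assemble a serving prompt. This is an assumption about the assembly order (memory block, then prompt, then chat turns, then the response prefix) using plain Python `str.format` substitution; the actual serving code may differ, and `build_prompt` and its arguments are illustrative names only.

```python
# Sketch only: assumes memory -> prompt -> turns -> response-prefix order
# and plain str.format substitution. Not the actual serving implementation.
formatter = {
    'memory_template': 'character: {bot_name} {memory}\n',
    'prompt_template': '{prompt}',
    'bot_template': '{bot_name}: {message}',
    'user_template': '{user_name}: {message}',
    'response_template': '{bot_name}:',
}

def build_prompt(bot_name, memory, prompt, turns):
    """turns: list of (speaker_name, is_bot, message) tuples."""
    parts = [formatter['memory_template'].format(bot_name=bot_name, memory=memory)]
    parts.append(formatter['prompt_template'].format(prompt=prompt))
    for speaker, is_bot, message in turns:
        if is_bot:
            parts.append(formatter['bot_template'].format(bot_name=speaker, message=message))
        else:
            parts.append(formatter['user_template'].format(user_name=speaker, message=message))
    # Response template ends the prompt so the model completes the bot's turn.
    parts.append(formatter['response_template'].format(bot_name=bot_name))
    return '\n'.join(parts)
```

Under this reading, generation stops at the first newline (`stopping_words: ['\n']`), which keeps each completion to a single chat turn.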
ineligible_reason: model is only for e2e test
is_internal_developer: True
language_model: NousResearch/Meta-Llama-3-8B-Instruct
max_input_tokens: 512
max_output_tokens: 64
model_architecture: LlamaForCausalLM
model_group: NousResearch/Meta-Llama-
model_name: nousresearch-meta-llama_4941_v82
model_num_parameters: 8030261248.0
model_repo: NousResearch/Meta-Llama-3-8B-Instruct
model_size: 8B
num_battles: 132409
num_wins: 52038
propriety_score: 0.7309046110524463
propriety_total_count: 11364.0
ranking_group: single
reward_formatter: {'bot_template': '{bot_name}: {message}', 'memory_template': 'character: {bot_name} {memory}\n', 'prompt_template': '{prompt}', 'response_template': '{bot_name}:', 'truncate_by_message': False, 'user_template': '{user_name}: {message}'}
reward_repo: ChaiML/reward_models_100_170000000_cp_498032
status: torndown
submission_type: basic
timestamp: 2024-07-13T23:30:08+00:00
us_pacific_date: 2024-07-13
win_ratio: 0.39300953862652843
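As a sanity check, the win_ratio above is simply num_wins divided by num_battles:

```python
# win_ratio is derived from the two battle counters reported above.
num_battles = 132409
num_wins = 52038
win_ratio = num_wins / num_battles
assert abs(win_ratio - 0.39300953862652843) < 1e-12  # matches the logged value
```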
Resubmit model
Running pipeline stage MKMLizer
Starting job with name nousresearch-meta-llama-4941-v82-mkmlizer
Waiting for job on nousresearch-meta-llama-4941-v82-mkmlizer to finish
nousresearch-meta-llama-4941-v82-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
nousresearch-meta-llama-4941-v82-mkmlizer: ║ _____ __ __ ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ /___/ ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ Version: 0.9.5.post2 ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ https://mk1.ai ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ The license key for the current software has been verified as ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ belonging to: ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ Chai Research Corp. ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
nousresearch-meta-llama-4941-v82-mkmlizer: ║ ║
nousresearch-meta-llama-4941-v82-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
nousresearch-meta-llama-4941-v82-mkmlizer: Downloaded to shared memory in 23.651s
nousresearch-meta-llama-4941-v82-mkmlizer: quantizing model to /dev/shm/model_cache
nousresearch-meta-llama-4941-v82-mkmlizer: Saving flywheel model at /dev/shm/model_cache
nousresearch-meta-llama-4941-v82-mkmlizer: lm_head.weight torch.Size([139542528])
nousresearch-meta-llama-4941-v82-mkmlizer: model.layers.31.input_layernorm.weight torch.Size([4096])
nousresearch-meta-llama-4941-v82-mkmlizer: model.layers.31.mlp.down_proj.weight torch.Size([11927552])
nousresearch-meta-llama-4941-v82-mkmlizer: model.layers.31.post_attention_layernorm.weight torch.Size([4096])
nousresearch-meta-llama-4941-v82-mkmlizer: model.norm.weight torch.Size([4096])
nousresearch-meta-llama-4941-v82-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/nousresearch-meta-llama-4941-v82/flywheel_model.0.safetensors
nousresearch-meta-llama-4941-v82-mkmlizer: loading reward model from ChaiML/reward_models_100_170000000_cp_498032
nousresearch-meta-llama-4941-v82-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:950: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nousresearch-meta-llama-4941-v82-mkmlizer: warnings.warn(
nousresearch-meta-llama-4941-v82-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:778: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nousresearch-meta-llama-4941-v82-mkmlizer: warnings.warn(
nousresearch-meta-llama-4941-v82-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:469: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nousresearch-meta-llama-4941-v82-mkmlizer: warnings.warn(
nousresearch-meta-llama-4941-v82-mkmlizer: Saving model to /tmp/reward_cache/reward.tensors
nousresearch-meta-llama-4941-v82-mkmlizer: Saving duration: 0.185s
nousresearch-meta-llama-4941-v82-mkmlizer: Processed model ChaiML/reward_models_100_170000000_cp_498032 in 3.283s
nousresearch-meta-llama-4941-v82-mkmlizer: creating bucket guanaco-reward-models
nousresearch-meta-llama-4941-v82-mkmlizer: Bucket 's3://guanaco-reward-models/' created
nousresearch-meta-llama-4941-v82-mkmlizer: uploading /tmp/reward_cache to s3://guanaco-reward-models/nousresearch-meta-llama-4941-v82_reward
nousresearch-meta-llama-4941-v82-mkmlizer: cp /tmp/reward_cache/config.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v82_reward/config.json
nousresearch-meta-llama-4941-v82-mkmlizer: cp /tmp/reward_cache/special_tokens_map.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v82_reward/special_tokens_map.json
nousresearch-meta-llama-4941-v82-mkmlizer: cp /tmp/reward_cache/tokenizer_config.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v82_reward/tokenizer_config.json
nousresearch-meta-llama-4941-v82-mkmlizer: cp /tmp/reward_cache/vocab.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v82_reward/vocab.json
nousresearch-meta-llama-4941-v82-mkmlizer: cp /tmp/reward_cache/merges.txt s3://guanaco-reward-models/nousresearch-meta-llama-4941-v82_reward/merges.txt
nousresearch-meta-llama-4941-v82-mkmlizer: cp /tmp/reward_cache/tokenizer.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v82_reward/tokenizer.json
nousresearch-meta-llama-4941-v82-mkmlizer: cp /tmp/reward_cache/reward.tensors s3://guanaco-reward-models/nousresearch-meta-llama-4941-v82_reward/reward.tensors
Job nousresearch-meta-llama-4941-v82-mkmlizer completed after 149.86s with status: succeeded
Stopping job with name nousresearch-meta-llama-4941-v82-mkmlizer
Pipeline stage MKMLizer completed in 151.53s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.26s
Running pipeline stage ISVCDeployer
Creating inference service nousresearch-meta-llama-4941-v82
Waiting for inference service nousresearch-meta-llama-4941-v82 to be ready
Inference service nousresearch-meta-llama-4941-v82 ready after 242.5205340385437s
Pipeline stage ISVCDeployer completed in 248.89s
Running pipeline stage StressChecker
Received healthy response to inference request in 6.192289113998413s
Received healthy response to inference request in 6.507264852523804s
Received healthy response to inference request in 4.029428958892822s
Received healthy response to inference request in 1.2038629055023193s
Received healthy response to inference request in 4.238800764083862s
5 requests
0 failed requests
5th percentile: 1.7689761161804198
10th percentile: 2.3340893268585203
20th percentile: 3.4643157482147218
30th percentile: 4.0713033199310305
40th percentile: 4.155052042007446
50th percentile: 4.238800764083862
60th percentile: 5.020196104049682
70th percentile: 5.801591444015503
80th percentile: 6.255284261703491
90th percentile: 6.381274557113647
95th percentile: 6.444269704818725
99th percentile: 6.494665822982788
mean time: 4.434329319000244
%s, retrying in %s seconds...
Received healthy response to inference request in 1.4087669849395752s
Received healthy response to inference request in 1.2455508708953857s
Received healthy response to inference request in 1.1614630222320557s
Received healthy response to inference request in 1.689068078994751s
Received healthy response to inference request in 1.17252516746521s
5 requests
0 failed requests
5th percentile: 1.1636754512786864
10th percentile: 1.1658878803253174
20th percentile: 1.1703127384185792
30th percentile: 1.1871303081512452
40th percentile: 1.2163405895233155
50th percentile: 1.2455508708953857
60th percentile: 1.3108373165130616
70th percentile: 1.3761237621307372
80th percentile: 1.4648272037506105
90th percentile: 1.5769476413726806
95th percentile: 1.6330078601837157
99th percentile: 1.677856035232544
mean time: 1.3354748249053956
Pipeline stage StressChecker completed in 33.12s
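The percentile figures in the StressChecker output are consistent with linearly interpolated percentiles over the five request latencies. A sketch (assuming numpy-style "linear" interpolation between closest ranks; the `percentile` helper below is illustrative, not the checker's actual code) reproduces the first batch's numbers:

```python
# Reproduce the StressChecker stats for the first batch of five requests.
# Assumption: linear interpolation between closest ranks (numpy's default method).
latencies = [
    6.192289113998413,
    6.507264852523804,
    4.029428958892822,
    1.2038629055023193,
    4.238800764083862,
]

def percentile(data, p):
    """p-th percentile with linear interpolation between closest ranks."""
    s = sorted(data)
    pos = (p / 100) * (len(s) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])

assert abs(percentile(latencies, 5) - 1.7689761161804198) < 1e-9
assert abs(percentile(latencies, 50) - 4.238800764083862) < 1e-9
assert abs(percentile(latencies, 95) - 6.444269704818725) < 1e-9
assert abs(sum(latencies) / len(latencies) - 4.434329319000244) < 1e-9
```

With only five samples, the tail percentiles are interpolations between the two slowest requests rather than independent observations, which is worth keeping in mind when reading the 95th/99th figures.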
nousresearch-meta-llama_4941_v82 status is now deployed due to DeploymentManager action
nousresearch-meta-llama_4941_v82 status is now inactive due to auto deactivation removed underperforming models
admin requested tearing down of nousresearch-meta-llama_4941_v82
Running pipeline stage ISVCDeleter
Checking if service nousresearch-meta-llama-4941-v82 is running
Skipping teardown as no inference service was found
Pipeline stage ISVCDeleter completed in 4.71s
Running pipeline stage MKMLModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key nousresearch-meta-llama-4941-v82/config.json from bucket guanaco-mkml-models
Deleting key nousresearch-meta-llama-4941-v82/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key nousresearch-meta-llama-4941-v82/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key nousresearch-meta-llama-4941-v82/tokenizer.json from bucket guanaco-mkml-models
Deleting key nousresearch-meta-llama-4941-v82/tokenizer_config.json from bucket guanaco-mkml-models
Cleaning model data from model cache
Deleting key nousresearch-meta-llama-4941-v82_reward/config.json from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v82_reward/merges.txt from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v82_reward/reward.tensors from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v82_reward/special_tokens_map.json from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v82_reward/tokenizer.json from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v82_reward/tokenizer_config.json from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v82_reward/vocab.json from bucket guanaco-reward-models
Pipeline stage MKMLModelDeleter completed in 5.50s
nousresearch-meta-llama_4941_v82 status is now torndown due to DeploymentManager action
Usage Metrics / Latency Metrics: chart panels (data not captured in this log).