Running pipeline stage MKMLizer
Starting job with name nousresearch-meta-llama-4941-v77-mkmlizer
Waiting for job on nousresearch-meta-llama-4941-v77-mkmlizer to finish
nousresearch-meta-llama-4941-v76-mkmlizer: Saving model to /tmp/reward_cache/reward.tensors
nousresearch-meta-llama-4941-v76-mkmlizer: Saving duration: 0.449s
nousresearch-meta-llama-4941-v76-mkmlizer: Processed model ChaiML/gpt2_medium_pairwise_60m_step_937500 in 7.159s
nousresearch-meta-llama-4941-v76-mkmlizer: creating bucket guanaco-reward-models
nousresearch-meta-llama-4941-v76-mkmlizer: Bucket 's3://guanaco-reward-models/' created
nousresearch-meta-llama-4941-v76-mkmlizer: uploading /tmp/reward_cache to s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward
nousresearch-meta-llama-4941-v76-mkmlizer: cp /tmp/reward_cache/config.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward/config.json
nousresearch-meta-llama-4941-v76-mkmlizer: cp /tmp/reward_cache/special_tokens_map.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward/special_tokens_map.json
nousresearch-meta-llama-4941-v76-mkmlizer: cp /tmp/reward_cache/tokenizer_config.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward/tokenizer_config.json
nousresearch-meta-llama-4941-v76-mkmlizer: cp /tmp/reward_cache/vocab.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward/vocab.json
nousresearch-meta-llama-4941-v76-mkmlizer: cp /tmp/reward_cache/merges.txt s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward/merges.txt
nousresearch-meta-llama-4941-v76-mkmlizer: cp /tmp/reward_cache/tokenizer.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward/tokenizer.json
nousresearch-meta-llama-4941-v76-mkmlizer: cp /tmp/reward_cache/reward.tensors s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward/reward.tensors
Job nousresearch-meta-llama-4941-v76-mkmlizer completed after 73.92s with status: succeeded
Stopping job with name nousresearch-meta-llama-4941-v76-mkmlizer
Pipeline stage MKMLizer completed in 74.89s
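The upload step above copies each artifact file from the local cache to the bucket individually. A minimal sketch of how such a per-file copy list can be generated (the `build_copy_commands` helper is hypothetical; the paths and filenames mirror the v76 reward upload in the log):

```python
from pathlib import PurePosixPath

def build_copy_commands(local_dir, files, bucket_prefix):
    """Map each local artifact file to an `s3 cp`-style (source, destination) pair."""
    cmds = []
    for name in files:
        src = str(PurePosixPath(local_dir) / name)
        dst = f"{bucket_prefix.rstrip('/')}/{name}"
        cmds.append((src, dst))
    return cmds

# Filenames taken from the v76 reward upload lines above.
artifacts = ["config.json", "special_tokens_map.json", "tokenizer_config.json",
             "vocab.json", "merges.txt", "tokenizer.json", "reward.tensors"]
commands = build_copy_commands(
    "/tmp/reward_cache", artifacts,
    "s3://guanaco-reward-models/nousresearch-meta-llama-4941-v76_reward")
for src, dst in commands:
    print(f"cp {src} {dst}")
```

The real pipeline presumably issues these copies through an S3 client; this sketch only reconstructs the source/destination mapping visible in the log.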
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.14s
Running pipeline stage ISVCDeployer
Creating inference service nousresearch-meta-llama-4941-v76
nousresearch-meta-llama-4941-v77-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
nousresearch-meta-llama-4941-v77-mkmlizer: ║ _____ __ __ ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ /___/ ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ Version: 0.8.14 ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ https://mk1.ai ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ The license key for the current software has been verified as ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ belonging to: ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ Chai Research Corp. ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ Expiration: 2024-07-15 23:59:59 ║
nousresearch-meta-llama-4941-v77-mkmlizer: ║ ║
nousresearch-meta-llama-4941-v77-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Waiting for inference service nousresearch-meta-llama-4941-v76 to be ready
nousresearch-meta-llama-4941-v77-mkmlizer: Downloaded to shared memory in 26.623s
nousresearch-meta-llama-4941-v77-mkmlizer: quantizing model to /dev/shm/model_cache
nousresearch-meta-llama-4941-v77-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Connection pool is full, discarding connection: %s

nousresearch-meta-llama-4941-v77-mkmlizer:
Loading 0: 0%| | 0/291 [00:00<?, ?it/s]
Loading 0: 4%|▍ | 12/291 [00:00<00:02, 114.50it/s]
Loading 0: 8%|▊ | 24/291 [00:00<00:02, 102.08it/s]
Loading 0: 13%|█▎ | 37/291 [00:00<00:02, 113.09it/s]
Loading 0: 17%|█▋ | 49/291 [00:00<00:02, 103.51it/s]
Loading 0: 21%|██ | 60/291 [00:00<00:02, 103.81it/s]
Loading 0: 26%|██▌ | 75/291 [00:00<00:01, 112.34it/s]
Loading 0: 30%|██▉ | 87/291 [00:01<00:03, 60.58it/s]
Loading 0: 35%|███▌ | 102/291 [00:01<00:02, 74.35it/s]
Loading 0: 38%|███▊ | 112/291 [00:01<00:02, 78.65it/s]
Loading 0: 42%|████▏ | 122/291 [00:01<00:02, 82.23it/s]
Loading 0: 46%|████▋ | 135/291 [00:01<00:01, 93.58it/s]
Loading 0: 51%|█████ | 147/291 [00:01<00:01, 97.24it/s]
Loading 0: 54%|█████▍ | 158/291 [00:01<00:01, 93.72it/s]
Loading 0: 59%|█████▉ | 171/291 [00:01<00:01, 102.89it/s]
Loading 0: 63%|██████▎ | 182/291 [00:01<00:01, 101.54it/s]
Loading 0: 66%|██████▋ | 193/291 [00:02<00:01, 57.90it/s]
Loading 0: 70%|██████▉ | 203/291 [00:02<00:01, 64.16it/s]
Loading 0: 74%|███████▍ | 216/291 [00:02<00:00, 76.98it/s]
Loading 0: 78%|███████▊ | 228/291 [00:02<00:00, 84.48it/s]
Loading 0: 82%|████████▏ | 239/291 [00:02<00:00, 83.49it/s]
Loading 0: 87%|████████▋ | 252/291 [00:02<00:00, 94.11it/s]
Loading 0: 91%|█████████ | 264/291 [00:03<00:00, 98.78it/s]
Loading 0: 95%|█████████▍| 275/291 [00:03<00:00, 94.42it/s]
Loading 0: 99%|█████████▊| 287/291 [00:08<00:00, 6.43it/s]
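The `Loading` lines above are a tqdm-style progress bar whose carriage-return updates were captured as separate lines. A rough sketch of how the remaining-time field in `n/total [elapsed<remaining, rate it/s]` can be derived (the function name is illustrative, not from the pipeline's code):

```python
def format_remaining(n, total, rate_its):
    """Estimate remaining time from the current smoothed rate, tqdm-style (mm:ss)."""
    remaining = (total - n) / rate_its  # items left divided by items per second
    m, s = divmod(int(remaining), 60)
    return f"{m:02d}:{s:02d}"

# e.g. the "37/291 ... 113.09it/s" line above implies roughly 00:02 remaining
print(format_remaining(37, 291, 113.09))   # → 00:02
print(format_remaining(287, 291, 6.43))    # → 00:00
```

Note the final line's rate collapses to 6.43it/s: the last few tensors are much larger, which is why the bar stalls near 99%.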
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
nousresearch-meta-llama-4941-v77-mkmlizer: quantized model in 25.261s
nousresearch-meta-llama-4941-v77-mkmlizer: Processed model NousResearch/Meta-Llama-3-8B-Instruct in 51.884s
nousresearch-meta-llama-4941-v77-mkmlizer: creating bucket guanaco-mkml-models
nousresearch-meta-llama-4941-v77-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
nousresearch-meta-llama-4941-v77-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/nousresearch-meta-llama-4941-v77
nousresearch-meta-llama-4941-v77-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/nousresearch-meta-llama-4941-v77/config.json
nousresearch-meta-llama-4941-v77-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/nousresearch-meta-llama-4941-v77/special_tokens_map.json
nousresearch-meta-llama-4941-v77-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/nousresearch-meta-llama-4941-v77/tokenizer_config.json
nousresearch-meta-llama-4941-v77-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/nousresearch-meta-llama-4941-v77/tokenizer.json
nousresearch-meta-llama-4941-v77-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/nousresearch-meta-llama-4941-v77/flywheel_model.0.safetensors
nousresearch-meta-llama-4941-v77-mkmlizer: loading reward model from ChaiML/gpt2_medium_pairwise_60m_step_937500
nousresearch-meta-llama-4941-v77-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:919: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nousresearch-meta-llama-4941-v77-mkmlizer: warnings.warn(
nousresearch-meta-llama-4941-v77-mkmlizer: /opt/conda/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: `resume_download` is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use `force_download=True`.
nousresearch-meta-llama-4941-v77-mkmlizer: warnings.warn(
nousresearch-meta-llama-4941-v77-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:769: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nousresearch-meta-llama-4941-v77-mkmlizer: warnings.warn(
nousresearch-meta-llama-4941-v77-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:468: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
nousresearch-meta-llama-4941-v77-mkmlizer: warnings.warn(
nousresearch-meta-llama-4941-v77-mkmlizer: Saving model to /tmp/reward_cache/reward.tensors
nousresearch-meta-llama-4941-v77-mkmlizer: Saving duration: 0.417s
nousresearch-meta-llama-4941-v77-mkmlizer: Processed model ChaiML/gpt2_medium_pairwise_60m_step_937500 in 8.525s
nousresearch-meta-llama-4941-v77-mkmlizer: creating bucket guanaco-reward-models
nousresearch-meta-llama-4941-v77-mkmlizer: Bucket 's3://guanaco-reward-models/' created
nousresearch-meta-llama-4941-v77-mkmlizer: uploading /tmp/reward_cache to s3://guanaco-reward-models/nousresearch-meta-llama-4941-v77_reward
nousresearch-meta-llama-4941-v77-mkmlizer: cp /tmp/reward_cache/config.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v77_reward/config.json
nousresearch-meta-llama-4941-v77-mkmlizer: cp /tmp/reward_cache/special_tokens_map.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v77_reward/special_tokens_map.json
nousresearch-meta-llama-4941-v77-mkmlizer: cp /tmp/reward_cache/vocab.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v77_reward/vocab.json
nousresearch-meta-llama-4941-v77-mkmlizer: cp /tmp/reward_cache/merges.txt s3://guanaco-reward-models/nousresearch-meta-llama-4941-v77_reward/merges.txt
nousresearch-meta-llama-4941-v77-mkmlizer: cp /tmp/reward_cache/tokenizer_config.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v77_reward/tokenizer_config.json
nousresearch-meta-llama-4941-v77-mkmlizer: cp /tmp/reward_cache/tokenizer.json s3://guanaco-reward-models/nousresearch-meta-llama-4941-v77_reward/tokenizer.json
Job nousresearch-meta-llama-4941-v77-mkmlizer completed after 86.16s with status: succeeded
Stopping job with name nousresearch-meta-llama-4941-v77-mkmlizer
Pipeline stage MKMLizer completed in 86.78s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.16s
Running pipeline stage ISVCDeployer
Creating inference service nousresearch-meta-llama-4941-v77
Waiting for inference service nousresearch-meta-llama-4941-v77 to be ready
Inference service nousresearch-meta-llama-4941-v76 ready after 90.48861455917358s
Pipeline stage ISVCDeployer completed in 97.36s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.834223985671997s
Received healthy response to inference request in 1.1117067337036133s
Received healthy response to inference request in 1.1289067268371582s
Received healthy response to inference request in 1.1106581687927246s
Received healthy response to inference request in 1.1191134452819824s
5 requests
0 failed requests
5th percentile: 1.1108678817749023
10th percentile: 1.1110775947570801
20th percentile: 1.1114970207214356
30th percentile: 1.1131880760192872
40th percentile: 1.1161507606506347
50th percentile: 1.1191134452819824
60th percentile: 1.1230307579040528
70th percentile: 1.126948070526123
80th percentile: 1.2699701786041262
90th percentile: 1.5520970821380615
95th percentile: 1.693160533905029
99th percentile: 1.8060112953186034
mean time: 1.2609218120574952
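The StressChecker percentiles above are consistent with linear interpolation over the five sorted latencies (the default `'linear'` method of `numpy.percentile`); a self-contained sketch reproducing them:

```python
def percentile(sorted_xs, p):
    """Linearly interpolated percentile (numpy's default 'linear' method)."""
    k = (len(sorted_xs) - 1) * p / 100.0
    lo = int(k)  # floor index
    hi = min(lo + 1, len(sorted_xs) - 1)
    frac = k - lo
    return sorted_xs[lo] + frac * (sorted_xs[hi] - sorted_xs[lo])

# The five healthy-response latencies from the first StressChecker run above.
latencies = sorted([1.834223985671997, 1.1117067337036133, 1.1289067268371582,
                    1.1106581687927246, 1.1191134452819824])
p5 = percentile(latencies, 5)           # ≈ 1.11087, matching "5th percentile"
p50 = percentile(latencies, 50)         # the median, ≈ 1.11911
p90 = percentile(latencies, 90)         # ≈ 1.55210, matching "90th percentile"
mean = sum(latencies) / len(latencies)  # ≈ 1.26092, matching "mean time"
```

With only five samples the upper percentiles are dominated by the single slow first request (likely a cold start), so the 99th percentile here carries little statistical weight.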
Pipeline stage StressChecker completed in 8.07s
nousresearch-meta-llama_4941_v76 status is now deployed due to DeploymentManager action
Inference service nousresearch-meta-llama-4941-v77 ready after 100.62141990661621s
Pipeline stage ISVCDeployer completed in 106.44s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.8681678771972656s
Received healthy response to inference request in 1.1621296405792236s
Received healthy response to inference request in 1.1466686725616455s
Received healthy response to inference request in 1.1568152904510498s
Received healthy response to inference request in 1.1584346294403076s
5 requests
0 failed requests
5th percentile: 1.1486979961395263
10th percentile: 1.1507273197174073
20th percentile: 1.154785966873169
30th percentile: 1.1571391582489015
40th percentile: 1.1577868938446045
50th percentile: 1.1584346294403076
60th percentile: 1.159912633895874
70th percentile: 1.1613906383514405
80th percentile: 1.3033372879028322
90th percentile: 1.585752582550049
95th percentile: 1.726960229873657
99th percentile: 1.8399263477325438
mean time: 1.2984432220458983
Pipeline stage StressChecker completed in 7.50s
nousresearch-meta-llama_4941_v77 status is now deployed due to DeploymentManager action
nousresearch-meta-llama_4941_v77 status is now inactive due to auto-deactivation of underperforming models
admin requested tearing down of nousresearch-meta-llama_4941_v77
Running pipeline stage ISVCDeleter
Checking if service nousresearch-meta-llama-4941-v77 is running
Skipping teardown as no inference service was found
Pipeline stage ISVCDeleter completed in 4.33s
Running pipeline stage MKMLModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key nousresearch-meta-llama-4941-v77/config.json from bucket guanaco-mkml-models
Deleting key nousresearch-meta-llama-4941-v77/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key nousresearch-meta-llama-4941-v77/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key nousresearch-meta-llama-4941-v77/tokenizer.json from bucket guanaco-mkml-models
Deleting key nousresearch-meta-llama-4941-v77/tokenizer_config.json from bucket guanaco-mkml-models
Cleaning model data from model cache
Deleting key nousresearch-meta-llama-4941-v77_reward/config.json from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v77_reward/merges.txt from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v77_reward/reward.tensors from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v77_reward/special_tokens_map.json from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v77_reward/tokenizer.json from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v77_reward/tokenizer_config.json from bucket guanaco-reward-models
Deleting key nousresearch-meta-llama-4941-v77_reward/vocab.json from bucket guanaco-reward-models
Pipeline stage MKMLModelDeleter completed in 5.70s
nousresearch-meta-llama_4941_v77 status is now torndown due to DeploymentManager action