Running pipeline stage MKMLizer
Starting job with name google-gemma-2-27b-it-v10-mkmlizer
Waiting for job on google-gemma-2-27b-it-v10-mkmlizer to finish
google-gemma-2-27b-it-v10-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
google-gemma-2-27b-it-v10-mkmlizer: ║ _____ __ __ ║
google-gemma-2-27b-it-v10-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
google-gemma-2-27b-it-v10-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
google-gemma-2-27b-it-v10-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
google-gemma-2-27b-it-v10-mkmlizer: ║ /___/ ║
google-gemma-2-27b-it-v10-mkmlizer: ║ ║
google-gemma-2-27b-it-v10-mkmlizer: ║ Version: 0.9.5.post3 ║
google-gemma-2-27b-it-v10-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
google-gemma-2-27b-it-v10-mkmlizer: ║ https://mk1.ai ║
google-gemma-2-27b-it-v10-mkmlizer: ║ ║
google-gemma-2-27b-it-v10-mkmlizer: ║ The license key for the current software has been verified as ║
google-gemma-2-27b-it-v10-mkmlizer: ║ belonging to: ║
google-gemma-2-27b-it-v10-mkmlizer: ║ ║
google-gemma-2-27b-it-v10-mkmlizer: ║ Chai Research Corp. ║
google-gemma-2-27b-it-v10-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
google-gemma-2-27b-it-v10-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
google-gemma-2-27b-it-v10-mkmlizer: ║ ║
google-gemma-2-27b-it-v10-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
google-gemma-2-27b-it-v10-mkmlizer: Downloaded to shared memory in 69.067s
google-gemma-2-27b-it-v10-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpmdae3hpy, device:0
google-gemma-2-27b-it-v10-mkmlizer: Saving flywheel model at /dev/shm/model_cache
google-gemma-2-27b-it-v10-mkmlizer: quantized model in 62.677s
google-gemma-2-27b-it-v10-mkmlizer: Processed model google/gemma-2-27b-it in 131.744s
google-gemma-2-27b-it-v10-mkmlizer: creating bucket guanaco-mkml-models
google-gemma-2-27b-it-v10-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
google-gemma-2-27b-it-v10-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/google-gemma-2-27b-it-v10
google-gemma-2-27b-it-v10-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/google-gemma-2-27b-it-v10/config.json
google-gemma-2-27b-it-v10-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/google-gemma-2-27b-it-v10/special_tokens_map.json
google-gemma-2-27b-it-v10-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/google-gemma-2-27b-it-v10/tokenizer_config.json
google-gemma-2-27b-it-v10-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/google-gemma-2-27b-it-v10/tokenizer.json
google-gemma-2-27b-it-v10-mkmlizer: cp /dev/shm/model_cache/tokenizer.model s3://guanaco-mkml-models/google-gemma-2-27b-it-v10/tokenizer.model
google-gemma-2-27b-it-v10-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/google-gemma-2-27b-it-v10/flywheel_model.0.safetensors
google-gemma-2-27b-it-v10-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/google-gemma-2-27b-it-v10/flywheel_model.1.safetensors
google-gemma-2-27b-it-v10-mkmlizer: loading reward model from ChaiML/gpt2_xl_pairwise_89m_step_347634
google-gemma-2-27b-it-v10-mkmlizer:
Loading 0: 99%|█████████▉| 502/508 [00:37<00:00, 15.12it/s]
google-gemma-2-27b-it-v10-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:950: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
google-gemma-2-27b-it-v10-mkmlizer: warnings.warn(
google-gemma-2-27b-it-v10-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:778: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
google-gemma-2-27b-it-v10-mkmlizer: warnings.warn(
google-gemma-2-27b-it-v10-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:469: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
google-gemma-2-27b-it-v10-mkmlizer: warnings.warn(
google-gemma-2-27b-it-v10-mkmlizer:
Downloading shards: 100%|██████████| 2/2 [00:13<00:00, 6.71s/it]
google-gemma-2-27b-it-v10-mkmlizer:
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 3.64it/s]
google-gemma-2-27b-it-v10-mkmlizer: Saving model to /tmp/reward_cache/reward.tensors
google-gemma-2-27b-it-v10-mkmlizer: Saving duration: 1.329s
google-gemma-2-27b-it-v10-mkmlizer: Processed model ChaiML/gpt2_xl_pairwise_89m_step_347634 in 18.152s
google-gemma-2-27b-it-v10-mkmlizer: creating bucket guanaco-reward-models
google-gemma-2-27b-it-v10-mkmlizer: Bucket 's3://guanaco-reward-models/' created
google-gemma-2-27b-it-v10-mkmlizer: uploading /tmp/reward_cache to s3://guanaco-reward-models/google-gemma-2-27b-it-v10_reward
google-gemma-2-27b-it-v10-mkmlizer: cp /tmp/reward_cache/config.json s3://guanaco-reward-models/google-gemma-2-27b-it-v10_reward/config.json
google-gemma-2-27b-it-v10-mkmlizer: cp /tmp/reward_cache/special_tokens_map.json s3://guanaco-reward-models/google-gemma-2-27b-it-v10_reward/special_tokens_map.json
google-gemma-2-27b-it-v10-mkmlizer: cp /tmp/reward_cache/tokenizer_config.json s3://guanaco-reward-models/google-gemma-2-27b-it-v10_reward/tokenizer_config.json
google-gemma-2-27b-it-v10-mkmlizer: cp /tmp/reward_cache/merges.txt s3://guanaco-reward-models/google-gemma-2-27b-it-v10_reward/merges.txt
google-gemma-2-27b-it-v10-mkmlizer: cp /tmp/reward_cache/vocab.json s3://guanaco-reward-models/google-gemma-2-27b-it-v10_reward/vocab.json
google-gemma-2-27b-it-v10-mkmlizer: cp /tmp/reward_cache/tokenizer.json s3://guanaco-reward-models/google-gemma-2-27b-it-v10_reward/tokenizer.json
google-gemma-2-27b-it-v10-mkmlizer: cp /tmp/reward_cache/reward.tensors s3://guanaco-reward-models/google-gemma-2-27b-it-v10_reward/reward.tensors
Job google-gemma-2-27b-it-v10-mkmlizer completed after 188.06s with status: succeeded
Stopping job with name google-gemma-2-27b-it-v10-mkmlizer
Pipeline stage MKMLizer completed in 189.25s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.11s
Running pipeline stage ISVCDeployer
Creating inference service google-gemma-2-27b-it-v10
Waiting for inference service google-gemma-2-27b-it-v10 to be ready
Failed to get response for submission undi95-meta-llama-3-70b_6209_v18: ('http://undi95-meta-llama-3-70b-6209-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '{"error":"TypeError : SamplingParameters.__init__() got an unexpected keyword argument \'reward_max_tokens\'"}')
Inference service google-gemma-2-27b-it-v10 ready after 192.05863547325134s
Pipeline stage ISVCDeployer completed in 194.08s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.5325684547424316s
Received healthy response to inference request in 1.9322187900543213s
Received healthy response to inference request in 1.964656114578247s
Received healthy response to inference request in 1.454270839691162s
Received healthy response to inference request in 1.9684293270111084s
5 requests
0 failed requests
5th percentile: 1.549860429763794
10th percentile: 1.6454500198364257
20th percentile: 1.8366291999816895
30th percentile: 1.9387062549591065
40th percentile: 1.9516811847686768
50th percentile: 1.964656114578247
60th percentile: 1.9661653995513917
70th percentile: 1.967674684524536
80th percentile: 2.2812571525573735
90th percentile: 2.9069128036499023
95th percentile: 3.2197406291961665
99th percentile: 3.4700028896331787
mean time: 2.1704287052154543
Pipeline stage StressChecker completed in 11.45s
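Note on the StressChecker figures above: the reported percentiles are consistent with linear interpolation between the sorted latencies, which is the default behaviour of `numpy.percentile`. Whether the pipeline actually uses NumPy is an assumption. A minimal sketch that recomputes the figures from the five logged latencies, using a hypothetical helper named `percentile`:

```python
# Latencies (seconds) copied from the five "Received healthy response" lines.
latencies = [
    3.5325684547424316,
    1.9322187900543213,
    1.964656114578247,
    1.454270839691162,
    1.9684293270111084,
]


def percentile(values, q):
    """Return the q-th percentile (q in 0..100) using linear interpolation
    between the two closest ranks of the sorted values."""
    ordered = sorted(values)
    position = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Same percentile levels that the log reports.
for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only five samples, the 95th and 99th percentiles are interpolated almost entirely from the single slowest request (3.53s), so they say little about tail latency under sustained load.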
google-gemma-2-27b-it_v10 status is now deployed due to DeploymentManager action
google-gemma-2-27b-it_v10 status is now inactive due to auto deactivation removed underperforming models
admin requested tearing down of google-gemma-2-27b-it_v10
Running pipeline stage ISVCDeleter
Checking if service google-gemma-2-27b-it-v10 is running
Tearing down inference service google-gemma-2-27b-it-v10
Service google-gemma-2-27b-it-v10 has been torndown
Pipeline stage ISVCDeleter completed in 5.14s
Running pipeline stage MKMLModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key google-gemma-2-27b-it-v10/config.json from bucket guanaco-mkml-models
Deleting key google-gemma-2-27b-it-v10/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key google-gemma-2-27b-it-v10/flywheel_model.1.safetensors from bucket guanaco-mkml-models
Deleting key google-gemma-2-27b-it-v10/flywheel_model.2.safetensors from bucket guanaco-mkml-models
Deleting key google-gemma-2-27b-it-v10/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key google-gemma-2-27b-it-v10/tokenizer.json from bucket guanaco-mkml-models
Deleting key google-gemma-2-27b-it-v10/tokenizer.model from bucket guanaco-mkml-models
Deleting key google-gemma-2-27b-it-v10/tokenizer_config.json from bucket guanaco-mkml-models
Cleaning model data from model cache
Deleting key google-gemma-2-27b-it-v10_reward/config.json from bucket guanaco-reward-models
Deleting key google-gemma-2-27b-it-v10_reward/merges.txt from bucket guanaco-reward-models
Deleting key google-gemma-2-27b-it-v10_reward/reward.tensors from bucket guanaco-reward-models
Deleting key google-gemma-2-27b-it-v10_reward/special_tokens_map.json from bucket guanaco-reward-models
Deleting key google-gemma-2-27b-it-v10_reward/tokenizer.json from bucket guanaco-reward-models
Deleting key google-gemma-2-27b-it-v10_reward/tokenizer_config.json from bucket guanaco-reward-models
Deleting key google-gemma-2-27b-it-v10_reward/vocab.json from bucket guanaco-reward-models
Pipeline stage MKMLModelDeleter completed in 8.09s
google-gemma-2-27b-it_v10 status is now torndown due to DeploymentManager action