Running pipeline stage MKMLizer
Starting job with name arcee-ai-llama-spark-v1-mkmlizer
Waiting for job on arcee-ai-llama-spark-v1-mkmlizer to finish
arcee-ai-llama-spark-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
arcee-ai-llama-spark-v1-mkmlizer: ║ _____ __ __ ║
arcee-ai-llama-spark-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
arcee-ai-llama-spark-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
arcee-ai-llama-spark-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
arcee-ai-llama-spark-v1-mkmlizer: ║ /___/ ║
arcee-ai-llama-spark-v1-mkmlizer: ║ ║
arcee-ai-llama-spark-v1-mkmlizer: ║ Version: 0.9.9 ║
arcee-ai-llama-spark-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
arcee-ai-llama-spark-v1-mkmlizer: ║ https://mk1.ai ║
arcee-ai-llama-spark-v1-mkmlizer: ║ ║
arcee-ai-llama-spark-v1-mkmlizer: ║ The license key for the current software has been verified as ║
arcee-ai-llama-spark-v1-mkmlizer: ║ belonging to: ║
arcee-ai-llama-spark-v1-mkmlizer: ║ ║
arcee-ai-llama-spark-v1-mkmlizer: ║ Chai Research Corp. ║
arcee-ai-llama-spark-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
arcee-ai-llama-spark-v1-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
arcee-ai-llama-spark-v1-mkmlizer: ║ ║
arcee-ai-llama-spark-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Failed to get response for submission mistralai-mistral-nemo-_9330_v42: ('http://mistralai-mistral-nemo-9330-v42-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'EOF\n')
Failed to get response for submission mistralai-mistral-nemo-_9330_v42: ('http://mistralai-mistral-nemo-9330-v42-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'activator request timeout')
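For context, the failures above come from another submission's predict endpoint and are unrelated to this job. A request against a :predict URL of this form would look roughly like the sketch below; only the URL is taken from the log, while the payload schema, timeout, and error handling are assumptions.

# Hypothetical reproduction of the health-check call that is failing above.
# Only the URL comes from the log; the payload body is a placeholder.
import requests

url = (
    "http://mistralai-mistral-nemo-9330-v42-predictor"
    ".tenant-chaiml-guanaco.k.chaiverse.com"
    "/v1/models/GPT-J-6B-lit-v2:predict"
)
payload = {"text": ["Hello"]}  # placeholder; the real request schema is not shown in the log

try:
    resp = requests.post(url, json=payload, timeout=30)
    resp.raise_for_status()
    print(resp.json())
except requests.RequestException as exc:
    # An abruptly closed connection surfaces as the 'EOF' errors logged above.
    print(f"Failed to get response: {exc}")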
arcee-ai-llama-spark-v1-mkmlizer: quantized model in 28.644s
arcee-ai-llama-spark-v1-mkmlizer: Processed model arcee-ai/Llama-Spark in 75.415s
arcee-ai-llama-spark-v1-mkmlizer: creating bucket guanaco-mkml-models
arcee-ai-llama-spark-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
arcee-ai-llama-spark-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/arcee-ai-llama-spark-v1
arcee-ai-llama-spark-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/arcee-ai-llama-spark-v1/config.json
arcee-ai-llama-spark-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/arcee-ai-llama-spark-v1/tokenizer.json
arcee-ai-llama-spark-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/arcee-ai-llama-spark-v1/flywheel_model.0.safetensors
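The cp lines above copy the converted artifacts from the shared-memory cache into the model bucket. A minimal boto3 sketch of the same uploads is shown below; the bucket, prefix, and file names are taken from the log, while credentials and client configuration are assumed to come from the environment.

# Minimal sketch of the uploads logged above, assuming ambient AWS-style credentials.
import boto3

s3 = boto3.client("s3")
bucket = "guanaco-mkml-models"
prefix = "arcee-ai-llama-spark-v1"

for name in ("config.json", "tokenizer.json", "flywheel_model.0.safetensors"):
    # /dev/shm/model_cache is the local cache path shown in the log
    s3.upload_file(f"/dev/shm/model_cache/{name}", bucket, f"{prefix}/{name}")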
arcee-ai-llama-spark-v1-mkmlizer: loading reward model from ChaiML/gpt2_xl_pairwise_89m_step_347634
arcee-ai-llama-spark-v1-mkmlizer:
Loading 0: 0%| | 0/291 [00:00<?, ?it/s]
Loading 0: 99%|█████████▉| 289/291 [00:14<00:00, 3.21it/s]
arcee-ai-llama-spark-v1-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:957: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
arcee-ai-llama-spark-v1-mkmlizer: warnings.warn(
arcee-ai-llama-spark-v1-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:785: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
arcee-ai-llama-spark-v1-mkmlizer: warnings.warn(
arcee-ai-llama-spark-v1-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:469: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
arcee-ai-llama-spark-v1-mkmlizer: warnings.warn(
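The FutureWarnings above are emitted because the loader still passes use_auth_token. The non-deprecated form of the same call is sketched below; the model name is taken from the log, but the model class and the use of an environment variable for the token are assumptions, since the mkmlizer's actual loading code is not shown.

# Equivalent load using the non-deprecated `token` argument, as the warnings suggest.
import os
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_name = "ChaiML/gpt2_xl_pairwise_89m_step_347634"
hf_token = os.environ.get("HF_TOKEN")  # assumption: access token supplied via environment

tokenizer = AutoTokenizer.from_pretrained(model_name, token=hf_token)
model = AutoModelForSequenceClassification.from_pretrained(model_name, token=hf_token)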
arcee-ai-llama-spark-v1-mkmlizer:
Downloading shards: 0%| | 0/2 [00:00<?, ?it/s]
Downloading shards: 50%|█████ | 1/2 [00:05<00:05, 5.78s/it]
Downloading shards: 100%|██████████| 2/2 [00:08<00:00, 4.15s/it]
Downloading shards: 100%|██████████| 2/2 [00:08<00:00, 4.40s/it]
arcee-ai-llama-spark-v1-mkmlizer:
Loading checkpoint shards: 0%| | 0/2 [00:00<?, ?it/s]
Loading checkpoint shards: 50%|█████ | 1/2 [00:00<00:00, 2.45it/s]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 4.06it/s]
Loading checkpoint shards: 100%|██████████| 2/2 [00:00<00:00, 3.69it/s]
arcee-ai-llama-spark-v1-mkmlizer: Saving model to /tmp/reward_cache/reward.tensors
arcee-ai-llama-spark-v1-mkmlizer: Saving duration: 1.295s
arcee-ai-llama-spark-v1-mkmlizer: Processed model ChaiML/gpt2_xl_pairwise_89m_step_347634 in 13.910s
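The reward weights are serialized to /tmp/reward_cache/reward.tensors before upload. With the safetensors library that step could be approximated as below; the exact serialization the mkmlizer performs is not shown, so this is only a sketch under that assumption.

# Sketch of dumping the reward model's weights to the reward.tensors file logged above.
from safetensors.torch import save_file
from transformers import AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained(
    "ChaiML/gpt2_xl_pairwise_89m_step_347634"
)
# .contiguous() because save_file rejects non-contiguous tensor views
state_dict = {k: v.contiguous() for k, v in model.state_dict().items()}
save_file(state_dict, "/tmp/reward_cache/reward.tensors")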
arcee-ai-llama-spark-v1-mkmlizer: creating bucket guanaco-reward-models
arcee-ai-llama-spark-v1-mkmlizer: Bucket 's3://guanaco-reward-models/' created
arcee-ai-llama-spark-v1-mkmlizer: uploading /tmp/reward_cache to s3://guanaco-reward-models/arcee-ai-llama-spark-v1_reward
arcee-ai-llama-spark-v1-mkmlizer: cp /tmp/reward_cache/config.json s3://guanaco-reward-models/arcee-ai-llama-spark-v1_reward/config.json
arcee-ai-llama-spark-v1-mkmlizer: cp /tmp/reward_cache/special_tokens_map.json s3://guanaco-reward-models/arcee-ai-llama-spark-v1_reward/special_tokens_map.json
arcee-ai-llama-spark-v1-mkmlizer: cp /tmp/reward_cache/tokenizer_config.json s3://guanaco-reward-models/arcee-ai-llama-spark-v1_reward/tokenizer_config.json
arcee-ai-llama-spark-v1-mkmlizer: cp /tmp/reward_cache/merges.txt s3://guanaco-reward-models/arcee-ai-llama-spark-v1_reward/merges.txt
arcee-ai-llama-spark-v1-mkmlizer: cp /tmp/reward_cache/vocab.json s3://guanaco-reward-models/arcee-ai-llama-spark-v1_reward/vocab.json
arcee-ai-llama-spark-v1-mkmlizer: cp /tmp/reward_cache/tokenizer.json s3://guanaco-reward-models/arcee-ai-llama-spark-v1_reward/tokenizer.json
arcee-ai-llama-spark-v1-mkmlizer: cp /tmp/reward_cache/reward.tensors s3://guanaco-reward-models/arcee-ai-llama-spark-v1_reward/reward.tensors
Job arcee-ai-llama-spark-v1-mkmlizer completed after 125.62s with status: succeeded
Stopping job with name arcee-ai-llama-spark-v1-mkmlizer
Pipeline stage MKMLizer completed in 126.49s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.10s
Running pipeline stage ISVCDeployer
Creating inference service arcee-ai-llama-spark-v1
Waiting for inference service arcee-ai-llama-spark-v1 to be ready
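The readiness wait can also be reproduced by hand against the KServe InferenceService resource. A rough equivalent is sketched below; the service name comes from the log, while the namespace and timeout are guesses.

# Rough manual equivalent of waiting for the InferenceService to become Ready.
# Namespace and timeout are assumptions; the service name is from the log.
import subprocess

subprocess.run(
    [
        "kubectl", "wait",
        "--for=condition=Ready",
        "inferenceservice/arcee-ai-llama-spark-v1",
        "--namespace", "tenant-chaiml-guanaco",
        "--timeout=600s",
    ],
    check=True,
)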
Failed to get response for submission mistralai-mistral-nemo-_9330_v42: ('http://mistralai-mistral-nemo-9330-v42-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'activator request timeout')
Failed to get response for submission mistralai-mistral-nemo-_9330_v42: ('http://mistralai-mistral-nemo-9330-v42-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'EOF\n')
Failed to get response for submission mistralai-mistral-nemo-_9330_v42: ('http://mistralai-mistral-nemo-9330-v42-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'dial tcp 127.0.0.1:8080: connect: connection refused\n')
Failed to get response for submission mistralai-mistral-nemo-_9330_v42: ('http://mistralai-mistral-nemo-9330-v42-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'EOF\n')
Inference service arcee-ai-llama-spark-v1 ready after 221.30248880386353s
Pipeline stage ISVCDeployer completed in 223.22s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.9502041339874268s
Received healthy response to inference request in 1.3092930316925049s
Received healthy response to inference request in 1.3389506340026855s
Received healthy response to inference request in 1.023970365524292s
Received healthy response to inference request in 1.45589280128479s
5 requests
0 failed requests
5th percentile: 1.0810348987579346
10th percentile: 1.1380994319915771
20th percentile: 1.2522284984588623
30th percentile: 1.315224552154541
40th percentile: 1.3270875930786132
50th percentile: 1.3389506340026855
60th percentile: 1.3857275009155274
70th percentile: 1.432504367828369
80th percentile: 1.5547550678253175
90th percentile: 1.7524796009063721
95th percentile: 1.8513418674468993
99th percentile: 1.9304316806793214
mean time: 1.4156621932983398
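The latency summary above can be reproduced directly from the five logged response times; with numpy's default linear-interpolation percentiles the figures match the log. This is a small verification sketch, not part of the pipeline itself.

# Reproduces the StressChecker summary from the five logged response times.
import numpy as np

times = np.array([
    1.9502041339874268,
    1.3092930316925049,
    1.3389506340026855,
    1.023970365524292,
    1.45589280128479,
])

for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    # np.percentile defaults to linear interpolation, which matches the logged values
    print(f"{p}th percentile: {np.percentile(times, p)}")
print(f"mean time: {times.mean()}")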
Pipeline stage StressChecker completed in 8.09s
arcee-ai-llama-spark_v1 status is now deployed due to DeploymentManager action
arcee-ai-llama-spark_v1 status is now inactive due to auto deactivation of underperforming models
arcee-ai-llama-spark_v1 status is now torndown due to DeploymentManager action