submission_id: chaiml-lexical-nemo-v4-1k1e5_v2
developer_uid: chai_backend_admin
alignment_samples: 10667
alignment_score: 4.139469839045413
best_of: 4
celo_rating: 1246.47
display_name: chaiml-lexical-nemo-v4-1k1e5_v2
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 0.9, 'top_p': 1.0, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '</s>', '###', 'Bot:', 'User:', 'You:', '<|im_end|>'], 'max_input_tokens': 1024, 'best_of': 4, 'max_output_tokens': 64}
gpu_counts: {'NVIDIA RTX A5000': 1}
is_internal_developer: True
language_model: ChaiML/Lexical-Nemo-v4-1k1e5
latencies: [{'batch_size': 1, 'throughput': 0.6367456261004253, 'latency_mean': 1.5704111194610595, 'latency_p50': 1.5631016492843628, 'latency_p90': 1.736569333076477}, {'batch_size': 5, 'throughput': 1.5572106759444946, 'latency_mean': 3.2011868834495543, 'latency_p50': 3.211469054222107, 'latency_p90': 3.5602865695953367}, {'batch_size': 10, 'throughput': 1.8099513848376085, 'latency_mean': 5.488941376209259, 'latency_p50': 5.508442997932434, 'latency_p90': 6.23165271282196}]
max_input_tokens: 1024
max_output_tokens: 64
model_architecture: MistralForCausalLM
model_group: ChaiML/Lexical-Nemo-v4-1
model_name: chaiml-lexical-nemo-v4-1k1e5_v2
model_num_parameters: 12772070400.0
model_repo: ChaiML/Lexical-Nemo-v4-1k1e5
model_size: 13B
num_battles: 10667
num_wins: 5484
propriety_score: 0.7122593718338399
propriety_total_count: 987.0
ranking_group: single
status: torndown
submission_type: basic
throughput_3p7s: 1.65
timestamp: 2024-08-22T15:25:24+00:00
us_pacific_date: 2024-08-22
win_ratio: 0.5141089340958095
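The reported win_ratio appears to be num_wins divided by num_battles. The sketch below is a sanity check using the values copied from the fields above; the variable names simply mirror those fields, and it is not the leaderboard's own code.

```python
# Values copied from the submission metadata above.
num_battles = 10667
num_wins = 5484

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles

print(f"win_ratio = {win_ratio:.6f}")
```

A ratio slightly above 0.5 means the model won a little more than half of its head-to-head comparisons.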
Running pipeline stage MKMLizer
Starting job with name chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer
Waiting for job on chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer to finish
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ _____ __ __ ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ /___/ ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ Version: 0.10.1 ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ https://mk1.ai ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ belonging to: ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ Chai Research Corp. ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ║ ║
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: Downloaded to shared memory in 78.194s
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp6gmej_ek, device:0
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Failed to get response for submission chaiml-lexical-nemo-v4-1k1e5_v1: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_sehof_2024-08-22: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: quantized model in 40.958s
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: Processed model ChaiML/Lexical-Nemo-v4-1k1e5 in 119.152s
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: creating bucket guanaco-mkml-models
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v2
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v2/config.json
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v2/special_tokens_map.json
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v2/tokenizer_config.json
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v2/tokenizer.json
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v2/flywheel_model.0.safetensors
chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer: Loading 0: 98%|█████████▊| 357/363 [00:19<00:01, 5.01it/s]
Job chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer completed after 142.17s with status: succeeded
Stopping job with name chaiml-lexical-nemo-v4-1k1e5-v2-mkmlizer
Pipeline stage MKMLizer completed in 143.10s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.15s
Running pipeline stage ISVCDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2 to be ready
Failed to get response for submission blend_filor_2024-08-16: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_filor_2024-08-16: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_jidor_2024-08-22: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_jerun_2024-08-22: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_dedat_2024-08-16: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_katim_2024-08-22: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_dedat_2024-08-16: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission chaiml-lexical-nemo-v4-1k1e5_v1: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission chaiml-lexical-nemo-v4-1k1e5_v1: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_lubas_2024-08-22: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission chaiml-lexical-nemo-v4-1k1e5_v1: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Tearing down inference service chaiml-lexical-nemo-v4-1k1e5-v2
%s, retrying in %s seconds...
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2
Ignoring service chaiml-lexical-nemo-v4-1k1e5-v2 already deployed
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2 to be ready
Failed to get response for submission blend_susol_2024-08-22: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_koran_2024-08-16: ('http://zonemercy-lexical-nemo-1518-v18-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission chaiml-lexical-nemo-v4-1k1e5_v1: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission blend_koran_2024-08-16: ('http://mistralai-mixtral-8x7b-3473-v130-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '{"error":"ValueError : [TypeError(\\"\'numpy.int64\' object is not iterable\\"), TypeError(\'vars() argument must have __dict__ attribute\')]"}')
Failed to get response for submission chaiml-lexical-nemo-v4-1k1e5_v1: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission chaiml-lexical-nemo-v4-1k1e5_v1: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Failed to get response for submission chaiml-lexical-nemo-v4-1k1e5_v1: ('http://chaiml-lexical-nemo-v4-1k1e5-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
Tearing down inference service chaiml-lexical-nemo-v4-1k1e5-v2
%s, retrying in %s seconds...
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2 to be ready
Failed to get response for submission blend_remul_2024-08-22: ('http://zonemercy-lexical-nemov8-5966-v2-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:46900->127.0.0.1:8080: read: connection reset by peer\n')
Inference service chaiml-lexical-nemo-v4-1k1e5-v2 ready after 233.22279596328735s
Pipeline stage ISVCDeployer completed in 1208.90s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.250946521759033s
Received healthy response to inference request in 1.9325063228607178s
Received healthy response to inference request in 1.908973217010498s
Received healthy response to inference request in 1.833200216293335s
Received healthy response to inference request in 2.497621774673462s
5 requests
0 failed requests
5th percentile: 1.8483548164367676
10th percentile: 1.8635094165802002
20th percentile: 1.8938186168670654
30th percentile: 1.913679838180542
40th percentile: 1.9230930805206299
50th percentile: 1.9325063228607178
60th percentile: 2.059882402420044
70th percentile: 2.18725848197937
80th percentile: 2.3002815723419188
90th percentile: 2.3989516735076903
95th percentile: 2.4482867240905763
99th percentile: 2.4877547645568847
mean time: 2.084649610519409
Pipeline stage StressChecker completed in 11.30s
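The StressChecker percentiles above are consistent with linear interpolation between the sorted request latencies. The sketch below reproduces them from the five logged response times. The `percentile` helper is a hypothetical name written for this illustration, and the interpolation method is inferred from the numbers, not confirmed by the pipeline's source.

```python
def percentile(sorted_values, q):
    """Linearly interpolated percentile for q in [0, 100] (illustrative helper)."""
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * fraction


# The five healthy response times logged by StressChecker, in seconds.
latencies = sorted([
    2.250946521759033,
    1.9325063228607178,
    1.908973217010498,
    1.833200216293335,
    2.497621774673462,
])

mean_time = sum(latencies) / len(latencies)
p5 = percentile(latencies, 5)
p50 = percentile(latencies, 50)
p90 = percentile(latencies, 90)

print(f"mean={mean_time:.4f}s p5={p5:.4f}s p50={p50:.4f}s p90={p90:.4f}s")
```

With only five samples, the upper percentiles are dominated by the single slowest request, so they are a rough health check and not a stable latency estimate.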
chaiml-lexical-nemo-v4-1k1e5_v2 status is now deployed due to DeploymentManager action
chaiml-lexical-nemo-v4-1k1e5_v2 status is now inactive due to auto deactivation removed underperforming models
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.32s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
Tearing down inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerTemplater completed in 0.22s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Ignoring service chaiml-lexical-nemo-v4-1k1e5-v2-profiler already deployed
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerTemplater completed in 0.22s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerTemplater completed in 0.07s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Ignoring service chaiml-lexical-nemo-v4-1k1e5-v2-profiler already deployed
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLProfilerTemplater completed in 0.08s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Ignoring service chaiml-lexical-nemo-v4-1k1e5-v2-profiler already deployed
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
Retrying (%r) after connection broken by '%r': %s
Retrying (%r) after connection broken by '%r': %s
Retrying (%r) after connection broken by '%r': %s
Tearing down inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Retrying (%r) after connection broken by '%r': %s
Retrying (%r) after connection broken by '%r': %s
Retrying (%r) after connection broken by '%r': %s
%s, retrying in %s seconds...
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Ignoring service chaiml-lexical-nemo-v4-1k1e5-v2-profiler already deployed
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
Inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler ready after 10.04883360862732s
Pipeline stage MKMLProfilerDeployer completed in 28.41s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
script pods %s
Pipeline stage MKMLProfilerRunner completed in 0.48s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-lexical-nemo-v4-1k1e5-v2-profiler is running
Tearing down inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
%s, retrying in %s seconds...
Checking if service chaiml-lexical-nemo-v4-1k1e5-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 3.28s
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
Inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler ready after 150.36959552764893s
Pipeline stage MKMLProfilerDeployer completed in 150.80s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
script pods %s
Pipeline stage MKMLProfilerRunner completed in 0.46s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-lexical-nemo-v4-1k1e5-v2-profiler is running
Tearing down inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Service chaiml-lexical-nemo-v4-1k1e5-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 1.69s
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Ignoring service chaiml-lexical-nemo-v4-1k1e5-v2-profiler already deployed
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
Inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler ready after 10.050390243530273s
Pipeline stage MKMLProfilerDeployer completed in 10.74s
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Ignoring service chaiml-lexical-nemo-v4-1k1e5-v2-profiler already deployed
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
Inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler ready after 10.05111813545227s
Pipeline stage MKMLProfilerDeployer completed in 10.65s
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Ignoring service chaiml-lexical-nemo-v4-1k1e5-v2-profiler already deployed
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
Inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler ready after 10.249851942062378s
Pipeline stage MKMLProfilerDeployer completed in 11.51s
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-lexical-nemo-v4-1k1e5-v2-profiler is running
Skipping teardown as no inference service was found
Pipeline stage MKMLProfilerDeleter completed in 1.72s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler to be ready
Inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler ready after 150.3602945804596s
Pipeline stage MKMLProfilerDeployer completed in 150.86s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-lexical-nemo-fee25837140152244ada6359023bf212-deplol9bqk:/code/chaiverse_profiler_1725336717 --namespace tenant-chaiml-guanaco
kubectl exec -it chaiml-lexical-nemo-fee25837140152244ada6359023bf212-deplol9bqk --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1725336717 && chmod +x profiles.py && python profiles.py profile --best_of_n 4 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 1024 --output_tokens 64 --summary /code/chaiverse_profiler_1725336717/summary.json'
kubectl exec -it chaiml-lexical-nemo-fee25837140152244ada6359023bf212-deplol9bqk --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1725336717/summary.json'
Pipeline stage MKMLProfilerRunner completed in 557.69s
cleanup pipeline after completion
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-lexical-nemo-v4-1k1e5-v2-profiler is running
Tearing down inference service chaiml-lexical-nemo-v4-1k1e5-v2-profiler
Service chaiml-lexical-nemo-v4-1k1e5-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 1.73s
chaiml-lexical-nemo-v4-1k1e5_v2 status is now torndown due to DeploymentManager action
