submission_id: chaiml-lexical-nemo-v4-1k1e5_v3
developer_uid: chai_backend_admin
best_of: 8
celo_rating: 1254.88
display_name: chaiml-lexical-nemo-v4-1k1e5_v3
family_friendly_score: 0.5504632704514739
family_friendly_standard_error: 0.002343745545326314
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 0.9, 'top_p': 1.0, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '</s>', '###', 'Bot:', 'User:', 'You:', '<|im_end|>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
gpu_counts: {'NVIDIA RTX A5000': 1}
is_internal_developer: True
language_model: ChaiML/Lexical-Nemo-v4-1k1e5
latencies: [{'batch_size': 1, 'throughput': 0.6166793636091756, 'latency_mean': 1.6215294754505158, 'latency_p50': 1.6303291320800781, 'latency_p90': 1.8143576383590698}, {'batch_size': 3, 'throughput': 1.0772203833725655, 'latency_mean': 2.7778813636302946, 'latency_p50': 2.7828482389450073, 'latency_p90': 3.0345614671707155}, {'batch_size': 5, 'throughput': 1.237692099170083, 'latency_mean': 4.023833891153336, 'latency_p50': 4.056706666946411, 'latency_p90': 4.497558355331421}, {'batch_size': 6, 'throughput': 1.251324957587887, 'latency_mean': 4.76826495885849, 'latency_p50': 4.775757789611816, 'latency_p90': 5.364778208732605}, {'batch_size': 8, 'throughput': 1.2559258037842993, 'latency_mean': 6.330206587314605, 'latency_p50': 6.370905518531799, 'latency_p90': 7.146477222442627}, {'batch_size': 10, 'throughput': 1.2073274119799124, 'latency_mean': 8.235195668935775, 'latency_p50': 8.304767489433289, 'latency_p90': 9.296771812438966}]
max_input_tokens: 1024
max_output_tokens: 64
model_architecture: MistralForCausalLM
model_group: ChaiML/Lexical-Nemo-v4-1
model_name: chaiml-lexical-nemo-v4-1k1e5_v3
model_num_parameters: 12772070400.0
model_repo: ChaiML/Lexical-Nemo-v4-1k1e5
model_size: 13B
num_battles: 7806335
num_wins: 3927774
ranking_group: single
status: inactive
submission_type: basic
throughput_3p7s: 1.21
timestamp: 2024-09-13T19:12:08+00:00
us_pacific_date: 2024-09-13
win_ratio: 0.5031521194004613
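Note: the `formatter` entry above only lists the template strings. The sketch below shows one plausible way they combine into a single model input. The assembly order (memory, then prompt, then chat turns, then the response stub) and the `build_prompt` helper are assumptions for illustration, not the platform's confirmed implementation.

```python
# Template strings copied from the `formatter` metadata entry.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, messages):
    """Hypothetical helper: join the templates in an assumed order.

    `messages` is a list of (sender, text) pairs, where sender is
    "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for sender, text in messages:
        if sender == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=text)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=text)
            )
    # The response stub leaves the bot's turn open for the model to complete.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


if __name__ == "__main__":
    print(
        build_prompt(
            bot_name="Ava",
            user_name="Sam",
            memory="A friendly guide.",
            prompt="Ava greets Sam.",
            messages=[("user", "Hi"), ("bot", "Hello!"), ("user", "How are you?")],
        )
    )
```

Because every chat turn ends in a newline and `'\n'` is among the `stopping_words`, generation under this layout would stop at the end of the bot's single line.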
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer
Waiting for job on chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer to finish
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ _____ __ __ ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ /___/ ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ Version: 0.11.12 ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ https://mk1.ai ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ belonging to: ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ Chai Research Corp. ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ║ ║
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: Downloaded to shared memory in 75.014s
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpbuavujv4, device:0
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: quantized model in 39.164s
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: Processed model ChaiML/Lexical-Nemo-v4-1k1e5 in 114.178s
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: creating bucket guanaco-mkml-models
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v3
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v3/config.json
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v3/special_tokens_map.json
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v3/tokenizer_config.json
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v3/tokenizer.json
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-lexical-nemo-v4-1k1e5-v3/flywheel_model.0.safetensors
chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer: Loading 0: 98%|█████████▊| 357/363 [00:18<00:01, 5.13it/s]
Job chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer completed after 137.65s with status: succeeded
Stopping job with name chaiml-lexical-nemo-v4-1k1e5-v3-mkmlizer
Pipeline stage MKMLizer completed in 138.63s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.25s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v3
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v3 to be ready
Inference service chaiml-lexical-nemo-v4-1k1e5-v3 ready after 30.477820873260498s
Pipeline stage MKMLDeployer completed in 31.25s
run pipeline stage %s
Running pipeline stage StressChecker
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 3.1270289421081543s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.1571156978607178s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 1.9745268821716309s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 2.423691987991333s
HTTP Request: %s %s "%s %d %s"
Received healthy response to inference request in 1.9651601314544678s
5 requests
0 failed requests
5th percentile: 1.9670334815979005
10th percentile: 1.968906831741333
20th percentile: 1.9726535320281982
30th percentile: 2.011044645309448
40th percentile: 2.084080171585083
50th percentile: 2.1571156978607178
60th percentile: 2.263746213912964
70th percentile: 2.37037672996521
80th percentile: 2.5643593788146974
90th percentile: 2.845694160461426
95th percentile: 2.98636155128479
99th percentile: 3.0988954639434816
mean time: 2.3295047283172607
Pipeline stage StressChecker completed in 13.62s
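Note: the StressChecker percentiles above are consistent with linear interpolation between the sorted response times. The sketch below recomputes them from the five logged latencies. The `percentile` helper is written here for illustration; which library the pipeline actually uses is not stated in the log.

```python
# The five response times reported by the StressChecker stage, in seconds.
LATENCIES = [
    3.1270289421081543,
    2.1571156978607178,
    1.9745268821716309,
    2.423691987991333,
    1.9651601314544678,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("percentile() requires at least one value")
    # Fractional position of the requested percentile in the sorted list.
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


if __name__ == "__main__":
    for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
        print(f"{q}th percentile: {percentile(LATENCIES, q)}")
    print(f"mean time: {sum(LATENCIES) / len(LATENCIES)}")
```

With only five samples, the upper percentiles are dominated by the single slowest request (about 3.13 s), so they say little about tail latency under real load.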
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 1.68s
Shutdown handler de-registered
chaiml-lexical-nemo-v4-1k1e5_v3 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-lexical-nemo-v4-1k1e5-v3-profiler
Waiting for inference service chaiml-lexical-nemo-v4-1k1e5-v3-profiler to be ready
Inference service chaiml-lexical-nemo-v4-1k1e5-v3-profiler ready after 180.60158324241638s
Pipeline stage MKMLProfilerDeployer completed in 181.05s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/chaiml-lexical-nemo-051da947b039ce0348a4a7774f925e94-deplo772tp:/code/chaiverse_profiler_1726255156 --namespace tenant-chaiml-guanaco
kubectl exec -it chaiml-lexical-nemo-051da947b039ce0348a4a7774f925e94-deplo772tp --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1726255156 && python profiles.py profile --best_of_n 8 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 1024 --output_tokens 64 --summary /code/chaiverse_profiler_1726255156/summary.json'
kubectl exec -it chaiml-lexical-nemo-051da947b039ce0348a4a7774f925e94-deplo772tp --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1726255156/summary.json'
Pipeline stage MKMLProfilerRunner completed in 1162.31s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service chaiml-lexical-nemo-v4-1k1e5-v3-profiler is running
Tearing down inference service chaiml-lexical-nemo-v4-1k1e5-v3-profiler
Service chaiml-lexical-nemo-v4-1k1e5-v3-profiler has been torn down
Pipeline stage MKMLProfilerDeleter completed in 1.98s
Shutdown handler de-registered
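Note: the per-stage timings in this log can be pulled out mechanically. The sketch below assumes the exact `Pipeline stage <name> completed in <seconds>s` wording seen above. The `stage_durations` helper and the sample list are illustrative and not part of the pipeline itself.

```python
import re

# Matches lines such as "Pipeline stage MKMLizer completed in 138.63s".
STAGE_RE = re.compile(r"^Pipeline stage (\w+) completed in ([\d.]+)s$")


def stage_durations(lines):
    """Return (stage_name, seconds) pairs in the order they appear."""
    durations = []
    for line in lines:
        match = STAGE_RE.match(line.strip())
        if match:
            durations.append((match.group(1), float(match.group(2))))
    return durations


# Completion lines copied from the deployment run in this log.
SAMPLE_LOG = [
    "Pipeline stage MKMLizer completed in 138.63s",
    "Pipeline stage MKMLTemplater completed in 0.25s",
    "Pipeline stage MKMLDeployer completed in 31.25s",
    "Pipeline stage StressChecker completed in 13.62s",
    "Pipeline stage TriggerMKMLProfilingPipeline completed in 1.68s",
]

if __name__ == "__main__":
    stages = stage_durations(SAMPLE_LOG)
    for name, seconds in stages:
        print(f"{name}: {seconds:.2f}s")
    print(f"total: {sum(seconds for _, seconds in stages):.2f}s")
```

Run over the profiling pipeline's lines as well, the same helper would show that MKMLProfilerRunner (1162.31s) accounts for most of the elapsed time.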