submission_id: wendyhoang-mistral-merged_v5
developer_uid: chai_backend_admin
status: inactive
model_repo: WendyHoang/mistral-merged
reward_repo: WendyHoang/reward-model
generation_params: {'temperature': 1.2, 'top_p': 1.0, 'top_k': 20, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '</s>', '<|im_end|>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '<|im_start|>system\n{memory}<|im_end|>\n', 'prompt_template': '<|im_start|>user\n{prompt}<|im_end|>\n', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{user_name}: {message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:'}
timestamp: 2024-01-02T14:29:33+00:00
model_name: test
safety_score: 0.99
entertaining: None
stay_in_character: None
user_preference: None
double_thumbs_up: 3011
thumbs_up: 5153
thumbs_down: 2458
num_battles: 86207
num_wins: 40918
win_ratio: 0.4746482304221235
celo_rating: 1137.96
Running pipeline stage MKMLizer
Starting job with name wendyhoang-mistral-merged-v5-mkmlizer
Waiting for job on wendyhoang-mistral-merged-v5-mkmlizer to finish
Stopping job with name wendyhoang-mistral-merged-v5-mkmlizer
%s, retrying in %s seconds...
Starting job with name wendyhoang-mistral-merged-v5-mkmlizer
Waiting for job on wendyhoang-mistral-merged-v5-mkmlizer to finish
wendyhoang-mistral-merged-v5-mkmlizer: Copyright 2023 MK ONE TECHNOLOGIES Inc.
wendyhoang-mistral-merged-v5-mkmlizer: The license key for the current software has been verified as belonging to:
wendyhoang-mistral-merged-v5-mkmlizer: Chai Research Corp.
wendyhoang-mistral-merged-v5-mkmlizer: Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f
wendyhoang-mistral-merged-v5-mkmlizer: Expiration: 2024-04-15 23:59:59
wendyhoang-mistral-merged-v5-mkmlizer: loading model from WendyHoang/mistral-merged
wendyhoang-mistral-merged-v5-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:1067: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
wendyhoang-mistral-merged-v5-mkmlizer: warnings.warn(
wendyhoang-mistral-merged-v5-mkmlizer: config.json: 100%|██████████| 595/595 [00:00<00:00, 6.93MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:690: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
wendyhoang-mistral-merged-v5-mkmlizer: warnings.warn(
wendyhoang-mistral-merged-v5-mkmlizer: tokenizer_config.json: 100%|██████████| 675k/675k [00:00<00:00, 12.5MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: tokenizer.model: 100%|██████████| 493k/493k [00:00<00:00, 58.1MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: tokenizer.json: 100%|██████████| 1.80M/1.80M [00:00<00:00, 121MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: added_tokens.json: 100%|██████████| 28.0/28.0 [00:00<00:00, 233kB/s]
wendyhoang-mistral-merged-v5-mkmlizer: special_tokens_map.json: 100%|██████████| 563/563 [00:00<00:00, 4.59MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:472: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
wendyhoang-mistral-merged-v5-mkmlizer: warnings.warn(
wendyhoang-mistral-merged-v5-mkmlizer: model.safetensors.index.json: 100%|██████████| 22.8k/22.8k [00:00<00:00, 153MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: model-00001-of-00002.safetensors: 100%|█████████▉| 9.94G/9.94G [00:08<00:00, 1.21GB/s]
wendyhoang-mistral-merged-v5-mkmlizer: model-00002-of-00002.safetensors: 100%|█████████▉| 4.54G/4.54G [00:04<00:00, 1.13GB/s]
wendyhoang-mistral-merged-v5-mkmlizer: Downloading shards: 100%|██████████| 2/2 [00:12<00:00, 6.42s/it]
wendyhoang-mistral-merged-v5-mkmlizer: Loading checkpoint shards: 100%|██████████| 2/2 [00:02<00:00, 1.02s/it]
wendyhoang-mistral-merged-v5-mkmlizer: loaded model in 16.643s
wendyhoang-mistral-merged-v5-mkmlizer: saved to disk in 27.119s
wendyhoang-mistral-merged-v5-mkmlizer: quantizing model to /tmp/model_cache
wendyhoang-mistral-merged-v5-mkmlizer: Saving mkml model at /tmp/model_cache
wendyhoang-mistral-merged-v5-mkmlizer: Reading /tmp/tmpef5z8e1l/model.safetensors.index.json
wendyhoang-mistral-merged-v5-mkmlizer: Profiling: 100%|██████████| 291/291 [01:24<00:00, 3.46it/s]
wendyhoang-mistral-merged-v5-mkmlizer: quantized model in 97.258s
wendyhoang-mistral-merged-v5-mkmlizer: Processed model WendyHoang/mistral-merged in 141.022s
wendyhoang-mistral-merged-v5-mkmlizer: creating bucket guanaco-mkml-models
wendyhoang-mistral-merged-v5-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
wendyhoang-mistral-merged-v5-mkmlizer: uploading /tmp/model_cache to s3://guanaco-mkml-models/wendyhoang-mistral-merged-v5
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/model_cache/config.json s3://guanaco-mkml-models/wendyhoang-mistral-merged-v5/config.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/model_cache/special_tokens_map.json s3://guanaco-mkml-models/wendyhoang-mistral-merged-v5/special_tokens_map.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/model_cache/added_tokens.json s3://guanaco-mkml-models/wendyhoang-mistral-merged-v5/added_tokens.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/model_cache/tokenizer_config.json s3://guanaco-mkml-models/wendyhoang-mistral-merged-v5/tokenizer_config.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/model_cache/tokenizer.model s3://guanaco-mkml-models/wendyhoang-mistral-merged-v5/tokenizer.model
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/model_cache/tokenizer.json s3://guanaco-mkml-models/wendyhoang-mistral-merged-v5/tokenizer.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/model_cache/mkml_model.tensors s3://guanaco-mkml-models/wendyhoang-mistral-merged-v5/mkml_model.tensors
wendyhoang-mistral-merged-v5-mkmlizer: loading reward model from WendyHoang/reward-model
wendyhoang-mistral-merged-v5-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/configuration_auto.py:1067: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
wendyhoang-mistral-merged-v5-mkmlizer: warnings.warn(
wendyhoang-mistral-merged-v5-mkmlizer: config.json: 100%|██████████| 968/968 [00:00<00:00, 11.4MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/tokenization_auto.py:690: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
wendyhoang-mistral-merged-v5-mkmlizer: warnings.warn(
wendyhoang-mistral-merged-v5-mkmlizer: tokenizer_config.json: 100%|██████████| 477/477 [00:00<00:00, 5.99MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: vocab.json: 100%|██████████| 798k/798k [00:00<00:00, 88.3MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: merges.txt: 100%|██████████| 456k/456k [00:00<00:00, 6.88MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: tokenizer.json: 100%|██████████| 2.11M/2.11M [00:00<00:00, 34.3MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: special_tokens_map.json: 100%|██████████| 131/131 [00:00<00:00, 2.01MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: /opt/conda/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py:472: FutureWarning: The `use_auth_token` argument is deprecated and will be removed in v5 of Transformers. Please use `token` instead.
wendyhoang-mistral-merged-v5-mkmlizer: warnings.warn(
wendyhoang-mistral-merged-v5-mkmlizer: model.safetensors: 100%|█████████▉| 498M/498M [00:01<00:00, 289MB/s]
wendyhoang-mistral-merged-v5-mkmlizer: Saving model to /tmp/reward_cache/reward.tensors
wendyhoang-mistral-merged-v5-mkmlizer: Saving duration: 0.108s
wendyhoang-mistral-merged-v5-mkmlizer: Processed model WendyHoang/reward-model in 3.757s
wendyhoang-mistral-merged-v5-mkmlizer: creating bucket guanaco-reward-models
wendyhoang-mistral-merged-v5-mkmlizer: Bucket 's3://guanaco-reward-models/' created
wendyhoang-mistral-merged-v5-mkmlizer: uploading /tmp/reward_cache to s3://guanaco-reward-models/wendyhoang-mistral-merged-v5_reward
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/reward_cache/tokenizer_config.json s3://guanaco-reward-models/wendyhoang-mistral-merged-v5_reward/tokenizer_config.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/reward_cache/config.json s3://guanaco-reward-models/wendyhoang-mistral-merged-v5_reward/config.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/reward_cache/special_tokens_map.json s3://guanaco-reward-models/wendyhoang-mistral-merged-v5_reward/special_tokens_map.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/reward_cache/merges.txt s3://guanaco-reward-models/wendyhoang-mistral-merged-v5_reward/merges.txt
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/reward_cache/vocab.json s3://guanaco-reward-models/wendyhoang-mistral-merged-v5_reward/vocab.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/reward_cache/tokenizer.json s3://guanaco-reward-models/wendyhoang-mistral-merged-v5_reward/tokenizer.json
wendyhoang-mistral-merged-v5-mkmlizer: cp /tmp/reward_cache/reward.tensors s3://guanaco-reward-models/wendyhoang-mistral-merged-v5_reward/reward.tensors
Job wendyhoang-mistral-merged-v5-mkmlizer completed after 175.66s with status: succeeded
Stopping job with name wendyhoang-mistral-merged-v5-mkmlizer
Pipeline stage MKMLizer completed in 182.81s
Running pipeline stage MKMLKubeTemplater
Pipeline stage MKMLKubeTemplater completed in 0.16s
Running pipeline stage ISVCDeployer
Creating inference service wendyhoang-mistral-merged-v5
Waiting for inference service wendyhoang-mistral-merged-v5 to be ready
Inference service wendyhoang-mistral-merged-v5 ready after 161.47411727905273s
Pipeline stage ISVCDeployer completed in 169.70s
Running pipeline stage StressChecker
Received no response to inference request from service (repeated 46 times)
Received healthy response to inference request with status code 200 in 2.6057841777801514s
Received no response to inference request from service
Received no response to inference request from service
Received healthy response to inference request with status code 200 in 2.4109933376312256s
Received no response to inference request from service
Received healthy response to inference request with status code 200 in 2.3474459648132324s
Received healthy response to inference request with status code 200 in 2.323291301727295s
Received healthy response to inference request with status code 200 in 2.3568899631500244s
Received healthy response to inference request with status code 200 in 1.534233570098877s
Received healthy response to inference request with status code 200 in 2.480989456176758s
Received healthy response to inference request with status code 200 in 1.5419700145721436s
Received healthy response to inference request with status code 200 in 2.398277521133423s
Received healthy response to inference request with status code 200 in 2.327228546142578s
Received healthy response to inference request with status code 200 in 1.5861167907714844s
Received healthy response to inference request with status code 200 in 1.1745727062225342s
Received healthy response to inference request with status code 200 in 1.679786205291748s
Received healthy response to inference request with status code 200 in 1.503847360610962s
Received healthy response to inference request with status code 200 in 1.3803460597991943s
Received healthy response to inference request with status code 200 in 1.500518798828125s
Received healthy response to inference request with status code 200 in 2.3523335456848145s
Received healthy response to inference request with status code 200 in 1.492746114730835s
Received healthy response to inference request with status code 200 in 2.363161563873291s
Received healthy response to inference request with status code 200 in 1.5215468406677246s
Received healthy response to inference request with status code 200 in 1.575585126876831s
Received healthy response to inference request with status code 200 in 1.1546638011932373s
Received healthy response to inference request with status code 200 in 1.5259606838226318s
Received healthy response to inference request with status code 200 in 1.5291211605072021s
Received healthy response to inference request with status code 200 in 1.549919605255127s
Received healthy response to inference request with status code 200 in 1.523350715637207s
Received healthy response to inference request with status code 200 in 1.608680009841919s
Received healthy response to inference request with status code 200 in 1.5170860290527344s
Received healthy response to inference request with status code 200 in 1.552306890487671s
Received healthy response to inference request with status code 200 in 1.4896066188812256s
Received healthy response to inference request with status code 200 in 1.5504150390625s
Received healthy response to inference request with status code 200 in 1.5295798778533936s
Received healthy response to inference request with status code 200 in 1.5346519947052002s
Received healthy response to inference request with status code 200 in 1.5470914840698242s
Received healthy response to inference request with status code 200 in 1.5293045043945312s
Received healthy response to inference request with status code 200 in 1.5448474884033203s
Received healthy response to inference request with status code 200 in 1.5212104320526123s
Received healthy response to inference request with status code 200 in 1.5332067012786865s
Received healthy response to inference request with status code 200 in 1.3603589534759521s
Received healthy response to inference request with status code 200 in 1.509936809539795s
Received healthy response to inference request with status code 200 in 1.4571614265441895s
Received healthy response to inference request with status code 200 in 1.3912162780761719s
Received healthy response to inference request with status code 200 in 1.1946592330932617s
Received healthy response to inference request with status code 200 in 1.5586893558502197s
Received healthy response to inference request with status code 200 in 1.5284481048583984s
Received healthy response to inference request with status code 200 in 1.4783375263214111s
Received healthy response to inference request with status code 200 in 1.523427963256836s
Received healthy response to inference request with status code 200 in 1.5394489765167236s
Received healthy response to inference request with status code 200 in 1.535815715789795s
Received healthy response to inference request with status code 200 in 1.3933188915252686s
Received healthy response to inference request with status code 200 in 1.4736590385437012s
100 requests
49 failed requests
5th percentile: 1.390672767162323
10th percentile: 1.488479709625244
20th percentile: 1.5254541397094725
30th percentile: 1.5412137031555175
40th percentile: 1.6513437271118163
50th percentile: 2.5433868169784546
60th percentile: 20.03762526512146
70th percentile: 20.040033650398254
80th percentile: 20.043456840515137
90th percentile: 20.076034021377563
95th percentile: 20.082874143123625
99th percentile: 20.100342042446133
mean time: 10.676417648792267
%s, retrying in %s seconds...
Received healthy response to inference request with status code 200 in 1.660050868988037s
Received healthy response to inference request with status code 200 in 1.440544605255127s
Received healthy response to inference request with status code 200 in 1.4196367263793945s
Received healthy response to inference request with status code 200 in 1.395059585571289s
Received healthy response to inference request with status code 200 in 1.3984544277191162s
Received healthy response to inference request with status code 200 in 1.45884108543396s
Received healthy response to inference request with status code 200 in 1.4063942432403564s
Received healthy response to inference request with status code 200 in 1.4543862342834473s
Received healthy response to inference request with status code 200 in 1.4177677631378174s
Received healthy response to inference request with status code 200 in 1.4268927574157715s
Received healthy response to inference request with status code 200 in 1.47426176071167s
Received healthy response to inference request with status code 200 in 1.4629323482513428s
Received healthy response to inference request with status code 200 in 1.2849624156951904s
Received healthy response to inference request with status code 200 in 1.4634912014007568s
Received healthy response to inference request with status code 200 in 1.4566280841827393s
Received healthy response to inference request with status code 200 in 1.4784343242645264s
Received healthy response to inference request with status code 200 in 1.471085786819458s
Received healthy response to inference request with status code 200 in 1.4534094333648682s
Received healthy response to inference request with status code 200 in 1.4869768619537354s
Received healthy response to inference request with status code 200 in 1.3060781955718994s
Received healthy response to inference request with status code 200 in 1.4532830715179443s
Received healthy response to inference request with status code 200 in 1.5278007984161377s
Received healthy response to inference request with status code 200 in 1.5020148754119873s
Received healthy response to inference request with status code 200 in 1.587172031402588s
Received healthy response to inference request with status code 200 in 1.610905408859253s
Received healthy response to inference request with status code 200 in 1.4270997047424316s
Received healthy response to inference request with status code 200 in 1.5950636863708496s
Received healthy response to inference request with status code 200 in 1.5709199905395508s
Received healthy response to inference request with status code 200 in 1.4958481788635254s
Received healthy response to inference request with status code 200 in 1.5100305080413818s
Received healthy response to inference request with status code 200 in 1.5166771411895752s
Received healthy response to inference request with status code 200 in 1.5282552242279053s
Received healthy response to inference request with status code 200 in 1.4686450958251953s
Received healthy response to inference request with status code 200 in 1.5041146278381348s
Received healthy response to inference request with status code 200 in 1.4959027767181396s
Received healthy response to inference request with status code 200 in 1.526353359222412s
Received healthy response to inference request with status code 200 in 1.4226901531219482s
Received healthy response to inference request with status code 200 in 1.5047128200531006s
Received healthy response to inference request with status code 200 in 1.4736449718475342s
Received healthy response to inference request with status code 200 in 4.0563764572143555s
Received healthy response to inference request with status code 200 in 1.4977517127990723s
Received healthy response to inference request with status code 200 in 1.5208148956298828s
Received healthy response to inference request with status code 200 in 3.059746503829956s
Received healthy response to inference request with status code 200 in 1.543022632598877s
Received healthy response to inference request with status code 200 in 2.2178401947021484s
Received healthy response to inference request with status code 200 in 1.519545078277588s
Received healthy response to inference request with status code 200 in 1.4635539054870605s
Received healthy response to inference request with status code 200 in 1.5359899997711182s
Received healthy response to inference request with status code 200 in 1.3726849555969238s
Received healthy response to inference request with status code 200 in 1.522517204284668s
Received healthy response to inference request with status code 200 in 1.5225238800048828s
Received healthy response to inference request with status code 200 in 1.368877649307251s
Received healthy response to inference request with status code 200 in 1.4511706829071045s
Received healthy response to inference request with status code 200 in 1.2655854225158691s
Received healthy response to inference request with status code 200 in 2.2631888389587402s
Received healthy response to inference request with status code 200 in 1.4957916736602783s
Received healthy response to inference request with status code 200 in 1.3204336166381836s
Received healthy response to inference request with status code 200 in 1.5130727291107178s
Received healthy response to inference request with status code 200 in 2.0238399505615234s
Received healthy response to inference request with status code 200 in 1.5043766498565674s
Received healthy response to inference request with status code 200 in 1.563661813735962s
Received healthy response to inference request with status code 200 in 1.0621812343597412s
Received healthy response to inference request with status code 200 in 1.0551090240478516s
Received healthy response to inference request with status code 200 in 1.4771807193756104s
Received healthy response to inference request with status code 200 in 1.496523141860962s
Received healthy response to inference request with status code 200 in 1.4783480167388916s
Received healthy response to inference request with status code 200 in 1.5026464462280273s
Received healthy response to inference request with status code 200 in 1.5215859413146973s
Received healthy response to inference request with status code 200 in 1.500882625579834s
Received healthy response to inference request with status code 200 in 1.4900681972503662s
Received healthy response to inference request with status code 200 in 1.5075187683105469s
Received healthy response to inference request with status code 200 in 1.5432510375976562s
Received healthy response to inference request with status code 200 in 1.5481853485107422s
Received healthy response to inference request with status code 200 in 1.4953083992004395s
Received healthy response to inference request with status code 200 in 1.6525275707244873s
Received healthy response to inference request with status code 200 in 1.5021347999572754s
Received healthy response to inference request with status code 200 in 1.509303092956543s
Received healthy response to inference request with status code 200 in 1.5450592041015625s
Received healthy response to inference request with status code 200 in 1.5109384059906006s
Connection pool is full, discarding connection: %s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.5631756782531738s
Connection pool is full, discarding connection: %s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.6353185176849365s
Connection pool is full, discarding connection: %s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.5488567352294922s
Received healthy response to inference request with status code 200 in 1.5777881145477295s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.5392420291900635s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.5902953147888184s
Received healthy response to inference request with status code 200 in 1.5516643524169922s
Received healthy response to inference request with status code 200 in 1.5101211071014404s
Connection pool is full, discarding connection: %s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.5095922946929932s
Connection pool is full, discarding connection: %s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.5629353523254395s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.4673116207122803s
Connection pool is full, discarding connection: %s
Received healthy response to inference request with status code 200 in 1.5031328201293945s
Received healthy response to inference request with status code 200 in 1.5394046306610107s
Received healthy response to inference request with status code 200 in 1.5479557514190674s
Received healthy response to inference request with status code 200 in 1.5239698886871338s
Received healthy response to inference request with status code 200 in 1.5471322536468506s
Received healthy response to inference request with status code 200 in 1.5128824710845947s
Received healthy response to inference request with status code 200 in 1.5188932418823242s
Received healthy response to inference request with status code 200 in 1.514794111251831s
Received healthy response to inference request with status code 200 in 1.5056743621826172s
Received healthy response to inference request with status code 200 in 1.5867807865142822s
100 requests
0 failed requests
5th percentile: 1.3197158455848694
10th percentile: 1.4056002616882324
20th percentile: 1.4541908740997314
30th percentile: 1.4740767240524293
40th percentile: 1.496274995803833
50th percentile: 1.5051935911178589
60th percentile: 1.5155473232269288
70th percentile: 1.527937126159668
80th percentile: 1.5483196258544922
90th percentile: 1.5907721519470215
95th percentile: 1.6782403230667104
99th percentile: 3.0697128033638053
mean time: 1.5489546298980712
Pipeline stage StressChecker completed in 1241.86s
Running pipeline stage SafetyScorer
Pipeline stage SafetyScorer completed in 39.09s
Running pipeline stage MEvalScorer
Running M-Eval for topic stay_in_character
Pipeline stage MEvalScorer completed in 382.01s
wendyhoang-mistral-merged_v5 status is now inactive due to auto deactivation (removed underperforming models)
wendyhoang-mistral-merged_v5 status is now deployed due to admin request
wendyhoang-mistral-merged_v5 status is now inactive due to auto deactivation (removed underperforming models)
