developer_uid: cloudyu
submission_id: cloudyu-chaiml-nemo-dpo-v9_v1
model_name: cloudyu-chaiml-nemo-dpo-v9_v1
model_group: cloudyu/ChaiML-Nemo-DPO-
status: inactive
timestamp: 2024-11-24T15:30:18+00:00
num_battles: 19698
num_wins: 9933
celo_rating: 1264.68
family_friendly_score: 0.5554
family_friendly_standard_error: 0.007027529295563271
submission_type: basic
model_repo: cloudyu/ChaiML-Nemo-DPO-V9
model_architecture: MistralForCausalLM
model_num_parameters: 12772070400.0
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
latencies: [{'batch_size': 1, 'throughput': 0.6174712678661568, 'latency_mean': 1.6194457936286926, 'latency_p50': 1.6079362630844116, 'latency_p90': 1.7781689405441283}, {'batch_size': 3, 'throughput': 1.1473942827347627, 'latency_mean': 2.6024447429180144, 'latency_p50': 2.5845041275024414, 'latency_p90': 2.913749766349792}, {'batch_size': 5, 'throughput': 1.382339727574858, 'latency_mean': 3.599818688631058, 'latency_p50': 3.62924325466156, 'latency_p90': 4.059407162666321}, {'batch_size': 6, 'throughput': 1.4668420449875825, 'latency_mean': 4.067454805374146, 'latency_p50': 4.065704584121704, 'latency_p90': 4.5287316799163815}, {'batch_size': 8, 'throughput': 1.5162207385162092, 'latency_mean': 5.235783632993698, 'latency_p50': 5.195023536682129, 'latency_p90': 5.955360007286072}, {'batch_size': 10, 'throughput': 1.5431993423503862, 'latency_mean': 6.424742347002029, 'latency_p50': 6.453848600387573, 'latency_p90': 7.2897117137908936}]
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: cloudyu-chaiml-nemo-dpo-v9_v1
is_internal_developer: False
language_model: cloudyu/ChaiML-Nemo-DPO-V9
model_size: 13B
ranking_group: single
throughput_3p7s: 1.41
us_pacific_date: 2024-11-24
win_ratio: 0.5042643923240938
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
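The `formatter` templates above can be read as a recipe for building the model input. A minimal sketch follows, assuming the pieces are concatenated in the order persona, prompt, chat turns, response prefix. The function `build_prompt`, the sample names and messages, and the concatenation order are hypothetical and not confirmed by this log. Truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the `formatter` entry in the metadata above.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Assemble one input string (hypothetical helper, assumed ordering).

    turns is a list of (speaker, message) pairs, speaker being "bot" or "user".
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response prefix leaves the bot's next line open for the model to fill.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Illustrative conversation; none of these values come from the log.
example = build_prompt(
    bot_name="Ava",
    user_name="Sam",
    memory="A friendly guide.",
    prompt="Ava greets a visitor.",
    turns=[("user", "Hi!"), ("bot", "Hello, Sam."), ("user", "How are you?")],
)
print(example)
```

Since `generation_params` lists `'\n'` as the only stopping word, a prompt ending in `{bot_name}:` would presumably yield a single chat line per completion.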
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer
Waiting for job on cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer to finish
Connection pool is full, discarding connection: %s. Connection pool size: %s
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ flywheel ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ Version: 0.11.12 ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ https://mk1.ai ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ The license key for the current software has been verified as ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ belonging to: ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ Chai Research Corp. ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ║ ║
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: Downloaded to shared memory in 49.235s
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp4khtjf50, device:0
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: quantized model in 36.287s
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: Processed model cloudyu/ChaiML-Nemo-DPO-V9 in 85.522s
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: creating bucket guanaco-mkml-models
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/cloudyu-chaiml-nemo-dpo-v9-v1
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/cloudyu-chaiml-nemo-dpo-v9-v1/tokenizer_config.json
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/cloudyu-chaiml-nemo-dpo-v9-v1/tokenizer.json
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/cloudyu-chaiml-nemo-dpo-v9-v1/flywheel_model.0.safetensors
cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer: Loading 0: 99%|█████████▉| 361/363 [00:15<00:00, 28.62it/s]
Job cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer completed after 114.62s with status: succeeded
Stopping job with name cloudyu-chaiml-nemo-dpo-v9-v1-mkmlizer
Pipeline stage MKMLizer completed in 115.10s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service cloudyu-chaiml-nemo-dpo-v9-v1
Waiting for inference service cloudyu-chaiml-nemo-dpo-v9-v1 to be ready
Inference service cloudyu-chaiml-nemo-dpo-v9-v1 ready after 120.44383001327515s
Pipeline stage MKMLDeployer completed in 120.98s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.270944595336914s
Received healthy response to inference request in 1.7498230934143066s
Received healthy response to inference request in 1.9935097694396973s
Received healthy response to inference request in 1.6040928363800049s
Received healthy response to inference request in 2.0988802909851074s
5 requests
0 failed requests
5th percentile: 1.6332388877868653
10th percentile: 1.6623849391937255
20th percentile: 1.7206770420074462
30th percentile: 1.7985604286193848
40th percentile: 1.896035099029541
50th percentile: 1.9935097694396973
60th percentile: 2.0356579780578614
70th percentile: 2.0778061866760256
80th percentile: 2.1332931518554688
90th percentile: 2.2021188735961914
95th percentile: 2.2365317344665527
99th percentile: 2.264062023162842
mean time: 1.943450117111206
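The percentile and mean figures above are consistent with linear interpolation between the sorted samples. The sketch below recomputes them from the five response times logged by the StressChecker stage. The helper name `percentile` is hypothetical, and that the pipeline uses this exact method is an assumption based only on the numbers matching.

```python
def percentile(samples, q):
    """Percentile q (0-100) using linear interpolation between order statistics."""
    ordered = sorted(samples)
    pos = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# The five "healthy response" times from the log, in seconds.
latencies = [
    2.270944595336914,
    1.7498230934143066,
    1.9935097694396973,
    1.6040928363800049,
    2.0988802909851074,
]

mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, the tail percentiles are interpolated between the two slowest requests rather than observed directly, so they are best read as rough indicators.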
Pipeline stage StressChecker completed in 11.10s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.31s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.24s
Shutdown handler de-registered
cloudyu-chaiml-nemo-dpo-v9_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 2780.47s
Shutdown handler de-registered
cloudyu-chaiml-nemo-dpo-v9_v1 status is now inactive due to auto deactivation removed underperforming models