developer_uid: cloudyu
submission_id: cloudyu-nemo-dpo-v10_v2
model_name: cloudyu-nemo-dpo-v10_v1
model_group: cloudyu/Nemo-DPO-v10
status: torndown
timestamp: 2024-12-03T13:52:46+00:00
num_battles: 14290
num_wins: 6818
celo_rating: 1247.39
family_friendly_score: 0.5824
family_friendly_standard_error: 0.0069743851341892505
submission_type: basic
model_repo: cloudyu/Nemo-DPO-v10
model_architecture: MistralForCausalLM
model_num_parameters: 12772070400.0
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
latencies: [{'batch_size': 1, 'throughput': 0.6274242884438985, 'latency_mean': 1.5937504088878631, 'latency_p50': 1.5865322351455688, 'latency_p90': 1.7632481336593628}, {'batch_size': 3, 'throughput': 1.1658285733923646, 'latency_mean': 2.56971217751503, 'latency_p50': 2.5751755237579346, 'latency_p90': 2.8604968309402463}, {'batch_size': 5, 'throughput': 1.4024299802149327, 'latency_mean': 3.5412581396102905, 'latency_p50': 3.5444256067276, 'latency_p90': 3.910311794281006}, {'batch_size': 6, 'throughput': 1.5015585497361135, 'latency_mean': 3.9700499296188356, 'latency_p50': 3.974389433860779, 'latency_p90': 4.474990963935852}, {'batch_size': 8, 'throughput': 1.565687257547906, 'latency_mean': 5.067323036193848, 'latency_p50': 5.1213037967681885, 'latency_p90': 5.722669839859009}, {'batch_size': 10, 'throughput': 1.626762118386902, 'latency_mean': 6.106212030649186, 'latency_p50': 6.03664243221283, 'latency_p90': 7.021939754486084}]
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: cloudyu-nemo-dpo-v10_v1
is_internal_developer: False
language_model: cloudyu/Nemo-DPO-v10
model_size: 13B
ranking_group: single
throughput_3p7s: 1.45
us_pacific_date: 2024-12-03
win_ratio: 0.47711686494051786
generation_params: {'temperature': 0.99, 'top_p': 0.99, 'min_p': 0.01, 'top_k': 80, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
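The formatter record above lists five string templates. The following is a minimal sketch of how they might be combined into a single prompt, assuming they are filled with `str.format` and concatenated in the order memory, prompt, chat turns, response prefix. The function name `build_prompt`, its arguments, and the `(speaker, message)` history layout are hypothetical; token truncation to `max_input_tokens` is not modeled.

```python
# Templates copied from the formatter record; everything else is illustrative.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, history):
    """Assemble one prompt string (hypothetical helper).

    history is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". The concatenation order is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in history:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response prefix leaves the bot's next turn open for generation.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


example = build_prompt(
    bot_name="Ada",
    user_name="Sam",
    memory="A patient tutor.",
    prompt="Ada greets Sam.",
    history=[("user", "Hi"), ("bot", "Hello")],
)
print(example)
```

Because `stopping_words` is `['\n']` in the generation parameters, a prompt ending in the open `{bot_name}:` prefix would yield a single-line reply under this reading.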
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name cloudyu-nemo-dpo-v10-v2-mkmlizer
Waiting for job on cloudyu-nemo-dpo-v10-v2-mkmlizer to finish
Connection pool is full, discarding connection: %s. Connection pool size: %s
cloudyu-nemo-dpo-v10-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ _____ __ __ ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ /___/ ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ Version: 0.11.12 ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ The license key for the current software has been verified as ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ belonging to: ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ Chai Research Corp. ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ║ ║
cloudyu-nemo-dpo-v10-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Connection pool is full, discarding connection: %s. Connection pool size: %s
cloudyu-nemo-dpo-v10-v2-mkmlizer: Downloaded to shared memory in 49.178s
cloudyu-nemo-dpo-v10-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpin_mbm0p, device:0
cloudyu-nemo-dpo-v10-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
cloudyu-nemo-dpo-v10-v2-mkmlizer: creating bucket guanaco-mkml-models
cloudyu-nemo-dpo-v10-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
cloudyu-nemo-dpo-v10-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/cloudyu-nemo-dpo-v10-v2
cloudyu-nemo-dpo-v10-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/cloudyu-nemo-dpo-v10-v2/config.json
cloudyu-nemo-dpo-v10-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/cloudyu-nemo-dpo-v10-v2/special_tokens_map.json
cloudyu-nemo-dpo-v10-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/cloudyu-nemo-dpo-v10-v2/tokenizer_config.json
cloudyu-nemo-dpo-v10-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/cloudyu-nemo-dpo-v10-v2/tokenizer.json
cloudyu-nemo-dpo-v10-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/cloudyu-nemo-dpo-v10-v2/flywheel_model.0.safetensors
cloudyu-nemo-dpo-v10-v2-mkmlizer: Loading 0: 100%|█████████▉| 362/363 [00:14<00:00, 29.91it/s]
Job cloudyu-nemo-dpo-v10-v2-mkmlizer completed after 115.41s with status: succeeded
Stopping job with name cloudyu-nemo-dpo-v10-v2-mkmlizer
Pipeline stage MKMLizer completed in 115.91s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service cloudyu-nemo-dpo-v10-v2
Waiting for inference service cloudyu-nemo-dpo-v10-v2 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service cloudyu-nemo-dpo-v10-v2 ready after 140.50946140289307s
Pipeline stage MKMLDeployer completed in 140.99s
run pipeline stage %s
Running pipeline stage StressChecker
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPConnectionPool(host='', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.0334861278533936s
Received healthy response to inference request in 3.244379997253418s
Received healthy response to inference request in 1.6629459857940674s
Received healthy response to inference request in 2.0443761348724365s
5 requests
1 failed requests
5th percentile: 1.7370540142059325
10th percentile: 1.811162042617798
20th percentile: 1.9593780994415284
30th percentile: 2.035664129257202
40th percentile: 2.0400201320648192
50th percentile: 2.0443761348724365
60th percentile: 2.5243776798248287
70th percentile: 3.0043792247772214
80th percentile: 6.620438241958621
90th percentile: 13.37255473136902
95th percentile: 16.748612976074217
99th percentile: 19.44945957183838
mean time: 5.821971893310547
%s, retrying in %s seconds...
Received healthy response to inference request in 1.6117355823516846s
Received healthy response to inference request in 1.7391116619110107s
Received healthy response to inference request in 1.7156381607055664s
Received healthy response to inference request in 2.151975631713867s
Received healthy response to inference request in 2.0302817821502686s
5 requests
0 failed requests
5th percentile: 1.632516098022461
10th percentile: 1.6532966136932372
20th percentile: 1.69485764503479
30th percentile: 1.7203328609466553
40th percentile: 1.729722261428833
50th percentile: 1.7391116619110107
60th percentile: 1.855579710006714
70th percentile: 1.972047758102417
80th percentile: 2.054620552062988
90th percentile: 2.1032980918884276
95th percentile: 2.1276368618011476
99th percentile: 2.1471078777313233
mean time: 1.8497485637664794
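The percentile figures in the second stress-check run are consistent with linear interpolation between sorted samples. The sketch below reproduces them from the five logged latencies; the helper name `percentile` is hypothetical, and the interpolation rule is inferred from the numbers rather than confirmed by the log.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between the two nearest sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Latencies (seconds) of the five healthy responses in the second run.
latencies = [
    1.6117355823516846,
    1.7391116619110107,
    1.7156381607055664,
    2.151975631713867,
    2.0302817821502686,
]

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only five samples, the upper percentiles are dominated by the single slowest request, which is why the first run (one 20-second timeout) reports a 99th percentile near 19.4 s.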
Pipeline stage StressChecker completed in 41.14s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.12s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 2.09s
Shutdown handler de-registered
cloudyu-nemo-dpo-v10_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3169.05s
Shutdown handler de-registered
cloudyu-nemo-dpo-v10_v2 status is now inactive due to auto deactivation removed underperforming models
cloudyu-nemo-dpo-v10_v2 status is now torndown due to DeploymentManager action