developer_uid: udbhavbamba
submission_id: udbhavbamba-nemo13b-10k-sft_v1
model_name: udbhavbamba-nemo13b-10k-sft_v1
model_group: udbhavbamba/nemo13b_10k_
status: torndown
timestamp: 2025-03-22T16:20:03+00:00
num_battles: 8684
num_wins: 4029
celo_rating: 1255.8
family_friendly_score: 0.5764
family_friendly_standard_error: 0.006988033199692171
submission_type: basic
model_repo: udbhavbamba/nemo13b_10k_sft
model_architecture: MistralForCausalLM
model_num_parameters: 12772070400.0
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
latencies: [{'batch_size': 1, 'throughput': 0.5997211418741993, 'latency_mean': 1.6673712754249572, 'latency_p50': 1.6576422452926636, 'latency_p90': 1.8447005033493042}, {'batch_size': 3, 'throughput': 1.098509926252649, 'latency_mean': 2.718135472536087, 'latency_p50': 2.7155778408050537, 'latency_p90': 2.9981100082397463}, {'batch_size': 5, 'throughput': 1.323187726010934, 'latency_mean': 3.7547078692913054, 'latency_p50': 3.7585729360580444, 'latency_p90': 4.178810453414917}, {'batch_size': 6, 'throughput': 1.3938217592649402, 'latency_mean': 4.275747827291489, 'latency_p50': 4.296986937522888, 'latency_p90': 4.802404618263244}, {'batch_size': 8, 'throughput': 1.4495844810121152, 'latency_mean': 5.482124934196472, 'latency_p50': 5.548986554145813, 'latency_p90': 6.177094125747681}, {'batch_size': 10, 'throughput': 1.4929061726236723, 'latency_mean': 6.649401196241379, 'latency_p50': 6.652804017066956, 'latency_p90': 7.515755629539489}]
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: udbhavbamba-nemo13b-10k-sft_v1
is_internal_developer: False
language_model: udbhavbamba/nemo13b_10k_sft
model_size: 13B
ranking_group: single
throughput_3p7s: 1.32
us_pacific_date: 2025-03-22
win_ratio: 0.4639567019806541
generation_params: {'temperature': 1.0, 'top_p': 0.9, 'min_p': 0.1, 'top_k': 40, 'presence_penalty': 0.2, 'frequency_penalty': 0.0, 'stopping_words': ['</s>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "<s>{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}', 'bot_template': '{bot_name}: {message}</s>', 'user_template': '[INST]{user_name}: {message}[/INST]', 'response_template': '{bot_name}:', 'truncate_by_message': False}
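The `formatter` field above reads as a set of Python `str.format` templates. A minimal sketch of how they might be combined into one prompt string follows. The assembly order (memory, prompt, chat turns, response cue), the `build_prompt` helper, and the sample names are assumptions made for illustration; the log does not show how the serving stack joins the pieces or how `truncate_by_message` and `max_input_tokens` are applied.

```python
# Templates copied from the `formatter` field in the metadata above.
FORMATTER = {
    "memory_template": "<s>{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}",
    "bot_template": "{bot_name}: {message}</s>",
    "user_template": "[INST]{user_name}: {message}[/INST]",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper. Assumed order: memory, prompt, turns, response cue."""
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response cue leaves the bot's name open for the model to complete.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


# Sample values are invented for illustration only.
example = build_prompt(
    bot_name="Ava",
    user_name="Sam",
    memory="A friendly guide.",
    prompt="",
    turns=[("user", "Hi"), ("bot", "Hello!"), ("user", "How are you?")],
)
print(example)
```

Under these assumptions each bot turn ends with `</s>`, which is also the single entry in `stopping_words`, so generation would stop at the end of one bot message.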
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name udbhavbamba-nemo13b-10k-sft-v1-mkmlizer
Waiting for job on udbhavbamba-nemo13b-10k-sft-v1-mkmlizer to finish
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: Downloaded to shared memory in 44.876s
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpzpnsp_gj, device:0
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: quantized model in 35.431s
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: Processed model udbhavbamba/nemo13b_10k_sft in 80.307s
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: creating bucket guanaco-mkml-models
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/udbhavbamba-nemo13b-10k-sft-v1
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/udbhavbamba-nemo13b-10k-sft-v1/config.json
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/udbhavbamba-nemo13b-10k-sft-v1/special_tokens_map.json
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/udbhavbamba-nemo13b-10k-sft-v1/tokenizer_config.json
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/udbhavbamba-nemo13b-10k-sft-v1/tokenizer.json
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/udbhavbamba-nemo13b-10k-sft-v1/flywheel_model.0.safetensors
udbhavbamba-nemo13b-10k-sft-v1-mkmlizer: Loading 0: 99%|█████████▊| 358/363 [00:14<00:00, 30.61it/s]
Job udbhavbamba-nemo13b-10k-sft-v1-mkmlizer completed after 113.5s with status: succeeded
Stopping job with name udbhavbamba-nemo13b-10k-sft-v1-mkmlizer
Pipeline stage MKMLizer completed in 113.98s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service udbhavbamba-nemo13b-10k-sft-v1
Waiting for inference service udbhavbamba-nemo13b-10k-sft-v1 to be ready
Inference service udbhavbamba-nemo13b-10k-sft-v1 ready after 70.60409164428711s
Pipeline stage MKMLDeployer completed in 71.18s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.5903327465057373s
Received healthy response to inference request in 1.567274808883667s
Received healthy response to inference request in 1.9285895824432373s
Received healthy response to inference request in 1.9628188610076904s
Received healthy response to inference request in 1.7139835357666016s
5 requests
0 failed requests
5th percentile: 1.596616554260254
10th percentile: 1.625958299636841
20th percentile: 1.6846417903900146
30th percentile: 1.7569047451019286
40th percentile: 1.842747163772583
50th percentile: 1.9285895824432373
60th percentile: 1.9422812938690186
70th percentile: 1.9559730052948
80th percentile: 2.0883216381073
90th percentile: 2.3393271923065186
95th percentile: 2.4648299694061278
99th percentile: 2.5652321910858156
mean time: 1.9525999069213866
Pipeline stage StressChecker completed in 11.16s
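The StressChecker percentiles above are consistent with linear interpolation between the sorted samples of the five logged response times. The sketch below recomputes them with the standard library only. The `percentile` helper is hypothetical and the interpolation rule is inferred from the logged numbers, not taken from the pipeline's source.

```python
# Response times (seconds) copied from the five "healthy response" lines above.
latencies = [
    2.5903327465057373,
    1.567274808883667,
    1.9285895824432373,
    1.9628188610076904,
    1.7139835357666016,
]


def percentile(values, q):
    """Percentile q (0-100) using linear interpolation between sorted samples."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


levels = [5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99]
summary = {q: percentile(latencies, q) for q in levels}
mean_time = sum(latencies) / len(latencies)

for q, value in summary.items():
    print(f"{q}th percentile: {value}")
print(f"mean time: {mean_time}")
```

With only five samples, every percentile above the 75th is an interpolation between the two slowest requests, so the tail figures say little about real tail latency.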
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.70s
Shutdown handler de-registered
udbhavbamba-nemo13b-10k-sft_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 2529.58s
Shutdown handler de-registered
udbhavbamba-nemo13b-10k-sft_v1 status is now inactive due to auto deactivation removed underperforming models
udbhavbamba-nemo13b-10k-sft_v1 status is now torndown due to DeploymentManager action