submission_id: cgato-nemo-12b-thespice-_8696_v1
developer_uid: c.gato
best_of: 8
celo_rating: 1258.14
display_name: cgato-nemo-12b-thespice-_8696_v1
family_friendly_score: 0.5714
family_friendly_standard_error: 0.006998600431514861
formatter: {'memory_template': '<|im_start|>system\n{memory}<|im_end|>\n', 'prompt_template': '<|im_start|>user\n{prompt}<|im_end|>\n', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{user_name}: {message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
gpu_counts: {'NVIDIA RTX A5000': 1}
is_internal_developer: False
language_model: cgato/Nemo-12b-TheSpice-V0.9-All-v2-KTO-v0.3-BigBoy-20k-2
latencies: [{'batch_size': 1, 'throughput': 0.6071262393057666, 'latency_mean': 1.6470407831668854, 'latency_p50': 1.6502286195755005, 'latency_p90': 1.808981227874756}, {'batch_size': 3, 'throughput': 1.1048651155727112, 'latency_mean': 2.7113428592681883, 'latency_p50': 2.714448571205139, 'latency_p90': 3.0282235145568848}, {'batch_size': 5, 'throughput': 1.3275566921129613, 'latency_mean': 3.7437720000743866, 'latency_p50': 3.745645761489868, 'latency_p90': 4.16625063419342}, {'batch_size': 6, 'throughput': 1.4051403040619144, 'latency_mean': 4.24191912651062, 'latency_p50': 4.26549756526947, 'latency_p90': 4.699995660781861}, {'batch_size': 8, 'throughput': 1.4779819754203793, 'latency_mean': 5.390749918222427, 'latency_p50': 5.3414546251297, 'latency_p90': 6.100323581695557}, {'batch_size': 10, 'throughput': 1.4999727528276352, 'latency_mean': 6.616509690284729, 'latency_p50': 6.645908236503601, 'latency_p90': 7.563496017456054}]
max_input_tokens: 1024
max_output_tokens: 64
model_architecture: MistralForCausalLM
model_group: cgato/Nemo-12b-TheSpice-
model_name: cgato-nemo-12b-thespice-_8696_v1
model_num_parameters: 12772121600.0
model_repo: cgato/Nemo-12b-TheSpice-V0.9-All-v2-KTO-v0.3-BigBoy-20k-2
model_size: 13B
num_battles: 14574
num_wins: 7488
ranking_group: single
status: torndown
submission_type: basic
throughput_3p7s: 1.33
timestamp: 2024-11-12T13:41:57+00:00
us_pacific_date: 2024-11-12
win_ratio: 0.5137916838205022
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name cgato-nemo-12b-thespice-8696-v1-mkmlizer
Waiting for job on cgato-nemo-12b-thespice-8696-v1-mkmlizer to finish
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ _____ __ __ ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ /___/ ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ Version: 0.11.33 ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ https://mk1.ai ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ The license key for the current software has been verified as ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ belonging to: ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ Chai Research Corp. ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ║ ║
cgato-nemo-12b-thespice-8696-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Connection pool is full, discarding connection: %s. Connection pool size: %s
Failed to get response for submission rinen0721-mistral12b-exp5_v4: ('http://rinen0721-mistral12b-exp5-v4-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'request timeout')
cgato-nemo-12b-thespice-8696-v1-mkmlizer: Downloaded to shared memory in 47.044s
cgato-nemo-12b-thespice-8696-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpdlwi0k4r, device:0
cgato-nemo-12b-thespice-8696-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
cgato-nemo-12b-thespice-8696-v1-mkmlizer: quantized model in 36.107s
cgato-nemo-12b-thespice-8696-v1-mkmlizer: Processed model cgato/Nemo-12b-TheSpice-V0.9-All-v2-KTO-v0.3-BigBoy-20k-2 in 83.151s
cgato-nemo-12b-thespice-8696-v1-mkmlizer: creating bucket guanaco-mkml-models
cgato-nemo-12b-thespice-8696-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8696-v1/config.json
cgato-nemo-12b-thespice-8696-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8696-v1/special_tokens_map.json
cgato-nemo-12b-thespice-8696-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8696-v1/tokenizer_config.json
cgato-nemo-12b-thespice-8696-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8696-v1/tokenizer.json
cgato-nemo-12b-thespice-8696-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/cgato-nemo-12b-thespice-8696-v1/flywheel_model.0.safetensors
cgato-nemo-12b-thespice-8696-v1-mkmlizer: Loading 0: 100%|█████████▉| 362/363 [00:15<00:00, 28.76it/s]
Failed to get response for submission rica40325-10-14dpo_v2: ('http://rica40325-10-14dpo-v2-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Job cgato-nemo-12b-thespice-8696-v1-mkmlizer completed after 114.44s with status: succeeded
Stopping job with name cgato-nemo-12b-thespice-8696-v1-mkmlizer
Pipeline stage MKMLizer completed in 115.02s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.18s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service cgato-nemo-12b-thespice-8696-v1
Waiting for inference service cgato-nemo-12b-thespice-8696-v1 to be ready
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 5.259059906005859s
Received healthy response to inference request in 5.528363466262817s
Received healthy response to inference request in 3.579012393951416s
Received healthy response to inference request in 5.420649290084839s
Received healthy response to inference request in 6.1713762283325195s
5 requests
0 failed requests
5th percentile: 3.9150218963623047
10th percentile: 4.251031398773193
20th percentile: 4.923050403594971
30th percentile: 5.291377782821655
40th percentile: 5.356013536453247
50th percentile: 5.420649290084839
60th percentile: 5.46373496055603
70th percentile: 5.506820631027222
80th percentile: 5.656966018676758
90th percentile: 5.914171123504639
95th percentile: 6.042773675918579
99th percentile: 6.145655717849731
mean time: 5.191692256927491
%s, retrying in %s seconds...
Received healthy response to inference request in 5.603778839111328s
Received healthy response to inference request in 3.1446893215179443s
Received healthy response to inference request in 6.504237651824951s
Received healthy response to inference request in 4.3795225620269775s
Received healthy response to inference request in 5.45273494720459s
5 requests
0 failed requests
5th percentile: 3.391655969619751
10th percentile: 3.6386226177215577
20th percentile: 4.132555913925171
30th percentile: 4.5941650390625
40th percentile: 5.023449993133545
50th percentile: 5.45273494720459
60th percentile: 5.5131525039672855
70th percentile: 5.57357006072998
80th percentile: 5.783870601654053
90th percentile: 6.144054126739502
95th percentile: 6.324145889282226
99th percentile: 6.468219299316406
mean time: 5.016992664337158
%s, retrying in %s seconds...
Received healthy response to inference request in 5.139496326446533s
Received healthy response to inference request in 5.051681280136108s
Received healthy response to inference request in 3.4042160511016846s
Received healthy response to inference request in 4.199659109115601s
Received healthy response to inference request in 3.83487606048584s
5 requests
0 failed requests
5th percentile: 3.490348052978516
10th percentile: 3.5764800548553466
20th percentile: 3.7487440586090086
30th percentile: 3.907832670211792
40th percentile: 4.053745889663697
50th percentile: 4.199659109115601
60th percentile: 4.540467977523804
70th percentile: 4.881276845932007
80th percentile: 5.069244289398194
90th percentile: 5.1043703079223635
95th percentile: 5.121933317184448
99th percentile: 5.135983724594116
mean time: 4.325985765457153
clean up pipeline due to error=DeploymentChecksError('Unacceptable 70th percentile latency 4.881276845932007s')
Shutdown handler de-registered
DeploymentChecksError('Unacceptable 70th percentile latency 4.881276845932007s')
function_hagob_2024-11-12 status is now failed due to DeploymentManager action
Inference service cgato-nemo-12b-thespice-8696-v1 ready after 190.9205515384674s
Pipeline stage MKMLDeployer completed in 191.62s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.2079570293426514s
Received healthy response to inference request in 1.9701244831085205s
Received healthy response to inference request in 1.692664384841919s
Received healthy response to inference request in 1.7308251857757568s
Received healthy response to inference request in 1.5430097579956055s
5 requests
0 failed requests
5th percentile: 1.5729406833648683
10th percentile: 1.6028716087341308
20th percentile: 1.6627334594726562
30th percentile: 1.7002965450286864
40th percentile: 1.7155608654022216
50th percentile: 1.7308251857757568
60th percentile: 1.8265449047088622
70th percentile: 1.9222646236419678
80th percentile: 2.0176909923553468
90th percentile: 2.112824010848999
95th percentile: 2.160390520095825
99th percentile: 2.198443727493286
mean time: 1.8289161682128907
Pipeline stage StressChecker completed in 11.98s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 2.51s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 3.39s
Shutdown handler de-registered
cgato-nemo-12b-thespice-_8696_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3131.65s
Shutdown handler de-registered
cgato-nemo-12b-thespice-_8696_v1 status is now inactive due to auto deactivation removed underperforming models
cgato-nemo-12b-thespice-_8696_v1 status is now torndown due to DeploymentManager action