developer_uid: rirv938
submission_id: rirv938-80k-98p-2ff-rir_93843_v2
model_name: rirv938-80k-98p-2ff-rir_93843_v2
model_group: rirv938/80k_98p_2ff_rirv
status: torndown
timestamp: 2025-04-09T06:15:38+00:00
num_battles: 6177
num_wins: 3380
celo_rating: 1314.65
family_friendly_score: 0.5398000000000001
family_friendly_standard_error: 0.0070486305052825686
submission_type: basic
model_repo: rirv938/80k_98p_2ff_rirv938_mistral_24b_bon_82623_v1_cp1872_merged
model_architecture: MistralForCausalLM
model_num_parameters: 24096691200.0
best_of: 8
max_input_tokens: 768
max_output_tokens: 64
reward_model: default
latencies: [{'batch_size': 1, 'throughput': 0.38548415971822203, 'latency_mean': 2.5940460181236267, 'latency_p50': 2.5933778285980225, 'latency_p90': 2.9060864210128785}, {'batch_size': 3, 'throughput': 0.790561101125278, 'latency_mean': 3.773658758401871, 'latency_p50': 3.7651489973068237, 'latency_p90': 4.205441689491272}, {'batch_size': 5, 'throughput': 1.0329640550080825, 'latency_mean': 4.810755873918533, 'latency_p50': 4.832937240600586, 'latency_p90': 5.340883040428162}, {'batch_size': 6, 'throughput': 1.1157261335493247, 'latency_mean': 5.342603124380112, 'latency_p50': 5.342889666557312, 'latency_p90': 5.9054034948349}, {'batch_size': 10, 'throughput': 1.27755194683442, 'latency_mean': 7.754640476703644, 'latency_p50': 7.860140562057495, 'latency_p90': 8.702806043624879}]
gpu_counts: {'NVIDIA RTX A6000': 1}
display_name: rirv938-80k-98p-2ff-rir_93843_v2
ineligible_reason: num_battles<10000
is_internal_developer: True
language_model: rirv938/80k_98p_2ff_rirv938_mistral_24b_bon_82623_v1_cp1872_merged
model_size: 24B
ranking_group: single
throughput_3p7s: 0.78
us_pacific_date: 2025-04-08
win_ratio: 0.5471911931358264
generation_params: {'temperature': 0.9, 'top_p': 0.9, 'min_p': 0.6, 'top_k': 80, 'presence_penalty': 0.5, 'frequency_penalty': 0.5, 'stopping_words': ['###', 'You:', '\n', '</s>'], 'max_input_tokens': 768, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '[INST]', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '[/INST]{bot_name}:', 'truncate_by_message': False}
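The formatter entry above only lists template strings. As a rough illustration of how they could combine into a single model input, here is a minimal sketch. The assembly order (memory, prompt, conversation turns, response prefix) and the helper name `build_prompt` are assumptions made for this example, not the platform's confirmed behaviour.

```python
# Template strings copied from the formatter entry in the metadata.
FORMATTER = {
    "memory_template": "[INST]",
    "prompt_template": "",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "[/INST]{bot_name}:",
}


def build_prompt(formatter, bot_name, user_name, turns):
    """Hypothetical helper: join the templates into one prompt string.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". The ordering used here is an assumption.
    """
    parts = [formatter["memory_template"], formatter["prompt_template"]]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    # The response template leaves the prompt ending with the bot's name,
    # so the model continues with the bot's next message.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)


prompt = build_prompt(
    FORMATTER,
    bot_name="Bot",
    user_name="You",
    turns=[("user", "Hi"), ("bot", "Hello!"), ("user", "How are you?")],
)
print(prompt)
```

Under this assumed ordering, the '\n' entry in stopping_words would end generation at the close of a single bot message, which fits the one-line-per-turn templates.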
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer
Waiting for job on rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer to finish
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ _____ __ __ ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ /___/ ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ Version: 0.12.8 ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ https://mk1.ai ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ The license key for the current software has been verified as ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ belonging to: ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ Chai Research Corp. ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ Expiration: 2025-04-15 23:59:59 ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ║ ║
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Failed to get response for submission jellywibble-kailey-gl-n_40898_v1: HTTPConnectionPool(host='jellywibble-kailey-gl-n-40898-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Unable to record family friendly update due to error: Invalid JSON input: JSON must contain 'User Safety' and 'Response Safety' fields
Failed to get response for submission jellywibble-kailey-gl-n_40898_v1: HTTPConnectionPool(host='jellywibble-kailey-gl-n-40898-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: Downloaded to shared memory in 117.621s
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpbpvoidr_, device:0
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: quantized model in 61.309s
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: Processed model rirv938/80k_98p_2ff_rirv938_mistral_24b_bon_82623_v1_cp1872_merged in 178.931s
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: creating bucket guanaco-mkml-models
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-80k-98p-2ff-rir-93843-v2
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rirv938-80k-98p-2ff-rir-93843-v2/config.json
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rirv938-80k-98p-2ff-rir-93843-v2/special_tokens_map.json
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rirv938-80k-98p-2ff-rir-93843-v2/tokenizer_config.json
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rirv938-80k-98p-2ff-rir-93843-v2/tokenizer.json
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/rirv938-80k-98p-2ff-rir-93843-v2/flywheel_model.1.safetensors
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rirv938-80k-98p-2ff-rir-93843-v2/flywheel_model.0.safetensors
rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer: Loading 0: 100%|█████████▉| 362/363 [00:42<00:00, 2.64it/s]
Job rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer completed after 206.6s with status: succeeded
Stopping job with name rirv938-80k-98p-2ff-rir-93843-v2-mkmlizer
Pipeline stage MKMLizer completed in 207.12s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.17s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-80k-98p-2ff-rir-93843-v2
Waiting for inference service rirv938-80k-98p-2ff-rir-93843-v2 to be ready
Failed to get response for submission jellywibble-kailey-gl-n_40898_v1: HTTPConnectionPool(host='jellywibble-kailey-gl-n-40898-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service rirv938-80k-98p-2ff-rir-93843-v2 ready after 110.49024105072021s
Pipeline stage MKMLDeployer completed in 110.96s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.1766927242279053s
Received healthy response to inference request in 2.5169262886047363s
Received healthy response to inference request in 2.5079777240753174s
Received healthy response to inference request in 2.547832489013672s
5 requests
1 failed requests
5th percentile: 2.509767436981201
10th percentile: 2.511557149887085
20th percentile: 2.5151365756988526
30th percentile: 2.5231075286865234
40th percentile: 2.5354700088500977
50th percentile: 2.547832489013672
60th percentile: 2.7993765830993653
70th percentile: 3.0509206771850583
80th percentile: 6.57753190994263
90th percentile: 13.379210281372071
95th percentile: 16.78004946708679
99th percentile: 19.500720815658568
mean time: 6.186063575744629
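The first StressChecker run logs four healthy latencies but reports statistics over five requests. Assuming the reported mean covers all five requests, including the unhealthy one, the duration of the failed request can be recovered from the mean. The sketch below is an inference from the logged numbers, not something the pipeline prints.

```python
# Healthy latencies (seconds) logged in the first StressChecker run.
healthy = [
    3.1766927242279053,
    2.5169262886047363,
    2.5079777240753174,
    2.547832489013672,
]
total_requests = 5              # "5 requests" in the log
mean_time = 6.186063575744629   # "mean time" in the log

# Assumption: the mean is taken over all 5 requests, so the missing
# latency is the total time minus the four healthy latencies.
implied_failed = mean_time * total_requests - sum(healthy)
print(f"implied latency of the failed request: {implied_failed:.3f}s")
```

The result is a little over 20 seconds, which is in line with the `read timeout=20` reported just before the unhealthy response.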
%s, retrying in %s seconds...
Received healthy response to inference request in 2.795534610748291s
Received healthy response to inference request in 2.589838981628418s
Received healthy response to inference request in 2.727210283279419s
Received healthy response to inference request in 2.388988971710205s
Received healthy response to inference request in 2.4968338012695312s
5 requests
0 failed requests
5th percentile: 2.4105579376220705
10th percentile: 2.4321269035339355
20th percentile: 2.475264835357666
30th percentile: 2.5154348373413087
40th percentile: 2.552636909484863
50th percentile: 2.589838981628418
60th percentile: 2.6447875022888185
70th percentile: 2.6997360229492187
80th percentile: 2.7408751487731933
90th percentile: 2.7682048797607424
95th percentile: 2.7818697452545167
99th percentile: 2.792801637649536
mean time: 2.599681329727173
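The percentile figures in the second run agree with linear interpolation between sorted samples, which is also the default method of `numpy.percentile`. The sketch below reproduces them in plain Python from the five logged latencies. The function name `percentile` is chosen for this example, and whether the pipeline itself uses NumPy is an assumption.

```python
# Latencies (seconds) logged in the second StressChecker run.
latencies = [
    2.795534610748291,
    2.589838981628418,
    2.727210283279419,
    2.388988971710205,
    2.4968338012695312,
]


def percentile(values, p):
    """Percentile with linear interpolation between sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100   # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{p}th percentile: {percentile(latencies, p)}")
print(f"mean time: {sum(latencies) / len(latencies)}")
```

With only five samples, the high percentiles are dominated by the single slowest request, which is why the first run's 90th to 99th percentiles jump so sharply after one timeout.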
Pipeline stage StressChecker completed in 47.14s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.71s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.71s
Shutdown handler de-registered
rirv938-80k-98p-2ff-rir_93843_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.10s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-80k-98p-2ff-rir-93843-v2-profiler
Waiting for inference service rirv938-80k-98p-2ff-rir-93843-v2-profiler to be ready
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 2922.05s
Shutdown handler de-registered
rirv938-80k-98p-2ff-rir_93843_v2 status is now inactive due to auto deactivation removed underperforming models
rirv938-80k-98p-2ff-rir_93843_v2 status is now torndown due to DeploymentManager action