developer_uid: rirv938
submission_id: rirv938-tune-mistral-grp_3576_v1
model_name: rirv938-tune-mistral-grp_3576_v1
model_group: rirv938/tune_mistral_grp
status: torndown
timestamp: 2025-05-01T16:32:28+00:00
num_battles: 6143
num_wins: 3511
celo_rating: 1337.7
family_friendly_score: 0.5993999999999999
family_friendly_standard_error: 0.006929929869775018
submission_type: basic
model_repo: rirv938/tune_mistral_grpo_cp592_92ff_new_merged
model_architecture: MistralForCausalLM
model_num_parameters: 24096691200.0
best_of: 8
max_input_tokens: 768
max_output_tokens: 64
reward_model: default
latencies: [{'batch_size': 1, 'throughput': 0.5555427070045379, 'latency_mean': 1.7999833631515503, 'latency_p50': 1.7978562116622925, 'latency_p90': 2.011532163619995}, {'batch_size': 4, 'throughput': 1.378536316510072, 'latency_mean': 2.885068128108978, 'latency_p50': 2.8966809511184692, 'latency_p90': 3.2241681575775147}, {'batch_size': 5, 'throughput': 1.5451882510559412, 'latency_mean': 3.214450021982193, 'latency_p50': 3.206656813621521, 'latency_p90': 3.586640644073486}, {'batch_size': 8, 'throughput': 1.8710443456715065, 'latency_mean': 4.233537783622742, 'latency_p50': 4.246500015258789, 'latency_p90': 4.7189347505569454}, {'batch_size': 10, 'throughput': 1.9912072118139892, 'latency_mean': 4.984553135633469, 'latency_p50': 5.03824257850647, 'latency_p90': 5.545498967170715}, {'batch_size': 12, 'throughput': 2.0729465298599044, 'latency_mean': 5.697925702333451, 'latency_p50': 5.745441317558289, 'latency_p90': 6.524534583091736}, {'batch_size': 15, 'throughput': 2.150338179614783, 'latency_mean': 6.895310111045838, 'latency_p50': 6.959148168563843, 'latency_p90': 7.888605380058289}]
gpu_counts: {'NVIDIA A100-SXM4-80GB': 1}
display_name: rirv938-tune-mistral-grp_3576_v1
is_internal_developer: True
language_model: rirv938/tune_mistral_grpo_cp592_92ff_new_merged
model_size: 24B
ranking_group: single
throughput_3p7s: 1.74
us_pacific_date: 2025-05-01
win_ratio: 0.5715448477942373
generation_params: {'temperature': 0.9, 'top_p': 0.9, 'min_p': 0.2, 'top_k': 80, 'presence_penalty': 0.5, 'frequency_penalty': 0.5, 'stopping_words': ['</s>', '\n', 'You:', '###'], 'max_input_tokens': 768, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
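The derived fields in the metadata above can be cross-checked from the raw counts. The sketch below recomputes `win_ratio` from `num_wins` and `num_battles`. It also tests whether `family_friendly_standard_error` matches a binomial standard error; the sample size of 5000 is an assumption inferred from the numbers and is not stated anywhere in the log.

```python
import math

# Values copied from the submission metadata.
num_battles = 6143
num_wins = 3511
reported_win_ratio = 0.5715448477942373
family_friendly_score = 0.5993999999999999
reported_standard_error = 0.006929929869775018

# win_ratio is the fraction of battles won.
win_ratio = num_wins / num_battles

# ASSUMPTION: the score is a proportion over n independent samples,
# so its standard error is sqrt(p * (1 - p) / n). n = 5000 is a guess
# that happens to reproduce the reported value; it is not in the log.
assumed_samples = 5000
standard_error = math.sqrt(
    family_friendly_score * (1 - family_friendly_score) / assumed_samples
)

print(f"win_ratio      = {win_ratio:.6f}")
print(f"standard_error = {standard_error:.6f}")
```

If the recomputed values drift from the reported ones, the metadata was probably produced from a different snapshot of the battle counts.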
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-tune-mistral-grp-3576-v1-mkmlizer
Waiting for job on rirv938-tune-mistral-grp-3576-v1-mkmlizer to finish
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ _____ __ __ ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ /___/ ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ Version: 0.12.8 ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ https://mk1.ai ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ The license key for the current software has been verified as ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ belonging to: ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ Chai Research Corp. ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ Expiration: 2028-03-31 23:59:59 ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ║ ║
rirv938-tune-mistral-grp-3576-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Failed to get response for submission bogoconic1-nemo-280k-s6_46455_v1: HTTPConnectionPool(host='bogoconic1-nemo-280k-s6-46455-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
rirv938-tune-mistral-grp-3576-v1-mkmlizer: Downloaded to shared memory in 170.688s
rirv938-tune-mistral-grp-3576-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp776oj9m9, device:0
rirv938-tune-mistral-grp-3576-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
rirv938-tune-mistral-grp-3576-v1-mkmlizer: quantized model in 66.964s
rirv938-tune-mistral-grp-3576-v1-mkmlizer: Processed model rirv938/tune_mistral_grpo_cp592_92ff_new_merged in 237.652s
rirv938-tune-mistral-grp-3576-v1-mkmlizer: creating bucket guanaco-mkml-models
rirv938-tune-mistral-grp-3576-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rirv938-tune-mistral-grp-3576-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-tune-mistral-grp-3576-v1
rirv938-tune-mistral-grp-3576-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rirv938-tune-mistral-grp-3576-v1/config.json
rirv938-tune-mistral-grp-3576-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rirv938-tune-mistral-grp-3576-v1/special_tokens_map.json
rirv938-tune-mistral-grp-3576-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rirv938-tune-mistral-grp-3576-v1/flywheel_model.0.safetensors
rirv938-tune-mistral-grp-3576-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/rirv938-tune-mistral-grp-3576-v1/flywheel_model.1.safetensors
rirv938-tune-mistral-grp-3576-v1-mkmlizer: Loading 0: 99%|█████████▉| 361/363 [00:47<00:01, 1.71it/s]
Job rirv938-tune-mistral-grp-3576-v1-mkmlizer completed after 268.89s with status: succeeded
Stopping job with name rirv938-tune-mistral-grp-3576-v1-mkmlizer
Pipeline stage MKMLizer completed in 269.38s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-tune-mistral-grp-3576-v1
Waiting for inference service rirv938-tune-mistral-grp-3576-v1 to be ready
Failed to get response for submission bogoconic1-nemo-280k-s6_46455_v1: HTTPConnectionPool(host='bogoconic1-nemo-280k-s6-46455-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service rirv938-tune-mistral-grp-3576-v1 ready after 160.7558617591858s
Pipeline stage MKMLDeployer completed in 161.25s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.5469982624053955s
Received healthy response to inference request in 1.8791680335998535s
Failed to get response for submission rirv938-snug-grpo-40k-c_63788_v2: HTTPConnectionPool(host='rirv938-snug-grpo-40k-c-63788-v2-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Received healthy response to inference request in 2.0426008701324463s
Received healthy response to inference request in 2.226126194000244s
Received healthy response to inference request in 1.9898300170898438s
5 requests
0 failed requests
5th percentile: 1.9013004302978516
10th percentile: 1.9234328269958496
20th percentile: 1.9676976203918457
30th percentile: 2.0003841876983643
40th percentile: 2.021492528915405
50th percentile: 2.0426008701324463
60th percentile: 2.1160109996795655
70th percentile: 2.1894211292266847
80th percentile: 2.2903006076812744
90th percentile: 2.418649435043335
95th percentile: 2.4828238487243652
99th percentile: 2.5341633796691894
mean time: 2.1369446754455566
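The StressChecker summary above can be reproduced from the five healthy response times. The sketch below assumes linear interpolation between order statistics (the same convention as NumPy's default percentile method). The helper name `percentile` is hypothetical, and this is not necessarily how the pipeline computes its figures.

```python
import math

# The five healthy response times reported by StressChecker, in seconds.
latencies = [
    2.5469982624053955,
    1.8791680335998535,
    2.0426008701324463,
    2.226126194000244,
    1.9898300170898438,
]


def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0  # fractional index into ordered
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 90, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, the tail percentiles are interpolations between the two slowest requests and not independent measurements.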
Pipeline stage StressChecker completed in 11.97s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.63s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.63s
Shutdown handler de-registered
rirv938-tune-mistral-grp_3576_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 3127.43s
Shutdown handler de-registered
rirv938-tune-mistral-grp_3576_v1 status is now inactive due to auto deactivation removed underperforming models
rirv938-tune-mistral-grp_3576_v1 status is now torndown due to DeploymentManager action