developer_uid: rirv938
submission_id: chaiml-rinen-grpo-40k-c_51339_v1
model_name: chaiml-rinen-grpo-40k-c_51339_v1
model_group: ChaiML/rinen_grpo_40k_cp
status: torndown
timestamp: 2025-05-01T18:44:35+00:00
num_battles: 8271
num_wins: 4517
celo_rating: 1337.29
family_friendly_score: 0.5509999999999999
family_friendly_standard_error: 0.007034187941759872
submission_type: basic
model_repo: ChaiML/rinen_grpo_40k_cp296_95ff
model_architecture: MistralForCausalLM
model_num_parameters: 12772070400.0
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
reward_model: default
latencies: [{'batch_size': 1, 'throughput': 0.6055445799291461, 'latency_mean': 1.6513158583641052, 'latency_p50': 1.6638685464859009, 'latency_p90': 1.8268481016159057}, {'batch_size': 3, 'throughput': 1.1217926554229698, 'latency_mean': 2.6721025669574736, 'latency_p50': 2.6635189056396484, 'latency_p90': 2.9565902233123778}, {'batch_size': 5, 'throughput': 1.3724374035081028, 'latency_mean': 3.6292040026187897, 'latency_p50': 3.6068098545074463, 'latency_p90': 4.050941228866577}, {'batch_size': 6, 'throughput': 1.4259192691467495, 'latency_mean': 4.180398563146591, 'latency_p50': 4.174941897392273, 'latency_p90': 4.7313943147659305}, {'batch_size': 8, 'throughput': 1.4995242072824482, 'latency_mean': 5.300164619684219, 'latency_p50': 5.306684374809265, 'latency_p90': 5.980837321281433}, {'batch_size': 10, 'throughput': 1.534655518498787, 'latency_mean': 6.478244347572327, 'latency_p50': 6.4650572538375854, 'latency_p90': 7.355137944221497}]
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: chaiml-rinen-grpo-40k-c_51339_v1
is_internal_developer: True
language_model: ChaiML/rinen_grpo_40k_cp296_95ff
model_size: 13B
ranking_group: single
throughput_3p7s: 1.39
us_pacific_date: 2025-05-01
win_ratio: 0.5461250151130456
generation_params: {'temperature': 0.9, 'top_p': 1.0, 'min_p': 0.05, 'top_k': 80, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['Bot:', 'User:', '<|im_end|>', 'You:', '\n', '####', '<|eot_id|>', '</s>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': True}
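The formatter entry above can be read as a recipe for assembling the text sent to the model. The sketch below is illustrative only: the template strings are copied from the metadata, but build_prompt, its arguments, and the assembly order (memory, then prompt_template, then conversation turns, then response_template) are assumptions, and truncation to max_input_tokens (truncate_by_message) is not modelled.

```python
# Template strings copied from the `formatter` entry in the metadata above.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}


def build_prompt(bot_name, user_name, memory, turns):
    """Hypothetical helper, not the platform's own code.

    `turns` is a list of (speaker, message) pairs where speaker is
    "bot" or "user". The ordering of the pieces is an assumption.
    """
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"],  # empty for this submission
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The prompt ends with the bot's name so the model continues as the bot.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)


prompt = build_prompt(
    "Ava",
    "Sam",
    "A friendly guide.",
    [("user", "hi"), ("bot", "hello"), ("user", "how are you?")],
)
print(prompt)
```

Because every turn template ends in a newline and '\n' is among the stopping_words, a generation under this layout would presumably stop at the end of a single line of bot speech.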
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer
Waiting for job on chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer to finish
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ _____ __ __ ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ /___/ ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ Version: 0.12.8 ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ https://mk1.ai ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ belonging to: ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ Chai Research Corp. ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ Expiration: 2028-03-31 23:59:59 ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ║ ║
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: Downloaded to shared memory in 71.033s
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpjyd6oqmn, device:0
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: quantized model in 40.458s
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: Processed model ChaiML/rinen_grpo_40k_cp296_95ff in 111.492s
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: creating bucket guanaco-mkml-models
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-rinen-grpo-40k-c-51339-v1
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-rinen-grpo-40k-c-51339-v1/config.json
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-rinen-grpo-40k-c-51339-v1/special_tokens_map.json
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-rinen-grpo-40k-c-51339-v1/tokenizer_config.json
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-rinen-grpo-40k-c-51339-v1/tokenizer.json
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-rinen-grpo-40k-c-51339-v1/flywheel_model.0.safetensors
chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer: Loading 0: 98%|█████████▊| 357/363 [00:19<00:01, 5.08it/s]
Job chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer completed after 134.93s with status: succeeded
Stopping job with name chaiml-rinen-grpo-40k-c-51339-v1-mkmlizer
Pipeline stage MKMLizer completed in 135.68s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.16s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-rinen-grpo-40k-c-51339-v1
Waiting for inference service chaiml-rinen-grpo-40k-c-51339-v1 to be ready
Failed to get response for submission chaiml-anthropic-grpo-4_66492_v1: HTTPConnectionPool(host='chaiml-anthropic-grpo-4-66492-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission chaiml-anthropic-grpo-4_66492_v2: HTTPConnectionPool(host='chaiml-anthropic-grpo-4-66492-v2-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service chaiml-rinen-grpo-40k-c-51339-v1 ready after 150.79083251953125s
Pipeline stage MKMLDeployer completed in 151.19s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.168151617050171s
Received healthy response to inference request in 1.6575138568878174s
Received healthy response to inference request in 1.4693520069122314s
Received healthy response to inference request in 1.4832687377929688s
5 requests
1 failed requests
5th percentile: 1.472135353088379
10th percentile: 1.4749186992645265
20th percentile: 1.4804853916168212
30th percentile: 1.5181177616119386
40th percentile: 1.587815809249878
50th percentile: 1.6575138568878174
60th percentile: 1.8617689609527588
70th percentile: 2.0660240650177
80th percentile: 5.762849807739261
90th percentile: 12.952246189117433
95th percentile: 16.546944379806515
99th percentile: 19.422702932357787
mean time: 5.383985757827759
%s, retrying in %s seconds...
Received healthy response to inference request in 1.5171794891357422s
Received healthy response to inference request in 1.6925277709960938s
Received healthy response to inference request in 1.4844255447387695s
Received healthy response to inference request in 1.5314240455627441s
Received healthy response to inference request in 1.4790780544281006s
5 requests
0 failed requests
5th percentile: 1.4801475524902343
10th percentile: 1.481217050552368
20th percentile: 1.4833560466766358
30th percentile: 1.490976333618164
40th percentile: 1.5040779113769531
50th percentile: 1.5171794891357422
60th percentile: 1.522877311706543
70th percentile: 1.5285751342773437
80th percentile: 1.5636447906494142
90th percentile: 1.6280862808227539
95th percentile: 1.6603070259094237
99th percentile: 1.6860836219787598
mean time: 1.54092698097229
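The percentile and mean figures of the second StressChecker run above can be reproduced from its five healthy response times using linear interpolation between order statistics. This is a minimal sketch under that assumption; the percentile function is a hypothetical helper, and the log does not say which library the pipeline actually uses.

```python
# Response times (seconds) copied from the second StressChecker run above.
latencies = [
    1.5171794891357422,
    1.6925277709960938,
    1.4844255447387695,
    1.5314240455627441,
    1.4790780544281006,
]


def percentile(samples, q):
    """Hypothetical helper: q-th percentile via linear interpolation
    between the two nearest order statistics."""
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])


mean_time = sum(latencies) / len(latencies)

for q in (5, 50, 90):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, the upper percentiles are dominated by the single slowest request, which is why one unhealthy response in the first run pushes its 80th percentile and above far beyond the median.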
Pipeline stage StressChecker completed in 37.19s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.82s
Shutdown handler de-registered
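The per-stage timings of the first pipeline run above can be pulled out of the log with a regular expression. The sketch below is illustrative: the six lines are copied from the log, while the pattern and variable names are assumptions rather than part of the pipeline.

```python
import re

# "Pipeline stage ... completed" lines copied from the first pipeline run above.
LOG_LINES = [
    "Pipeline stage MKMLizer completed in 135.68s",
    "Pipeline stage MKMLTemplater completed in 0.16s",
    "Pipeline stage MKMLDeployer completed in 151.19s",
    "Pipeline stage StressChecker completed in 37.19s",
    "Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.70s",
    "Pipeline stage TriggerMKMLProfilingPipeline completed in 0.82s",
]

# Hypothetical pattern: stage name, then duration in seconds.
STAGE_RE = re.compile(r"Pipeline stage (\w+) completed in ([\d.]+)s")

durations = {}
for line in LOG_LINES:
    match = STAGE_RE.search(line)
    if match:
        durations[match.group(1)] = float(match.group(2))

total = sum(durations.values())
for stage, seconds in durations.items():
    print(f"{stage}: {seconds:.2f}s ({100 * seconds / total:.1f}% of run)")
print(f"total: {total:.2f}s")
```

Quantization (MKMLizer) and deployment (MKMLDeployer) account for most of the run; the templating and trigger stages each take under a second.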
chaiml-rinen-grpo-40k-c_51339_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-rinen-grpo-40k-c-51339-v1-profiler
Waiting for inference service chaiml-rinen-grpo-40k-c-51339-v1-profiler to be ready
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 5134.68s
Shutdown handler de-registered
chaiml-rinen-grpo-40k-c_51339_v1 status is now inactive due to auto deactivation removed underperforming models
chaiml-rinen-grpo-40k-c_51339_v1 status is now torndown due to DeploymentManager action