developer_uid: Jellywibble
submission_id: dtnewman-trainer-debin-_14184_v1
model_name: dtnewman-trainer-debin-_14184_v1
model_group: dtnewman/trainer_debin_2
status: torndown
timestamp: 2025-02-25T16:43:02+00:00
num_battles: 7798
num_wins: 3561
celo_rating: 1240.76
family_friendly_score: 0.5962
family_friendly_standard_error: 0.006938956117457438
submission_type: basic
model_repo: dtnewman/trainer_debin_2025-02-25-checkpoint-1
model_architecture: MistralForCausalLM
model_num_parameters: 12772070400
best_of: 8
max_input_tokens: 1024
max_output_tokens: 64
reward_model: default
latencies:
    batch_size | throughput | latency_mean | latency_p50 | latency_p90
             1 |      0.604 |        1.656 |       1.671 |       1.825
             3 |      1.108 |        2.701 |       2.697 |       2.971
             5 |      1.336 |        3.723 |       3.714 |       4.200
             6 |      1.401 |        4.265 |       4.257 |       4.720
             8 |      1.456 |        5.466 |       5.496 |       6.137
            10 |      1.494 |        6.645 |       6.653 |       7.493
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: dtnewman-trainer-debin-_14184_v1
is_internal_developer: True
language_model: dtnewman/trainer_debin_2025-02-25-checkpoint-1
model_size: 13B
ranking_group: single
throughput_3p7s: 1.34
us_pacific_date: 2025-02-25
win_ratio: 0.45665555270582203
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
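For context on the formatter and generation_params fields above, the sketch below shows how templates of this shape are typically expanded into a single prompt string; the render_prompt helper and the example conversation are illustrative assumptions, not code from this submission. The assembled prompt is then limited to max_input_tokens: 1024, generation stops at '\n' or after 64 output tokens, and best_of: 8 typically means eight candidate completions are sampled with one selected (here via the default reward model).

```python
# Illustrative sketch only (not this submission's code): stitching the
# formatter templates above into a prompt for the model.
FORMATTER = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}

def render_prompt(bot_name, user_name, memory, prompt, turns):
    """turns: list of (speaker, message) tuples, speaker in {'bot', 'user'}."""
    parts = [
        FORMATTER["memory_template"].format(bot_name=bot_name, memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(FORMATTER["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(FORMATTER["user_template"].format(user_name=user_name, message=message))
    # The model continues from "{bot_name}:"; decoding stops at '\n'
    # (stopping_words) or after max_output_tokens=64, whichever comes first.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)

# Hypothetical conversation, just to show the assembled prompt.
print(render_prompt(
    bot_name="Debin", user_name="User",
    memory="A friendly personal trainer.", prompt="Casual chat at the gym.",
    turns=[("user", "hey, ready for today?"), ("bot", "Always. Warm up first!")],
))
```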
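Several of the summary statistics above can be cross-checked from the raw fields. The sketch below does so; note that the binomial form of the standard error and the reading of throughput_3p7s as "throughput at the operating point nearest a 3.7 s mean latency" are inferences, not documented behaviour.

```python
# win_ratio is simply wins over battles.
num_battles, num_wins = 7798, 3561
print(num_wins / num_battles)  # 0.45665... == win_ratio

# Assumption: family_friendly_standard_error has the binomial form
# sqrt(p * (1 - p) / n); solving for n suggests how many samples were scored.
p, se = 0.5962, 0.006938956117457438
print(p * (1 - p) / se ** 2)  # ~5000 (inferred sample size)

# Assumption: throughput_3p7s is the measured throughput at the batch size
# whose mean latency is closest to 3.7 s (batch_size 5 in the latencies above).
points = [  # (latency_mean, throughput) pairs from the latencies field
    (1.656, 0.604), (2.701, 1.108), (3.723, 1.336),
    (4.265, 1.401), (5.466, 1.456), (6.645, 1.494),
]
nearest = min(points, key=lambda lt: abs(lt[0] - 3.7))
print(round(nearest[1], 2))  # 1.34 == throughput_3p7s
```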
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name dtnewman-trainer-debin-14184-v1-mkmlizer
Waiting for job on dtnewman-trainer-debin-14184-v1-mkmlizer to finish
Failed to get response for submission nischaydnk-mistral24bas_35524_v1: HTTPConnectionPool(host='nischaydnk-mistral24bas-35524-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
dtnewman-trainer-debin-14184-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
dtnewman-trainer-debin-14184-v1-mkmlizer: ║            [flywheel ASCII-art logo]                                 ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ Version: 0.12.8 ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ https://mk1.ai ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ The license key for the current software has been verified as ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ belonging to: ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ Chai Research Corp. ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ Expiration: 2025-04-15 23:59:59 ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ║ ║
dtnewman-trainer-debin-14184-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Failed to get response for submission nischaydnk-mistral24bas_35524_v1: HTTPConnectionPool(host='nischaydnk-mistral24bas-35524-v1-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
dtnewman-trainer-debin-14184-v1-mkmlizer: Downloaded to shared memory in 47.099s
dtnewman-trainer-debin-14184-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpq62lqlxs, device:0
dtnewman-trainer-debin-14184-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
dtnewman-trainer-debin-14184-v1-mkmlizer: quantized model in 35.517s
dtnewman-trainer-debin-14184-v1-mkmlizer: Processed model dtnewman/trainer_debin_2025-02-25-checkpoint-1 in 82.617s
dtnewman-trainer-debin-14184-v1-mkmlizer: creating bucket guanaco-mkml-models
dtnewman-trainer-debin-14184-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
dtnewman-trainer-debin-14184-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/dtnewman-trainer-debin-14184-v1
dtnewman-trainer-debin-14184-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/dtnewman-trainer-debin-14184-v1/config.json
dtnewman-trainer-debin-14184-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/dtnewman-trainer-debin-14184-v1/special_tokens_map.json
dtnewman-trainer-debin-14184-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/dtnewman-trainer-debin-14184-v1/tokenizer_config.json
dtnewman-trainer-debin-14184-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/dtnewman-trainer-debin-14184-v1/tokenizer.json
dtnewman-trainer-debin-14184-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/dtnewman-trainer-debin-14184-v1/flywheel_model.0.safetensors
dtnewman-trainer-debin-14184-v1-mkmlizer: Loading 0: 0%| | 0/363 [00:00<?, ?it/s] ... 99%|█████████▉| 360/363 [00:14<00:00, 26.80it/s]
Job dtnewman-trainer-debin-14184-v1-mkmlizer completed after 113.89s with status: succeeded
Stopping job with name dtnewman-trainer-debin-14184-v1-mkmlizer
Pipeline stage MKMLizer completed in 114.36s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service dtnewman-trainer-debin-14184-v1
Waiting for inference service dtnewman-trainer-debin-14184-v1 to be ready
Failed to get response for submission nischaydnk-mistral24b-s_10081_v3: HTTPConnectionPool(host='nischaydnk-mistral24b-s-10081-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Failed to get response for submission nischaydnk-mistral24b-s_10081_v3: HTTPConnectionPool(host='nischaydnk-mistral24b-s-10081-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com', port=80): Read timed out. (read timeout=12.0)
Inference service dtnewman-trainer-debin-14184-v1 ready after 240.867342710495s
Pipeline stage MKMLDeployer completed in 241.29s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPConnectionPool(host='guanaco-submitter.guanaco-backend.k2.chaiverse.com', port=80): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 2.1868088245391846s
Received healthy response to inference request in 1.5968050956726074s
Received healthy response to inference request in 1.6888129711151123s
Received healthy response to inference request in 1.8315198421478271s
5 requests
1 failed requests
5th percentile: 1.6152066707611084
10th percentile: 1.6336082458496093
20th percentile: 1.6704113960266114
30th percentile: 1.7173543453216553
40th percentile: 1.7744370937347411
50th percentile: 1.8315198421478271
60th percentile: 1.9736354351043701
70th percentile: 2.115751028060913
80th percentile: 5.773907327651981
90th percentile: 12.948104333877566
95th percentile: 16.535202836990354
99th percentile: 19.40488163948059
mean time: 5.485249614715576
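The inflated tail percentiles in this first StressChecker pass are explained entirely by the single timed-out request. A small sketch (assuming linear-interpolated percentiles, numpy's default, which is an assumption about how the checker computes these) backs the failed request's duration out of the reported mean and reproduces the reported tails:

```python
import numpy as np

healthy = [2.1868088245391846, 1.5968050956726074,
           1.6888129711151123, 1.8315198421478271]
mean_time = 5.485249614715576

# Five requests were issued; recover the duration charged to the failed one.
failed = 5 * mean_time - sum(healthy)
print(failed)  # ~20.12 s, consistent with the 20 s read timeout above

times = healthy + [failed]
for q in (50, 80, 90, 95, 99):
    print(q, np.percentile(times, q))  # 1.83, 5.77, 12.95, 16.54, 19.40 as reported
```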
%s, retrying in %s seconds...
Received healthy response to inference request in 1.5371181964874268s
Received healthy response to inference request in 1.5526208877563477s
Received healthy response to inference request in 1.4765963554382324s
Received healthy response to inference request in 1.6152629852294922s
Received healthy response to inference request in 1.5437214374542236s
5 requests
0 failed requests
5th percentile: 1.4887007236480714
10th percentile: 1.5008050918579101
20th percentile: 1.5250138282775878
30th percentile: 1.5384388446807862
40th percentile: 1.5410801410675048
50th percentile: 1.5437214374542236
60th percentile: 1.5472812175750732
70th percentile: 1.5508409976959228
80th percentile: 1.5651493072509766
90th percentile: 1.5902061462402344
95th percentile: 1.6027345657348633
99th percentile: 1.6127573013305665
mean time: 1.5450639724731445
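The clean retry can be reproduced the same way; under the same linear-interpolation assumption, numpy recovers the reported mean and percentiles:

```python
import numpy as np

times = [1.5371181964874268, 1.5526208877563477, 1.4765963554382324,
         1.6152629852294922, 1.5437214374542236]
print(np.mean(times))                  # 1.5450639... == reported mean time
for q in (5, 50, 90, 99):
    print(q, np.percentile(times, q))  # 1.489, 1.544, 1.590, 1.613 as reported
```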
Pipeline stage StressChecker completed in 37.53s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 0.72s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
triggered trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 0.70s
Shutdown handler de-registered
dtnewman-trainer-debin-_14184_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
Pipeline stage OfflineFamilyFriendlyScorer completed in 4162.66s
Shutdown handler de-registered
dtnewman-trainer-debin-_14184_v1 status is now inactive due to auto deactivation (removal of underperforming models)
dtnewman-trainer-debin-_14184_v1 status is now torndown due to DeploymentManager action