rica40325-feedback-dpo-8

developer_uid: rica40325

submission_id: rica40325-feedback-dpo-8_v1

model_name: rica40325-feedback-dpo-8_v1

model_group: rica40325/feedback_dpo_8

status: torndown

timestamp: 2024-09-10T10:39:15+00:00

num_battles: 14183

num_wins: 6379

celo_rating: 1194.06

family_friendly_score: 0.0

submission_type: basic

model_repo: rica40325/feedback_dpo_8

model_architecture: LlamaForCausalLM

model_num_parameters: 8030261248.0

best_of: 16

max_input_tokens: 512

max_output_tokens: 64

latencies: [{'batch_size': 1, 'throughput': 0.9121392298883656, 'latency_mean': 1.0962369632720947, 'latency_p50': 1.0961217880249023, 'latency_p90': 1.224309778213501}, {'batch_size': 4, 'throughput': 1.799813565874019, 'latency_mean': 2.2138206148147583, 'latency_p50': 2.209741711616516, 'latency_p90': 2.496352767944336}, {'batch_size': 5, 'throughput': 1.8717890478662722, 'latency_mean': 2.6568470072746275, 'latency_p50': 2.669188380241394, 'latency_p90': 2.97242534160614}, {'batch_size': 8, 'throughput': 1.995698591560722, 'latency_mean': 3.979494217634201, 'latency_p50': 3.9977318048477173, 'latency_p90': 4.504805159568787}, {'batch_size': 10, 'throughput': 2.0229153557133355, 'latency_mean': 4.892616982460022, 'latency_p50': 4.826807498931885, 'latency_p90': 5.7358914613723755}, {'batch_size': 12, 'throughput': 2.0144481996483736, 'latency_mean': 5.876554342508316, 'latency_p50': 5.875905275344849, 'latency_p90': 6.77490086555481}, {'batch_size': 15, 'throughput': 1.9985761296294788, 'latency_mean': 7.361770433187485, 'latency_p50': 7.469679117202759, 'latency_p90': 8.19918508529663}]

gpu_counts: {'NVIDIA RTX A5000': 1}

display_name: rica40325-feedback-dpo-8_v1

is_internal_developer: False

language_model: rica40325/feedback_dpo_8

model_size: 8B

ranking_group: single

throughput_3p7s: 1.99

us_pacific_date: 2024-09-10

win_ratio: 0.44976380173447084
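The win_ratio value above appears to be num_wins divided by num_battles. A minimal check, assuming that definition (the record itself does not state the formula):

```python
# Values copied from the num_battles and num_wins fields of this record.
num_battles = 14183
num_wins = 6379

# Assumption: win_ratio is the plain fraction of battles won.
win_ratio = num_wins / num_battles

print(f"win_ratio: {win_ratio}")
```

The result agrees with the reported 0.44976380173447084 to within floating-point rounding.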

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 512, 'best_of': 16, 'max_output_tokens': 64}

formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
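The formatter templates above can be read as pieces of a single prompt string. The sketch below shows one plausible way they combine. The assembly order (memory, prompt, chat turns, response prefix), the `build_prompt` helper, and the sample persona and messages are all assumptions for illustration; the serving code's actual logic, including truncation to max_input_tokens, is not shown in this record.

```python
# Templates copied verbatim from the formatter field of this record.
formatter = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}

def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (role, message) pairs where role is "user" or "bot".
    """
    parts = [
        formatter["memory_template"].format(bot_name=bot_name, memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for role, message in turns:
        if role == "bot":
            parts.append(formatter["bot_template"].format(bot_name=bot_name, message=message))
        else:
            parts.append(formatter["user_template"].format(user_name=user_name, message=message))
    # The response template leaves the bot's name open for the model to complete.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)

# Made-up example values, not taken from the record.
text = build_prompt(
    bot_name="Aria",
    user_name="Sam",
    memory="A cheerful guide.",
    prompt="Aria greets a traveler.",
    turns=[("user", "Hi!"), ("bot", "Hello, Sam.")],
)
print(text)
```

Because stopping_words is ['\n'] in generation_params, a completion following the final "Aria:" would end at the first newline, which matches the one-message-per-line layout of the bot and user templates.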

Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rica40325-feedback-dpo-8-v1-mkmlizer
Waiting for job on rica40325-feedback-dpo-8-v1-mkmlizer to finish
Connection pool is full, discarding connection: %s. Connection pool size: %s
rica40325-feedback-dpo-8-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rica40325-feedback-dpo-8-v1-mkmlizer: ║     _____            __           __                                ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║    / _/ /_ ___    __/ /  ___ ___ / /                                ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║   / _/ / // / |/|/ / _ \/ -_) -_) /                                 ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  /_//_/\_, /|__,__/_//_/\__/\__/_/                                  ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║       /___/                                                         ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║                                                                     ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  Version: 0.10.1                                                    ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  Copyright 2023 MK ONE TECHNOLOGIES Inc.                            ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  https://mk1.ai                                                     ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║                                                                     ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  The license key for the current software has been verified as      ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  belonging to:                                                      ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║                                                                     ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  Chai Research Corp.                                                ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f                   ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║  Expiration: 2024-10-15 23:59:59                                    ║
rica40325-feedback-dpo-8-v1-mkmlizer: ║                                                                     ║
rica40325-feedback-dpo-8-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
rica40325-feedback-dpo-8-v1-mkmlizer: Downloaded to shared memory in 34.671s
rica40325-feedback-dpo-8-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpezkij4mq, device:0
rica40325-feedback-dpo-8-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
rica40325-feedback-dpo-8-v1-mkmlizer: quantized model in 25.969s
rica40325-feedback-dpo-8-v1-mkmlizer: Processed model rica40325/feedback_dpo_8 in 60.640s
rica40325-feedback-dpo-8-v1-mkmlizer: creating bucket guanaco-mkml-models
rica40325-feedback-dpo-8-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rica40325-feedback-dpo-8-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rica40325-feedback-dpo-8-v1
rica40325-feedback-dpo-8-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rica40325-feedback-dpo-8-v1/config.json
rica40325-feedback-dpo-8-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rica40325-feedback-dpo-8-v1/special_tokens_map.json
rica40325-feedback-dpo-8-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rica40325-feedback-dpo-8-v1/tokenizer_config.json
rica40325-feedback-dpo-8-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rica40325-feedback-dpo-8-v1/tokenizer.json
rica40325-feedback-dpo-8-v1-mkmlizer: 
Loading 0:  98%|█████████▊| 286/291 [00:05<00:00, 75.55it/s]
Job rica40325-feedback-dpo-8-v1-mkmlizer completed after 86.35s with status: succeeded
Stopping job with name rica40325-feedback-dpo-8-v1-mkmlizer
Pipeline stage MKMLizer completed in 87.31s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.08s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rica40325-feedback-dpo-8-v1
Waiting for inference service rica40325-feedback-dpo-8-v1 to be ready
Failed to get response for submission neversleep-noromaid-v0_8068_v150: ('http://chaiml-llama-8b-pairwis-8189-v19-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:33448->127.0.0.1:8080: read: connection reset by peer\n')
Inference service rica40325-feedback-dpo-8-v1 ready after 160.80120372772217s
Pipeline stage MKMLDeployer completed in 161.14s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.724888563156128s
Received healthy response to inference request in 1.3805930614471436s
Received healthy response to inference request in 1.5168359279632568s
Received healthy response to inference request in 2.105945587158203s
Received healthy response to inference request in 1.6457993984222412s
5 requests
0 failed requests
5th percentile: 1.4078416347503662
10th percentile: 1.4350902080535888
20th percentile: 1.4895873546600342
30th percentile: 1.5426286220550538
40th percentile: 1.5942140102386475
50th percentile: 1.6457993984222412
60th percentile: 1.6774350643157958
70th percentile: 1.7090707302093506
80th percentile: 1.8010999679565431
90th percentile: 1.9535227775573731
95th percentile: 2.029734182357788
99th percentile: 2.09070330619812
mean time: 1.6748125076293945
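The percentile and mean figures above are consistent with linear interpolation between the sorted response times. The sketch below recomputes them from the five logged values. The `percentile` helper is illustrative and is not the StressChecker's own code; the interpolation method is inferred from the numbers, not stated in the log.

```python
import math

# Response times in seconds, copied from the five "Received healthy response" lines.
latencies = [
    1.724888563156128,
    1.3805930614471436,
    1.5168359279632568,
    2.105945587158203,
    1.6457993984222412,
]

def percentile(values, q):
    """Return the q-th percentile (0 to 100) by linear interpolation between ranks."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = math.floor(rank)
    upper = math.ceil(rank)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction

mean_time = sum(latencies) / len(latencies)

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean_time}")
```

With only five samples, the upper percentiles are dominated by the single slowest request (about 2.11 s), so they say little about tail latency under real load.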
Pipeline stage StressChecker completed in 8.97s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 8.53s
Shutdown handler de-registered
rica40325-feedback-dpo-8_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rica40325-feedback-dpo-8-v1-profiler
Waiting for inference service rica40325-feedback-dpo-8-v1-profiler to be ready
Inference service rica40325-feedback-dpo-8-v1-profiler ready after 180.5073161125183s
Pipeline stage MKMLProfilerDeployer completed in 180.84s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rica40325-feedback-dpo-8-v1-profiler-predictor-00001-deplog22vp:/code/chaiverse_profiler_1725965249 --namespace tenant-chaiml-guanaco
kubectl exec -it rica40325-feedback-dpo-8-v1-profiler-predictor-00001-deplog22vp --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1725965249 && python profiles.py profile --best_of_n 16 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 512 --output_tokens 64 --summary /code/chaiverse_profiler_1725965249/summary.json'
kubectl exec -it rica40325-feedback-dpo-8-v1-profiler-predictor-00001-deplog22vp --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1725965249/summary.json'
Pipeline stage MKMLProfilerRunner completed in 840.09s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rica40325-feedback-dpo-8-v1-profiler is running
Tearing down inference service rica40325-feedback-dpo-8-v1-profiler
Service rica40325-feedback-dpo-8-v1-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.05s
Shutdown handler de-registered
rica40325-feedback-dpo-8_v1 status is now inactive due to auto deactivation removed underperforming models
rica40325-feedback-dpo-8_v1 status is now torndown due to DeploymentManager action