rirv938-llama-8b-dpo-vs-_3178

developer_uid: robert_irvine

submission_id: rirv938-llama-8b-dpo-vs-_3178_v1

model_name: rirv938-llama-8b-dpo-vs-_3178_v1

model_group: rirv938/llama_8b_dpo_vs_

status: torndown

timestamp: 2024-09-14T20:55:03+00:00

num_battles: 10620

num_wins: 5273

celo_rating: 1252.36

family_friendly_score: 0.0

submission_type: basic

model_repo: rirv938/llama_8b_dpo_vs_dpo_250k_390

model_architecture: LlamaForSequenceClassification

model_num_parameters: 8030261248.0

best_of: 1

max_input_tokens: 256

max_output_tokens: 1

latencies: [{'batch_size': 1, 'throughput': 0.1035843835379025, 'latency_mean': 9.653879857063293, 'latency_p50': 9.634511113166809, 'latency_p90': 9.820104217529297}]

gpu_counts: {'NVIDIA RTX A5000': 1}

display_name: rirv938-llama-8b-dpo-vs-_3178_v1

ineligible_reason: max_output_tokens!=64

is_internal_developer: True

language_model: rirv938/llama_8b_dpo_vs_dpo_250k_390

model_size: 8B

ranking_group: single

us_pacific_date: 2024-09-14

win_ratio: 0.49651600753295666

generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 256, 'best_of': 1, 'max_output_tokens': 1}

formatter: {'memory_template': '', 'prompt_template': '', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '', 'truncate_by_message': False}
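
The `formatter` entry above defines per-turn templates. As a rough illustration of how such templates could turn a chat history into a single prompt string, here is a minimal sketch. The `render_prompt` helper, the turn representation, and the order in which the templates are concatenated are assumptions made for illustration; they are not taken from the serving code.

```python
# Hypothetical sketch: render a chat history using the formatter templates
# listed in the metadata. Only the template strings come from the source.
FORMATTER = {
    "memory_template": "",
    "prompt_template": "",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "",
}


def render_prompt(turns, bot_name, user_name, formatter=FORMATTER):
    """Concatenate rendered turns; `turns` is a list of (speaker, message) pairs.

    The concatenation order (memory, prompt, turns, response) is an assumption.
    """
    parts = [formatter["memory_template"], formatter["prompt_template"]]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                formatter["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                formatter["user_template"].format(user_name=user_name, message=message)
            )
    parts.append(formatter["response_template"])
    return "".join(parts)


prompt = render_prompt(
    [("user", "Hi there"), ("bot", "Hello!")], bot_name="Bot", user_name="User"
)
print(prompt)
```

Because `memory_template`, `prompt_template`, and `response_template` are empty here, the rendered prompt consists only of the `name: message` lines, each ending in a newline.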


Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer
Waiting for job on rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer to finish
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║     _____            __           __                                ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║    / _/ /_ ___    __/ /  ___ ___ / /                                ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║   / _/ / // / |/|/ / _ \/ -_) -_) /                                 ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  /_//_/\_, /|__,__/_//_/\__/\__/_/                                  ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║       /___/                                                         ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║                                                                     ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  Version: 0.10.1                                                    ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  Copyright 2023 MK ONE TECHNOLOGIES Inc.                            ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  https://mk1.ai                                                     ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║                                                                     ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  The license key for the current software has been verified as      ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  belonging to:                                                      ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║                                                                     ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  Chai Research Corp.                                                ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f                   ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║  Expiration: 2024-10-15 23:59:59                                    ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ║                                                                     ║
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Failed to get response for submission blend_jugel_2024-09-09: ('http://zonemercy-lexical-nemov8-5966-v2-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'readfrom tcp 127.0.0.1:54646->127.0.0.1:8080: write tcp 127.0.0.1:54646->127.0.0.1:8080: use of closed network connection\n')
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: Downloaded to shared memory in 39.073s
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:t0, folder:/tmp/tmp7smrvtli, device:0
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Failed to get response for submission blend_hokok_2024-09-09: ('http://neversleep-noromaid-v0-8068-v150-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Connection pool is full, discarding connection: %s. Connection pool size: %s
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: quantized model in 84.972s
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: Processed model rirv938/llama_8b_dpo_vs_dpo_250k_390 in 124.046s
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: creating bucket guanaco-mkml-models
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
Connection pool is full, discarding connection: %s. Connection pool size: %s
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rirv938-llama-8b-dpo-vs-3178-v1
Connection pool is full, discarding connection: %s. Connection pool size: %s
Failed to get response for submission blend_hokok_2024-09-09: ('http://neversleep-noromaid-v0-8068-v150-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rirv938-llama-8b-dpo-vs-3178-v1/config.json
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rirv938-llama-8b-dpo-vs-3178-v1/tokenizer_config.json
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rirv938-llama-8b-dpo-vs-3178-v1/special_tokens_map.json
rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rirv938-llama-8b-dpo-vs-3178-v1/tokenizer.json
Job rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer completed after 151.63s with status: succeeded
Stopping job with name rirv938-llama-8b-dpo-vs-3178-v1-mkmlizer
Pipeline stage MKMLizer completed in 152.40s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.09s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rirv938-llama-8b-dpo-vs-3178-v1
Waiting for inference service rirv938-llama-8b-dpo-vs-3178-v1 to be ready
Inference service rirv938-llama-8b-dpo-vs-3178-v1 ready after 171.20190739631653s
Pipeline stage MKMLDeployer completed in 175.57s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 5.654731035232544s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 5.954140901565552s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 4.889165163040161s
Received healthy response to inference request in 5.159054756164551s
Received healthy response to inference request in 4.3936944007873535s
5 requests
0 failed requests
5th percentile: 4.492788553237915
10th percentile: 4.591882705688477
20th percentile: 4.7900710105896
30th percentile: 4.943143081665039
40th percentile: 5.051098918914795
50th percentile: 5.159054756164551
60th percentile: 5.357325267791748
70th percentile: 5.555595779418946
80th percentile: 5.714613008499145
90th percentile: 5.834376955032349
95th percentile: 5.89425892829895
99th percentile: 5.942164506912231
mean time: 5.210157251358032
%s, retrying in %s seconds...
Received healthy response to inference request in 4.711295127868652s
Received healthy response to inference request in 3.014885187149048s
Received healthy response to inference request in 5.225952625274658s
Received healthy response to inference request in 7.121256351470947s
Received healthy response to inference request in 4.486185073852539s
5 requests
0 failed requests
5th percentile: 3.3091451644897463
10th percentile: 3.6034051418304442
20th percentile: 4.191925096511841
30th percentile: 4.531207084655762
40th percentile: 4.621251106262207
50th percentile: 4.711295127868652
60th percentile: 4.917158126831055
70th percentile: 5.123021125793457
80th percentile: 5.605013370513916
90th percentile: 6.363134860992432
95th percentile: 6.7421956062316895
99th percentile: 7.045444202423096
mean time: 4.911914873123169
%s, retrying in %s seconds...
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 4.593408823013306s
Received healthy response to inference request in 4.6041765213012695s
Received healthy response to inference request in 2.009122133255005s
Failed to get response for submission blend_hokok_2024-09-09: ('http://neversleep-noromaid-v0-8068-v150-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Received healthy response to inference request in 5.077234745025635s
Received healthy response to inference request in 3.656287431716919s
5 requests
0 failed requests
5th percentile: 2.338555192947388
10th percentile: 2.6679882526397707
20th percentile: 3.326854372024536
30th percentile: 3.8437117099761964
40th percentile: 4.218560266494751
50th percentile: 4.593408823013306
60th percentile: 4.597715902328491
70th percentile: 4.602022981643676
80th percentile: 4.698788166046143
90th percentile: 4.888011455535889
95th percentile: 4.982623100280762
99th percentile: 5.05831241607666
mean time: 3.9880459308624268
Pipeline stage StressChecker completed in 108.70s
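
The percentile figures reported by the StressChecker are consistent with linear interpolation between the sorted samples, which is also the default behaviour of `numpy.percentile`. The log does not say which tool produced them, so the following is only a sketch that recomputes the first batch of five latencies in plain Python; the `percentile` helper is a hypothetical name.

```python
def percentile(values, q):
    """Return the q-th percentile (0-100) using linear interpolation
    between the two closest ranks of the sorted samples."""
    ordered = sorted(values)
    rank = (len(ordered) - 1) * q / 100.0
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Latencies (seconds) from the first five healthy responses in the log.
latencies = [
    5.654731035232544,
    5.954140901565552,
    4.889165163040161,
    5.159054756164551,
    4.3936944007873535,
]

for q in (5, 50, 90):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print("mean time:", sum(latencies) / len(latencies))
```

With only five samples per batch, the extreme percentiles are interpolated almost entirely from the two slowest or two fastest requests, so they should be read as rough indicators rather than stable estimates.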
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 4.76s
Shutdown handler de-registered
rirv938-llama-8b-dpo-vs-_3178_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.17s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service rirv938-llama-8b-dpo-vs-3178-v1-profiler
Waiting for inference service rirv938-llama-8b-dpo-vs-3178-v1-profiler to be ready
Inference service rirv938-llama-8b-dpo-vs-3178-v1-profiler ready after 180.4298801422119s
Pipeline stage MKMLProfilerDeployer completed in 180.90s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/rirv938-llama-8b-dpo3bc77c0a258068214101c18732505b9f-deplocqgpp:/code/chaiverse_profiler_1726347972 --namespace tenant-chaiml-guanaco
kubectl exec -it rirv938-llama-8b-dpo3bc77c0a258068214101c18732505b9f-deplocqgpp --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1726347972 && python profiles.py profile --best_of_n 1 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 256 --output_tokens 1 --summary /code/chaiverse_profiler_1726347972/summary.json'
kubectl exec -it rirv938-llama-8b-dpo3bc77c0a258068214101c18732505b9f-deplocqgpp --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1726347972/summary.json'
Pipeline stage MKMLProfilerRunner completed in 1944.89s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service rirv938-llama-8b-dpo-vs-3178-v1-profiler is running
Tearing down inference service rirv938-llama-8b-dpo-vs-3178-v1-profiler
Service rirv938-llama-8b-dpo-vs-3178-v1-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 1.76s
Shutdown handler de-registered
rirv938-llama-8b-dpo-vs-_3178_v1 status is now inactive due to auto deactivation removed underperforming models
rirv938-llama-8b-dpo-vs-_3178_v1 status is now torndown due to DeploymentManager action
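
To see where the deployment time went, the `Pipeline stage ... completed in ...s` lines can be extracted and summed. The sketch below assumes the log is available as a list of text lines; `stage_durations` and the regular expression are hypothetical helpers, and the sample lines are copied from the first pipeline run above.

```python
import re

# Matches lines such as "Pipeline stage MKMLizer completed in 152.40s".
STAGE_RE = re.compile(r"^Pipeline stage (\w+) completed in ([\d.]+)s$")


def stage_durations(lines):
    """Return (stage_name, seconds) pairs in the order they appear."""
    durations = []
    for line in lines:
        match = STAGE_RE.match(line.strip())
        if match:
            durations.append((match.group(1), float(match.group(2))))
    return durations


sample = [
    "Pipeline stage MKMLizer completed in 152.40s",
    "Pipeline stage MKMLTemplater completed in 0.09s",
    "Pipeline stage MKMLDeployer completed in 175.57s",
    "Pipeline stage StressChecker completed in 108.70s",
    "Pipeline stage TriggerMKMLProfilingPipeline completed in 4.76s",
]

durations = stage_durations(sample)
total = sum(seconds for _, seconds in durations)
for name, seconds in durations:
    print(f"{name}: {seconds:.2f}s")
print(f"total: {total:.2f}s")
```

For the first run this gives a total of about 441.52 s, dominated by the MKMLizer and MKMLDeployer stages; the later profiling run is dominated by MKMLProfilerRunner at 1944.89 s.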