submission_id: jic062-dpo-v1-1-nemo_v2
developer_uid: chace9580
best_of: 8
celo_rating: 1243.89
display_name: jic062-dpo-v1-1-nemo_v2
family_friendly_score: 0.0
formatter: {'memory_template': '[INST]system\n{memory}[/INST]\n', 'prompt_template': '[INST]user\n{prompt}[/INST]\n', 'bot_template': '[INST]assistant\n{bot_name}: {message}[/INST]\n', 'user_template': '[INST]user\n{user_name}: {message}[/INST]\n', 'response_template': '[INST]assistant\n{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.1, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '[/INST]'], 'max_input_tokens': 512, 'best_of': 8, 'max_output_tokens': 64}
gpu_counts: {'NVIDIA RTX A5000': 1}
is_internal_developer: False
language_model: jic062/dpo-v1.1-Nemo
latencies: [{'batch_size': 1, 'throughput': 0.6957226351184506, 'latency_mean': 1.4372551071643829, 'latency_p50': 1.4278088808059692, 'latency_p90': 1.6067377567291259}, {'batch_size': 3, 'throughput': 1.3327594854314684, 'latency_mean': 2.2491272282600403, 'latency_p50': 2.256813645362854, 'latency_p90': 2.5028048515319825}, {'batch_size': 5, 'throughput': 1.576839605870777, 'latency_mean': 3.1582055735588073, 'latency_p50': 3.1572662591934204, 'latency_p90': 3.566643166542053}, {'batch_size': 6, 'throughput': 1.6248655428241636, 'latency_mean': 3.6706688237190246, 'latency_p50': 3.663160800933838, 'latency_p90': 4.063928246498108}, {'batch_size': 8, 'throughput': 1.6007884987415752, 'latency_mean': 4.961596518754959, 'latency_p50': 5.000230550765991, 'latency_p90': 5.593447828292847}, {'batch_size': 10, 'throughput': 1.5474857159139404, 'latency_mean': 6.420194885730743, 'latency_p50': 6.5284271240234375, 'latency_p90': 7.203416872024536}]
max_input_tokens: 512
max_output_tokens: 64
model_architecture: MistralForCausalLM
model_group: jic062/dpo-v1.1-Nemo
model_name: jic062-dpo-v1-1-nemo_v2
model_num_parameters: 12772070400.0
model_repo: jic062/dpo-v1.1-Nemo
model_size: 13B
num_battles: 10938
num_wins: 5659
ranking_group: single
status: torndown
submission_type: basic
throughput_3p7s: 1.63
timestamp: 2024-09-10T04:34:58+00:00
us_pacific_date: 2024-09-09
win_ratio: 0.5173706344852806
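
The `formatter` templates above can be read as a recipe for assembling the text sent to the model. The sketch below is one plausible way to apply them. The assembly order (memory, prompt, chat turns, response prefix), the `build_prompt` helper, and the `turns` structure are assumptions made for illustration and are not confirmed by this record. Truncation to `max_input_tokens` is not modelled.

```python
# Templates copied from the `formatter` field above.
FORMATTER = {
    "memory_template": "[INST]system\n{memory}[/INST]\n",
    "prompt_template": "[INST]user\n{prompt}[/INST]\n",
    "bot_template": "[INST]assistant\n{bot_name}: {message}[/INST]\n",
    "user_template": "[INST]user\n{user_name}: {message}[/INST]\n",
    "response_template": "[INST]assistant\n{bot_name}:",
}


def build_prompt(memory, prompt, turns, bot_name, user_name):
    """Hypothetical helper: join the templates in an assumed order.

    `turns` is a list of (speaker, message) pairs, where speaker is
    "bot" or "user". The real pipeline may order or truncate differently.
    """
    parts = [
        FORMATTER["memory_template"].format(memory=memory),
        FORMATTER["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == "bot":
            parts.append(
                FORMATTER["bot_template"].format(bot_name=bot_name, message=message)
            )
        else:
            parts.append(
                FORMATTER["user_template"].format(user_name=user_name, message=message)
            )
    # The response template is left open so the model completes the bot's line.
    parts.append(FORMATTER["response_template"].format(bot_name=bot_name))
    return "".join(parts)
```

Because `stopping_words` contains `'\n'` and `'[/INST]'`, generation would end at the first newline or closing tag after the open response prefix.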
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name jic062-dpo-v1-1-nemo-v2-mkmlizer
Waiting for job on jic062-dpo-v1-1-nemo-v2-mkmlizer to finish
Connection pool is full, discarding connection: %s. Connection pool size: %s
jic062-dpo-v1-1-nemo-v2-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ [ASCII-art logo: flywheel] ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ Version: 0.10.1 ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ https://mk1.ai ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ The license key for the current software has been verified as ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ belonging to: ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ Chai Research Corp. ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ║ ║
jic062-dpo-v1-1-nemo-v2-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
jic062-dpo-v1-1-nemo-v2-mkmlizer: Downloaded to shared memory in 28.375s
jic062-dpo-v1-1-nemo-v2-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmptqboug2d, device:0
jic062-dpo-v1-1-nemo-v2-mkmlizer: Saving flywheel model at /dev/shm/model_cache
jic062-dpo-v1-1-nemo-v2-mkmlizer: quantized model in 36.442s
jic062-dpo-v1-1-nemo-v2-mkmlizer: Processed model jic062/dpo-v1.1-Nemo in 64.817s
Retrying (%r) after connection broken by '%r': %s
jic062-dpo-v1-1-nemo-v2-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
jic062-dpo-v1-1-nemo-v2-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/jic062-dpo-v1-1-nemo-v2
jic062-dpo-v1-1-nemo-v2-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/jic062-dpo-v1-1-nemo-v2/config.json
jic062-dpo-v1-1-nemo-v2-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/jic062-dpo-v1-1-nemo-v2/special_tokens_map.json
jic062-dpo-v1-1-nemo-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/jic062-dpo-v1-1-nemo-v2/tokenizer_config.json
jic062-dpo-v1-1-nemo-v2-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/jic062-dpo-v1-1-nemo-v2/tokenizer.json
jic062-dpo-v1-1-nemo-v2-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/jic062-dpo-v1-1-nemo-v2/flywheel_model.0.safetensors
Job jic062-dpo-v1-1-nemo-v2-mkmlizer completed after 93.39s with status: succeeded
Stopping job with name jic062-dpo-v1-1-nemo-v2-mkmlizer
Pipeline stage MKMLizer completed in 94.99s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.08s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service jic062-dpo-v1-1-nemo-v2
Waiting for inference service jic062-dpo-v1-1-nemo-v2 to be ready
Inference service jic062-dpo-v1-1-nemo-v2 ready after 160.67753982543945s
Pipeline stage MKMLDeployer completed in 161.20s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 3.7703936100006104s
Received healthy response to inference request in 2.229368209838867s
Received healthy response to inference request in 2.0086796283721924s
Received healthy response to inference request in 2.208261251449585s
Received healthy response to inference request in 1.6866650581359863s
5 requests
0 failed requests
5th percentile: 1.7510679721832276
10th percentile: 1.8154708862304687
20th percentile: 1.944276714324951
30th percentile: 2.0485959529876707
40th percentile: 2.128428602218628
50th percentile: 2.208261251449585
60th percentile: 2.2167040348052978
70th percentile: 2.2251468181610106
80th percentile: 2.537573289871216
90th percentile: 3.153983449935913
95th percentile: 3.4621885299682615
99th percentile: 3.7087525939941406
mean time: 2.380673551559448
Pipeline stage StressChecker completed in 13.02s
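
The StressChecker percentiles above are consistent with linear interpolation between the sorted response times of the five requests. The sketch below reproduces them from the logged latencies. The `percentile` helper is written for illustration, and whether the pipeline computes its figures this way is an assumption.

```python
# Response times (seconds) copied from the five healthy responses above.
SAMPLES = [
    3.7703936100006104,
    2.229368209838867,
    2.0086796283721924,
    2.208261251449585,
    1.6866650581359863,
]


def percentile(samples, q):
    """Return the q-th percentile (0-100) using linear interpolation."""
    ordered = sorted(samples)
    pos = (len(ordered) - 1) * q / 100.0  # fractional index into the sorted list
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def mean(samples):
    """Arithmetic mean of the samples."""
    return sum(samples) / len(samples)


if __name__ == "__main__":
    for q in (5, 50, 90, 99):
        print(f"{q}th percentile: {percentile(SAMPLES, q)}")
    print(f"mean time: {mean(SAMPLES)}")
```

With only five samples, the upper percentiles are dominated by the single slowest request (about 3.77 s), so they are a coarse health check and not a stable latency estimate.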
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 6.41s
Shutdown handler de-registered
jic062-dpo-v1-1-nemo_v2 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.12s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.09s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service jic062-dpo-v1-1-nemo-v2-profiler
Waiting for inference service jic062-dpo-v1-1-nemo-v2-profiler to be ready
Inference service jic062-dpo-v1-1-nemo-v2-profiler ready after 160.4239935874939s
Pipeline stage MKMLProfilerDeployer completed in 160.79s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/jic062-dpo-v1-1-nemo-v2-profiler-predictor-00001-deploymeng4pv4:/code/chaiverse_profiler_1725943371 --namespace tenant-chaiml-guanaco
kubectl exec -it jic062-dpo-v1-1-nemo-v2-profiler-predictor-00001-deploymeng4pv4 --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1725943371 && python profiles.py profile --best_of_n 8 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 512 --output_tokens 64 --summary /code/chaiverse_profiler_1725943371/summary.json'
kubectl exec -it jic062-dpo-v1-1-nemo-v2-profiler-predictor-00001-deploymeng4pv4 --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1725943371/summary.json'
Pipeline stage MKMLProfilerRunner completed in 946.42s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service jic062-dpo-v1-1-nemo-v2-profiler is running
Tearing down inference service jic062-dpo-v1-1-nemo-v2-profiler
Service jic062-dpo-v1-1-nemo-v2-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 1.78s
Shutdown handler de-registered
jic062-dpo-v1-1-nemo_v2 status is now inactive due to auto deactivation removed underperforming models
jic062-dpo-v1-1-nemo_v2 status is now torndown due to DeploymentManager action
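
To see where the time goes in a run like this one, the `Pipeline stage ... completed in ...s` lines can be extracted with a regular expression. The following is a minimal sketch. The `stage_durations` helper and the `SAMPLE_LOG` excerpt (the five stages of the first pipeline above) are illustrative and not part of the pipeline tooling.

```python
import re

# Matches lines such as "Pipeline stage MKMLizer completed in 94.99s".
STAGE_RE = re.compile(r"^Pipeline stage (\w+) completed in ([0-9.]+)s$")

# Excerpt copied from the log above (first pipeline only).
SAMPLE_LOG = """\
Pipeline stage MKMLizer completed in 94.99s
Pipeline stage MKMLTemplater completed in 0.08s
Pipeline stage MKMLDeployer completed in 161.20s
Pipeline stage StressChecker completed in 13.02s
Pipeline stage TriggerMKMLProfilingPipeline completed in 6.41s
"""


def stage_durations(log_text):
    """Return (stage, seconds) pairs in log order.

    A list is used instead of a dict because a stage name can appear
    more than once in a log (MKMLProfilerDeleter runs twice above).
    """
    pairs = []
    for line in log_text.splitlines():
        match = STAGE_RE.match(line.strip())
        if match:
            pairs.append((match.group(1), float(match.group(2))))
    return pairs


if __name__ == "__main__":
    durations = stage_durations(SAMPLE_LOG)
    for stage, seconds in durations:
        print(f"{stage:32s} {seconds:8.2f}s")
    print(f"{'total':32s} {sum(s for _, s in durations):8.2f}s")
```

In this log, the deployment wait (MKMLDeployer) and the profiling run (MKMLProfilerRunner, 946.42s) account for most of the wall-clock time, while quantization and upload in MKMLizer take under 100 seconds.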