developer_uid: chace9580
submission_id: jic062-instruct-v20_v1
model_name: jic062-instruct-v20_v1
model_group: jic062/instruct-v20
status: torndown
timestamp: 2024-09-16T16:58:43+00:00
num_battles: 10758
num_wins: 5218
celo_rating: 1243.1
family_friendly_score: 0.0
submission_type: basic
model_repo: jic062/instruct-v20
model_architecture: LlamaForCausalLM
model_num_parameters: 8030261248.0
best_of: 16
max_input_tokens: 512
max_output_tokens: 64
latencies: [{'batch_size': 1, 'throughput': 0.9119905675786356, 'latency_mean': 1.0964436316490174, 'latency_p50': 1.0895639657974243, 'latency_p90': 1.223279881477356}, {'batch_size': 3, 'throughput': 1.6053881126361624, 'latency_mean': 1.860797700881958, 'latency_p50': 1.8685606718063354, 'latency_p90': 2.0903851985931396}, {'batch_size': 5, 'throughput': 1.7671985675218276, 'latency_mean': 2.8151498448848726, 'latency_p50': 2.8458683490753174, 'latency_p90': 3.181899333000183}, {'batch_size': 6, 'throughput': 1.7739686674198452, 'latency_mean': 3.3571198558807374, 'latency_p50': 3.362440347671509, 'latency_p90': 3.7657623052597047}, {'batch_size': 8, 'throughput': 1.7580765749362806, 'latency_mean': 4.5208222258090975, 'latency_p50': 4.573615789413452, 'latency_p90': 5.092871499061585}, {'batch_size': 10, 'throughput': 1.7542710890224766, 'latency_mean': 5.651948989629745, 'latency_p50': 5.604528069496155, 'latency_p90': 6.625890064239502}]
gpu_counts: {'NVIDIA RTX A5000': 1}
display_name: jic062-instruct-v20_v1
is_internal_developer: False
language_model: jic062/instruct-v20
model_size: 8B
ranking_group: single
throughput_3p7s: 1.78
us_pacific_date: 2024-09-16
win_ratio: 0.4850343930098531
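The win_ratio and throughput_3p7s fields are derived values. A minimal sketch of how they can be reproduced from num_wins/num_battles and the latencies table is below; the linear-interpolation method for throughput_3p7s is an assumption, since the pipeline's exact aggregation is not shown in this log.

```python
# Sketch only: re-derive the summary metrics above from the raw fields.
num_battles, num_wins = 10758, 5218
print(num_wins / num_battles)  # 0.4850343930098531, matching win_ratio

latency_table = [  # (latency_mean, throughput) pairs from the latencies field
    (1.096, 0.912), (1.861, 1.605), (2.815, 1.767),
    (3.357, 1.774), (4.521, 1.758), (5.652, 1.754),
]

def throughput_at(target, pts):
    """Linear interpolation of throughput at a target mean latency (seconds)."""
    for (x0, y0), (x1, y1) in zip(pts, pts[1:]):
        if x0 <= target <= x1:
            return y0 + (y1 - y0) * (target - x0) / (x1 - x0)
    return pts[-1][1]

# Assumed derivation of throughput_3p7s: interpolate at a 3.7 s mean latency.
print(round(throughput_at(3.7, latency_table), 2))  # ~1.77, close to the reported 1.78
```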
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 80, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '<|end_of_text|>', '|eot_id|'], 'max_input_tokens': 512, 'best_of': 16, 'max_output_tokens': 64}
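generation_params drives sampling: temperature 1.0 with top_k 80, 16 candidates per request (best_of), and generation halted at any of the stopping_words or after 64 output tokens. A minimal sketch of the stop-word truncation step, with an illustrative word-based cap standing in for the real token limit (the helper name is hypothetical):

```python
STOPPING_WORDS = ["\n", "<|end_of_text|>", "|eot_id|"]
MAX_OUTPUT_TOKENS = 64  # real cap is applied in tokens; words are used here only to illustrate

def truncate_completion(text: str) -> str:
    """Cut the completion at the earliest stopping word, then cap its length."""
    cut = len(text)
    for stop in STOPPING_WORDS:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    words = text[:cut].split()
    return " ".join(words[:MAX_OUTPUT_TOKENS])

print(truncate_completion("Sure, let's go!\nIgnored trailing text"))  # -> "Sure, let's go!"
```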
formatter: {'memory_template': "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{bot_name}'s Persona: {memory}\n\n", 'prompt_template': '{prompt}<|eot_id|>', 'bot_template': '<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}: {message}<|eot_id|>', 'user_template': '<|start_header_id|>user<|end_header_id|>\n\n{user_name}: {message}<|eot_id|>', 'response_template': '<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}:', 'truncate_by_message': False}
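The formatter templates describe how the Llama-3-style prompt is assembled from the persona memory, scenario prompt, and chat history. A minimal sketch of that assembly, using the templates verbatim from the record above (function and example names are illustrative):

```python
MEMORY_TEMPLATE = "<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\n{bot_name}'s Persona: {memory}\n\n"
PROMPT_TEMPLATE = "{prompt}<|eot_id|>"
USER_TEMPLATE = "<|start_header_id|>user<|end_header_id|>\n\n{user_name}: {message}<|eot_id|>"
BOT_TEMPLATE = "<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}: {message}<|eot_id|>"
RESPONSE_TEMPLATE = "<|start_header_id|>assistant<|end_header_id|>\n\n{bot_name}:"

def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Assemble the model input: system memory, scenario prompt, chat turns, response header."""
    parts = [MEMORY_TEMPLATE.format(bot_name=bot_name, memory=memory),
             PROMPT_TEMPLATE.format(prompt=prompt)]
    for speaker, message in turns:  # turns: list of ("user" | "bot", text)
        if speaker == "user":
            parts.append(USER_TEMPLATE.format(user_name=user_name, message=message))
        else:
            parts.append(BOT_TEMPLATE.format(bot_name=bot_name, message=message))
    parts.append(RESPONSE_TEMPLATE.format(bot_name=bot_name))
    return "".join(parts)

print(build_prompt("Aria", "User", "A curious android.", "A quiet cafe at dusk.",
                   [("user", "Hello!"), ("bot", "Hi there.")]))
```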
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name jic062-instruct-v20-v1-mkmlizer
Waiting for job on jic062-instruct-v20-v1-mkmlizer to finish
jic062-instruct-v20-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
jic062-instruct-v20-v1-mkmlizer: ║ _____ __ __ ║
jic062-instruct-v20-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
jic062-instruct-v20-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
jic062-instruct-v20-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
jic062-instruct-v20-v1-mkmlizer: ║ /___/ ║
jic062-instruct-v20-v1-mkmlizer: ║ ║
jic062-instruct-v20-v1-mkmlizer: ║ Version: 0.10.1 ║
jic062-instruct-v20-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
jic062-instruct-v20-v1-mkmlizer: ║ https://mk1.ai ║
jic062-instruct-v20-v1-mkmlizer: ║ ║
jic062-instruct-v20-v1-mkmlizer: ║ The license key for the current software has been verified as ║
jic062-instruct-v20-v1-mkmlizer: ║ belonging to: ║
jic062-instruct-v20-v1-mkmlizer: ║ ║
jic062-instruct-v20-v1-mkmlizer: ║ Chai Research Corp. ║
jic062-instruct-v20-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
jic062-instruct-v20-v1-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
jic062-instruct-v20-v1-mkmlizer: ║ ║
jic062-instruct-v20-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
jic062-instruct-v20-v1-mkmlizer: Downloaded to shared memory in 36.984s
jic062-instruct-v20-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp283oenp7, device:0
jic062-instruct-v20-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
jic062-instruct-v20-v1-mkmlizer: quantized model in 26.544s
jic062-instruct-v20-v1-mkmlizer: Processed model jic062/instruct-v20 in 63.528s
jic062-instruct-v20-v1-mkmlizer: creating bucket guanaco-mkml-models
jic062-instruct-v20-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
jic062-instruct-v20-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/jic062-instruct-v20-v1
jic062-instruct-v20-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/jic062-instruct-v20-v1/config.json
jic062-instruct-v20-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/jic062-instruct-v20-v1/special_tokens_map.json
jic062-instruct-v20-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/jic062-instruct-v20-v1/tokenizer_config.json
jic062-instruct-v20-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/jic062-instruct-v20-v1/tokenizer.json
jic062-instruct-v20-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/jic062-instruct-v20-v1/flywheel_model.0.safetensors
jic062-instruct-v20-v1-mkmlizer: Loading 0: 0%| | 0/291 [00:00<?, ?it/s] ... Loading 0: 100%|██████████| 291/291 [00:11<00:00, 3.00it/s]
Job jic062-instruct-v20-v1-mkmlizer completed after 84.82s with status: succeeded
Stopping job with name jic062-instruct-v20-v1-mkmlizer
Pipeline stage MKMLizer completed in 85.81s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service jic062-instruct-v20-v1
Waiting for inference service jic062-instruct-v20-v1 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service jic062-instruct-v20-v1 ready after 171.16095614433289s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Pipeline stage MKMLDeployer completed in 179.74s
run pipeline stage %s
Running pipeline stage StressChecker
Connection pool is full, discarding connection: %s. Connection pool size: %s
Failed to get response for submission blend_hokok_2024-09-09: ('http://neversleep-noromaid-v0-8068-v150-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Received healthy response to inference request in 2.3016409873962402s
Received healthy response to inference request in 1.7159085273742676s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 7.695668935775757s
Received healthy response to inference request in 3.0057573318481445s
Received healthy response to inference request in 1.5435898303985596s
5 requests
0 failed requests
5th percentile: 1.5780535697937013
10th percentile: 1.6125173091888427
20th percentile: 1.6814447879791259
30th percentile: 1.833055019378662
40th percentile: 2.067348003387451
50th percentile: 2.3016409873962402
60th percentile: 2.5832875251770018
70th percentile: 2.8649340629577633
80th percentile: 3.943739652633668
90th percentile: 5.819704294204712
95th percentile: 6.7576866149902335
99th percentile: 7.508072471618652
mean time: 3.252513122558594
Pipeline stage StressChecker completed in 23.86s
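The stress-check statistics above follow from simple linear-interpolation percentiles over the five healthy response times; a quick numpy check (its default percentile method) reproduces the reported values:

```python
import numpy as np

# The five healthy response times (seconds) reported by the stress checker above.
times = [2.3016409873962402, 1.7159085273742676, 7.695668935775757,
         3.0057573318481445, 1.5435898303985596]

for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{p}th percentile: {np.percentile(times, p)}")  # matches the figures logged above
print("mean time:", np.mean(times))  # ≈ 3.2525 s, matching the reported mean time
```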
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 9.70s
Shutdown handler de-registered
jic062-instruct-v20_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.17s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.14s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service jic062-instruct-v20-v1-profiler
Waiting for inference service jic062-instruct-v20-v1-profiler to be ready
Inference service jic062-instruct-v20-v1-profiler ready after 170.40751123428345s
Pipeline stage MKMLProfilerDeployer completed in 170.81s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/jic062-instruct-v20-v1-profiler-predictor-00001-deploymentlprrv:/code/chaiverse_profiler_1726506443 --namespace tenant-chaiml-guanaco
kubectl exec -it jic062-instruct-v20-v1-profiler-predictor-00001-deploymentlprrv --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1726506443 && python profiles.py profile --best_of_n 16 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 512 --output_tokens 64 --summary /code/chaiverse_profiler_1726506443/summary.json'
kubectl exec -it jic062-instruct-v20-v1-profiler-predictor-00001-deploymentlprrv --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1726506443/summary.json'
Pipeline stage MKMLProfilerRunner completed in 810.53s
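The profiler writes its results to summary.json inside the pod and reads it back with the kubectl exec shown above. A minimal sketch of fetching and inspecting that file from outside the pod, assuming it holds per-batch-size entries like the latencies field near the top of this record (the schema is an assumption):

```python
import json
import subprocess

POD = "jic062-instruct-v20-v1-profiler-predictor-00001-deploymentlprrv"  # pod name from the log above
PATH = "/code/chaiverse_profiler_1726506443/summary.json"

# Read the summary file out of the profiler pod.
raw = subprocess.check_output(
    ["kubectl", "exec", POD, "--namespace", "tenant-chaiml-guanaco", "--", "cat", PATH]
)
summary = json.loads(raw)

# Assumed schema: a list of per-batch-size dicts like the `latencies` field above.
for entry in summary:
    print(entry.get("batch_size"), entry.get("throughput"), entry.get("latency_p90"))
```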
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service jic062-instruct-v20-v1-profiler is running
Tearing down inference service jic062-instruct-v20-v1-profiler
Service jic062-instruct-v20-v1-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 3.49s
Shutdown handler de-registered
jic062-instruct-v20_v1 status is now inactive due to auto deactivation (removal of underperforming models)
jic062-instruct-v20_v1 status is now torndown due to DeploymentManager action