submission_id: princeton-nlp-gemma-2-9b_1853_v1
developer_uid: Meliodia
formatter: {'memory_template': '<|im_start|>system\n{memory}<|im_end|>\n', 'prompt_template': '<|im_start|>user\n{prompt}<|im_end|>\n', 'bot_template': '<|im_start|>assistant\n{bot_name}: {message}<|im_end|>\n', 'user_template': '<|im_start|>user\n{user_name}: {message}<|im_end|>\n', 'response_template': '<|im_start|>assistant\n{bot_name}:', 'truncate_by_message': True}
generation_params: {'temperature': 1.0, 'top_p': 1.0, 'min_p': 0.0, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 1024, 'best_of': 1, 'max_output_tokens': 64}
model_name: princeton-nlp-gemma-2-9b_1853_v1
model_repo: princeton-nlp/gemma-2-9b-it-SimPO
status: torndown
timestamp: 2024-09-10T21:39:02+00:00
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name princeton-nlp-gemma-2-9b-1853-v1-mkmlizer
Waiting for job on princeton-nlp-gemma-2-9b-1853-v1-mkmlizer to finish
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ [ASCII-art logo: flywheel] ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ Version: 0.10.1 ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ https://mk1.ai ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ The license key for the current software has been verified as ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ belonging to: ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ Chai Research Corp. ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ║ ║
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: Downloaded to shared memory in 31.389s
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpqyfqq3qc, device:0
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: quantized model in 33.221s
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: Processed model princeton-nlp/gemma-2-9b-it-SimPO in 64.611s
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: creating bucket guanaco-mkml-models
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/princeton-nlp-gemma-2-9b-1853-v1
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/princeton-nlp-gemma-2-9b-1853-v1/tokenizer_config.json
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/princeton-nlp-gemma-2-9b-1853-v1/special_tokens_map.json
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/princeton-nlp-gemma-2-9b-1853-v1/config.json
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.model s3://guanaco-mkml-models/princeton-nlp-gemma-2-9b-1853-v1/tokenizer.model
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/princeton-nlp-gemma-2-9b-1853-v1/tokenizer.json
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/princeton-nlp-gemma-2-9b-1853-v1/flywheel_model.0.safetensors
princeton-nlp-gemma-2-9b-1853-v1-mkmlizer: Loading 0: 100%|█████████▉| 463/464 [00:05<00:00, 86.23it/s]
Job princeton-nlp-gemma-2-9b-1853-v1-mkmlizer completed after 85.86s with status: succeeded
Stopping job with name princeton-nlp-gemma-2-9b-1853-v1-mkmlizer
Pipeline stage MKMLizer completed in 87.02s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.09s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service princeton-nlp-gemma-2-9b-1853-v1
Waiting for inference service princeton-nlp-gemma-2-9b-1853-v1 to be ready
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service princeton-nlp-gemma-2-9b-1853-v1 ready after 151.30811285972595s
Pipeline stage MKMLDeployer completed in 151.81s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.082931756973267
10th percentile: 20.084768533706665
20th percentile: 20.088442087173462
30th percentile: 20.100537919998168
40th percentile: 20.121056032180785
50th percentile: 20.141574144363403
60th percentile: 20.150653409957886
70th percentile: 20.15973267555237
80th percentile: 20.209852123260497
90th percentile: 20.301011753082275
95th percentile: 20.346591567993165
99th percentile: 20.383055419921874
mean time: 20.173878335952757
%s, retrying in %s seconds...
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.288898134231566
10th percentile: 20.3505211353302
20th percentile: 20.473767137527467
30th percentile: 20.551531505584716
40th percentile: 20.583814239501955
50th percentile: 20.61609697341919
60th percentile: 20.623627376556396
70th percentile: 20.631157779693602
80th percentile: 20.66568202972412
90th percentile: 20.72720012664795
95th percentile: 20.757959175109864
99th percentile: 20.782566413879394
mean time: 20.560480690002443
%s, retrying in %s seconds...
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
5 requests
5 failed requests
5th percentile: 20.105470848083495
10th percentile: 20.112556600570677
20th percentile: 20.126728105545045
30th percentile: 20.134281158447266
40th percentile: 20.135215759277344
50th percentile: 20.136150360107422
60th percentile: 20.14346570968628
70th percentile: 20.150781059265135
80th percentile: 20.1573016166687
90th percentile: 20.163027381896974
95th percentile: 20.16589026451111
99th percentile: 20.168180570602416
mean time: 20.138308238983154
clean up pipeline due to error=%s
Shutdown handler de-registered
princeton-nlp-gemma-2-9b_1853_v1 status is now failed due to DeploymentManager action
princeton-nlp-gemma-2-9b_1853_v1 status is now torndown due to DeploymentManager action