submission_id: rica40325-mistral-13b-2452_v4
developer_uid: immaculate_possum_03470
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
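The formatter above is a set of fill-in templates. As a rough illustration of how the pieces fit together (the `build_prompt` helper and the exact assembly order are assumptions for illustration, not platform code), the persona, scenario, chat history, and response stub might combine like this:

```python
# Sketch of how the formatter templates above might be assembled into a prompt.
# build_prompt and the ordering are illustrative assumptions, not platform code.
formatter = {
    "memory_template": "{bot_name}'s Persona: {memory}\n####\n",
    "prompt_template": "{prompt}\n<START>\n",
    "bot_template": "{bot_name}: {message}\n",
    "user_template": "{user_name}: {message}\n",
    "response_template": "{bot_name}:",
}

def build_prompt(bot_name, user_name, memory, prompt, turns):
    """Assemble persona, scenario, chat history, and the response stub."""
    parts = [
        formatter["memory_template"].format(bot_name=bot_name, memory=memory),
        formatter["prompt_template"].format(prompt=prompt),
    ]
    for speaker, message in turns:
        if speaker == user_name:
            parts.append(formatter["user_template"].format(
                user_name=user_name, message=message))
        else:
            parts.append(formatter["bot_template"].format(
                bot_name=bot_name, message=message))
    # The response template ends right after "BotName:" so the model
    # completes the next bot message from that point.
    parts.append(formatter["response_template"].format(bot_name=bot_name))
    return "".join(parts)
```

Note how `response_template` has no trailing newline: generation starts immediately after the bot-name prefix, and the `'\n'` stopping word then terminates the reply at the end of one line.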
generation_params: {'temperature': 0.95, 'top_p': 1.0, 'min_p': 0.075, 'top_k': 60, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n', '<|eot_id|>'], 'max_input_tokens': 1024, 'best_of': 8, 'max_output_tokens': 64}
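For reference, a minimal sketch of how these sampling knobs are commonly interpreted — temperature scaling, then a `top_k` cut, then a `min_p` cut relative to the most likely token. The `sample` helper below is illustrative, not the serving stack's implementation; `best_of`, stopping words, and token limits are not modeled:

```python
import numpy as np

def sample(logits, temperature=0.95, top_k=60, min_p=0.075, seed=0):
    """Pick one token id: temperature-scale, keep the top_k tokens,
    then drop anything below min_p times the top probability.
    Illustrative sketch only, not the production sampler."""
    logits = np.asarray(logits, dtype=np.float64) / temperature
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    if 0 < top_k < probs.size:             # top_k: keep k most likely tokens
        probs[probs < np.sort(probs)[-top_k]] = 0.0
    probs[probs < min_p * probs.max()] = 0.0  # min_p: relative probability floor
    probs /= probs.sum()                   # renormalize surviving mass
    return int(np.random.default_rng(seed).choice(probs.size, p=probs))
```

With `min_p=0.075`, any token whose probability is under 7.5% of the top token's is excluded, which keeps sampling diverse when the distribution is flat but nearly greedy when one token dominates.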
model_name: nemo_base_1
model_repo: rica40325/mistral-13B-2452
status: torndown
timestamp: 2024-09-27T22:54:24+00:00
Resubmit model
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name rica40325-mistral-13b-2452-v4-mkmlizer
Waiting for job on rica40325-mistral-13b-2452-v4-mkmlizer to finish
rica40325-mistral-13b-2452-v4-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
rica40325-mistral-13b-2452-v4-mkmlizer: ║ _____ __ __ ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ /___/ ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ Version: 0.11.12 ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ https://mk1.ai ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ The license key for the current software has been verified as ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ belonging to: ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ Chai Research Corp. ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
rica40325-mistral-13b-2452-v4-mkmlizer: ║ ║
rica40325-mistral-13b-2452-v4-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
admin requested tearing down of arushimgupta-output_v4
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLDeleter
Checking if service arushimgupta-output-v4 is running
Tearing down inference service arushimgupta-output-v4
Service arushimgupta-output-v4 has been torndown
Pipeline stage MKMLDeleter completed in 2.29s
run pipeline stage %s
Running pipeline stage MKMLModelDeleter
Cleaning model data from S3
Cleaning model data from model cache
Deleting key arushimgupta-output-v4/config.json from bucket guanaco-mkml-models
Deleting key arushimgupta-output-v4/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key arushimgupta-output-v4/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key arushimgupta-output-v4/tokenizer.json from bucket guanaco-mkml-models
Deleting key arushimgupta-output-v4/tokenizer_config.json from bucket guanaco-mkml-models
Pipeline stage MKMLModelDeleter completed in 2.43s
Shutdown handler de-registered
arushimgupta-output_v4 status is now torndown due to DeploymentManager action
rica40325-mistral-13b-2452-v4-mkmlizer: Downloaded to shared memory in 52.176s
rica40325-mistral-13b-2452-v4-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp90i4famf, device:0
rica40325-mistral-13b-2452-v4-mkmlizer: Saving flywheel model at /dev/shm/model_cache
rica40325-mistral-13b-2452-v4-mkmlizer: /opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/loader.py:55: FutureWarning: You are using `torch.load` with `weights_only=False` (the current default value), which uses the default pickle module implicitly. It is possible to construct malicious pickle data which will execute arbitrary code during unpickling (See https://github.com/pytorch/pytorch/blob/main/SECURITY.md#untrusted-models for more details). In a future release, the default value for `weights_only` will be flipped to `True`. This limits the functions that could be executed during unpickling. Arbitrary objects will no longer be allowed to be loaded via this mode unless they are explicitly allowlisted by the user via `torch.serialization.add_safe_globals`. We recommend you start setting `weights_only=True` for any use case where you don't have full control of the loaded file. Please open an issue on GitHub for any issues related to this experimental feature.
rica40325-mistral-13b-2452-v4-mkmlizer: tensors = torch.load(model_shard_filename, map_location=torch.device(self.device), mmap=True)
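The `FutureWarning` above is PyTorch nudging loaders toward `weights_only=True`, which restricts unpickling to tensors and explicitly allowlisted types instead of arbitrary objects. A standalone sketch of the safer call (a toy checkpoint, not the MK1 flywheel loader itself):

```python
import torch

# Save a toy checkpoint, then reload it with the restricted unpickler.
torch.save({"w": torch.zeros(2, 3)}, "/tmp/demo_ckpt.pt")
# weights_only=True rejects arbitrary pickled objects during load,
# closing the arbitrary-code-execution path the warning describes.
state = torch.load("/tmp/demo_ckpt.pt", weights_only=True)
```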
rica40325-mistral-13b-2452-v4-mkmlizer: quantized model in 43.008s
rica40325-mistral-13b-2452-v4-mkmlizer: Processed model rica40325/mistral-13B-2452 in 95.185s
rica40325-mistral-13b-2452-v4-mkmlizer: creating bucket guanaco-mkml-models
rica40325-mistral-13b-2452-v4-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
rica40325-mistral-13b-2452-v4-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/rica40325-mistral-13b-2452-v4
rica40325-mistral-13b-2452-v4-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/rica40325-mistral-13b-2452-v4/config.json
rica40325-mistral-13b-2452-v4-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/rica40325-mistral-13b-2452-v4/special_tokens_map.json
rica40325-mistral-13b-2452-v4-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/rica40325-mistral-13b-2452-v4/tokenizer_config.json
rica40325-mistral-13b-2452-v4-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/rica40325-mistral-13b-2452-v4/tokenizer.json
rica40325-mistral-13b-2452-v4-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/rica40325-mistral-13b-2452-v4/flywheel_model.0.safetensors
rica40325-mistral-13b-2452-v4-mkmlizer: Loading 0: progressed 0/363 → 358/363 tensors in ~9.5s (full progress-bar output elided; throughput varied between ~19 it/s and ~78 it/s)
Job rica40325-mistral-13b-2452-v4-mkmlizer completed after 122.98s with status: succeeded
Stopping job with name rica40325-mistral-13b-2452-v4-mkmlizer
Pipeline stage MKMLizer completed in 123.22s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.07s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service rica40325-mistral-13b-2452-v4
Waiting for inference service rica40325-mistral-13b-2452-v4 to be ready
Inference service rica40325-mistral-13b-2452-v4 ready after 230.57875609397888s
Pipeline stage MKMLDeployer completed in 230.81s
run pipeline stage %s
Running pipeline stage StressChecker
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 3.357297658920288s
Received healthy response to inference request in 2.798203706741333s
Received healthy response to inference request in 2.5139620304107666s
Received healthy response to inference request in 2.7737977504730225s
5 requests
1 failed requests
5th percentile: 2.5659291744232178
10th percentile: 2.617896318435669
20th percentile: 2.7218306064605713
30th percentile: 2.7786789417266844
40th percentile: 2.7884413242340087
50th percentile: 2.798203706741333
60th percentile: 3.021841287612915
70th percentile: 3.245478868484497
80th percentile: 6.714176559448245
90th percentile: 13.427934360504151
95th percentile: 16.784813261032102
99th percentile: 19.470316381454467
mean time: 6.316990661621094
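These summary numbers are consistent with NumPy-style linear-interpolation percentiles over all five request latencies, with the one failed request counted at its elapsed time. That elapsed time is not printed in the log; the value below (~20.14 s, i.e. the 20 s read timeout plus overhead) is inferred from the logged percentiles, not logged itself. A sketch of the computation:

```python
import numpy as np

# The four healthy latencies, as logged above.
healthy = [3.357297658920288, 2.798203706741333,
           2.5139620304107666, 2.7737977504730225]
# Elapsed time of the timed-out request: inferred from the logged
# percentiles (read timeout of 20s plus overhead), NOT printed in the log.
failed_elapsed = 20.141692161560074
latencies = np.array(healthy + [failed_elapsed])

# np.percentile defaults to linear interpolation between order statistics.
for p in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{p}th percentile: {np.percentile(latencies, p)}")
print("mean time:", latencies.mean())
```

The single ~20 s outlier is why the mean (6.32 s) sits far above the median (2.80 s): with only five samples, every percentile above the 70th interpolates toward the outlier.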
%s, retrying in %s seconds...
Received healthy response to inference request in 6.2888548374176025s
Received healthy response to inference request in 2.6935698986053467s
Received healthy response to inference request in 2.234832763671875s
Received healthy response to inference request in 10.511654615402222s
Received healthy response to inference request in 10.70122504234314s
5 requests
0 failed requests
5th percentile: 2.3265801906585692
10th percentile: 2.4183276176452635
20th percentile: 2.6018224716186524
30th percentile: 3.4126268863677978
40th percentile: 4.8507408618927
50th percentile: 6.2888548374176025
60th percentile: 7.97797474861145
70th percentile: 9.667094659805297
80th percentile: 10.549568700790406
90th percentile: 10.625396871566773
95th percentile: 10.663310956954955
99th percentile: 10.693642225265503
mean time: 6.486027431488037
%s, retrying in %s seconds...
Received healthy response to inference request in 2.6574923992156982s
Received healthy response to inference request in 2.695862293243408s
HTTPSConnectionPool(host='guanaco-submitter.chai-research.com', port=443): Read timed out. (read timeout=20)
Received unhealthy response to inference request!
Received healthy response to inference request in 10.288247108459473s
Received healthy response to inference request in 2.0335042476654053s
5 requests
1 failed requests
5th percentile: 2.1583018779754637
10th percentile: 2.2830995082855225
20th percentile: 2.53269476890564
30th percentile: 2.6651663780212402
40th percentile: 2.680514335632324
50th percentile: 2.695862293243408
60th percentile: 5.732816219329833
70th percentile: 8.769770145416258
80th percentile: 12.249652671813967
90th percentile: 16.17246379852295
95th percentile: 18.13386936187744
99th percentile: 19.702993812561036
mean time: 7.554076194763184
clean up pipeline due to error=DeploymentChecksError('Unacceptable number of predict errors: 20.0%')
Shutdown handler de-registered
rica40325-mistral-13b-2452_v4 status is now failed due to DeploymentManager action
rica40325-mistral-13b-2452_v4 status is now torndown due to DeploymentManager action