submission_id: nousresearch-meta-llama_4939_v82
developer_uid: end_to_end_test
formatter: {'memory_template': "{bot_name}'s Persona: {memory}\n####\n", 'prompt_template': '{prompt}\n<START>\n', 'bot_template': '{bot_name}: {message}\n', 'user_template': '{user_name}: {message}\n', 'response_template': '{bot_name}:', 'truncate_by_message': False}
generation_params: {'temperature': 1.0, 'top_p': 0.99, 'min_p': 0.1, 'top_k': 40, 'presence_penalty': 0.0, 'frequency_penalty': 0.0, 'stopping_words': ['\n'], 'max_input_tokens': 512, 'best_of': 4, 'max_output_tokens': 64}
model_name: nousresearch-meta-llama_4939_v82
model_repo: NousResearch/Meta-Llama-3.1-8B-Instruct
status: inactive
timestamp: 2024-10-21T14:37:32+00:00
nousresearch-meta-llama_4939_v81 status is now torndown due to DeploymentManager action
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name nousresearch-meta-llama-4939-v82-mkmlizer
Waiting for job on nousresearch-meta-llama-4939-v82-mkmlizer to finish
nousresearch-meta-llama-4939-v82-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
nousresearch-meta-llama-4939-v82-mkmlizer: ║ ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ Version: 0.11.12 ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ https://mk1.ai ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ The license key for the current software has been verified as ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ belonging to: ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ Chai Research Corp. ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ Expiration: 2025-01-15 23:59:59 ║
nousresearch-meta-llama-4939-v82-mkmlizer: ║ ║
nousresearch-meta-llama-4939-v82-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
nousresearch-meta-llama-4939-v82-mkmlizer: Downloaded to shared memory in 35.719s
nousresearch-meta-llama-4939-v82-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmp3q6hicoc, device:0
nousresearch-meta-llama-4939-v82-mkmlizer: Saving flywheel model at /dev/shm/model_cache
nousresearch-meta-llama-4939-v82-mkmlizer: quantized model in 25.945s
nousresearch-meta-llama-4939-v82-mkmlizer: Processed model NousResearch/Meta-Llama-3.1-8B-Instruct in 61.665s
nousresearch-meta-llama-4939-v82-mkmlizer: creating bucket guanaco-mkml-models
nousresearch-meta-llama-4939-v82-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
nousresearch-meta-llama-4939-v82-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v82
nousresearch-meta-llama-4939-v82-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v82/special_tokens_map.json
nousresearch-meta-llama-4939-v82-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v82/config.json
nousresearch-meta-llama-4939-v82-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v82/tokenizer_config.json
nousresearch-meta-llama-4939-v82-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v82/tokenizer.json
nousresearch-meta-llama-4939-v82-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/nousresearch-meta-llama-4939-v82/flywheel_model.0.safetensors
nousresearch-meta-llama-4939-v82-mkmlizer: Loading 0: 99%|█████████▊| 287/291 [00:11<00:01, 3.26it/s]
Job nousresearch-meta-llama-4939-v82-mkmlizer completed after 88.53s with status: succeeded
Stopping job with name nousresearch-meta-llama-4939-v82-mkmlizer
Pipeline stage MKMLizer completed in 89.84s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.38s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service nousresearch-meta-llama-4939-v82
Waiting for inference service nousresearch-meta-llama-4939-v82 to be ready
Inference service nousresearch-meta-llama-4939-v82 ready after 151.8659746646881s
Pipeline stage MKMLDeployer completed in 153.00s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 2.680114507675171s
Received healthy response to inference request in 1.54856276512146s
Received healthy response to inference request in 1.6701829433441162s
Received healthy response to inference request in 1.5300850868225098s
Received healthy response to inference request in 1.4138238430023193s
5 requests
0 failed requests
5th percentile: 1.4370760917663574
10th percentile: 1.4603283405303955
20th percentile: 1.5068328380584717
30th percentile: 1.5337806224822998
40th percentile: 1.5411716938018798
50th percentile: 1.54856276512146
60th percentile: 1.5972108364105224
70th percentile: 1.6458589076995849
80th percentile: 1.8721692562103274
90th percentile: 2.276141881942749
95th percentile: 2.47812819480896
99th percentile: 2.6397172451019286
mean time: 1.7685538291931153
Pipeline stage StressChecker completed in 12.02s
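The StressChecker percentiles above appear consistent with linear interpolation between the sorted response times (the same rule NumPy uses by default). The sketch below reproduces them from the five logged latencies. It is an illustration under that assumption, not the pipeline's actual code; the `percentile` helper and the `latencies` name are hypothetical.

```python
def percentile(values, q):
    """Return the q-th percentile using linear interpolation between ranks.

    Assumption: the stress checker interpolates linearly between sorted
    samples; this helper is illustrative, not taken from the pipeline.
    """
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0       # fractional rank in [0, n-1]
    lower = int(pos)                           # index of the sample below
    upper = min(lower + 1, len(ordered) - 1)   # index of the sample above
    frac = pos - lower                         # weight given to the upper sample
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# Response times (seconds) copied from the five healthy responses in the log.
latencies = [
    2.680114507675171,
    1.54856276512146,
    1.6701829433441162,
    1.5300850868225098,
    1.4138238430023193,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")

mean_time = sum(latencies) / len(latencies)
print(f"mean time: {mean_time}")
```

With only five samples, the upper percentiles are dominated by the single 2.68 s response, so the 95th and 99th values say little about tail latency under real load.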
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyTriggerPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage OfflineFamilyFriendlyTriggerPipeline completed in 3.15s
Shutdown handler de-registered
nousresearch-meta-llama_4939_v82 status is now deployed due to DeploymentManager action
nousresearch-meta-llama_4939_v82 status is now inactive due to admin request
admin requested tearing down of nousresearch-meta-llama_4939_v82
Shutdown handler not registered because Python interpreter is not running in the main thread
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
clean up pipeline due to error=DeploymentChecksError("('http://nousresearch-meta-llama-4939-v82-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')")
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
clean up pipeline due to error=DeploymentChecksError("('http://nousresearch-meta-llama-4939-v82-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')")
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
clean up pipeline due to error=DeploymentChecksError("('http://nousresearch-meta-llama-4939-v82-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')")
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
clean up pipeline due to error=DeploymentChecksError("('http://nousresearch-meta-llama-4939-v82-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')")
Shutdown handler de-registered
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage OfflineFamilyFriendlyScorer
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
%s, retrying in %s seconds...
Evaluating %s Family Friendly Score with %s threads
clean up pipeline due to error=DeploymentChecksError("('http://nousresearch-meta-llama-4939-v82-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')")
Shutdown handler de-registered