Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name mistralai-mistral-nemo-9330-v84-mkmlizer
Waiting for job on mistralai-mistral-nemo-9330-v84-mkmlizer to finish
Failed to get response for submission blend_hokok_2024-09-09: ('http://neversleep-noromaid-v0-8068-v150-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
mistralai-mistral-nemo-9330-v84-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ Version: 0.10.1 ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ https://mk1.ai ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ The license key for the current software has been verified as ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ belonging to: ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ Chai Research Corp. ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ║ ║
mistralai-mistral-nemo-9330-v84-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Failed to get response for submission blend_hokok_2024-09-09: ('http://neversleep-noromaid-v0-8068-v150-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
Connection pool is full, discarding connection: %s. Connection pool size: %s
mistralai-mistral-nemo-9330-v84-mkmlizer: Downloaded to shared memory in 56.432s
mistralai-mistral-nemo-9330-v84-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpu4jpqwqm, device:0
mistralai-mistral-nemo-9330-v84-mkmlizer: Saving flywheel model at /dev/shm/model_cache
Failed to get response for submission blend_hokok_2024-09-09: ('http://neversleep-noromaid-v0-8068-v150-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '')
mistralai-mistral-nemo-9330-v84-mkmlizer: quantized model in 36.406s
mistralai-mistral-nemo-9330-v84-mkmlizer: Processed model mistralai/Mistral-Nemo-Instruct-2407 in 92.838s
mistralai-mistral-nemo-9330-v84-mkmlizer: creating bucket guanaco-mkml-models
mistralai-mistral-nemo-9330-v84-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
mistralai-mistral-nemo-9330-v84-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/mistralai-mistral-nemo-9330-v84
mistralai-mistral-nemo-9330-v84-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/mistralai-mistral-nemo-9330-v84/config.json
mistralai-mistral-nemo-9330-v84-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/mistralai-mistral-nemo-9330-v84/special_tokens_map.json
mistralai-mistral-nemo-9330-v84-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/mistralai-mistral-nemo-9330-v84/tokenizer_config.json
mistralai-mistral-nemo-9330-v84-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/mistralai-mistral-nemo-9330-v84/tokenizer.json
mistralai-mistral-nemo-9330-v84-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/mistralai-mistral-nemo-9330-v84/flywheel_model.0.safetensors
mistralai-mistral-nemo-9330-v84-mkmlizer:
Loading 0: 0%| | 0/363 [00:00<?, ?it/s]
Loading 0: 83%|████████▎ | 302/363 [00:07<00:01, 45.66it/s]
Loading 0: 85%|████████▍ | 307/363 [00:14<00:23, 2.40it/s]
Loading 0: 100%|█████████▉| 362/363 [00:15<00:00, 29.35it/s]
Job mistralai-mistral-nemo-9330-v84-mkmlizer completed after 116.06s with status: succeeded
Stopping job with name mistralai-mistral-nemo-9330-v84-mkmlizer
Pipeline stage MKMLizer completed in 116.76s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.08s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service mistralai-mistral-nemo-9330-v84
Waiting for inference service mistralai-mistral-nemo-9330-v84 to be ready
Failed to get response for submission blend_pikis_2024-09-14: ('http://sao10k-mn-12b-lyra-v4a1-v3-predictor.tenant-chaiml-guanaco.k.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', '{"error":"ValueError : [TypeError(\\"\'numpy.int64\' object is not iterable\\"), TypeError(\'vars() argument must have __dict__ attribute\')]"}')
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Inference service mistralai-mistral-nemo-9330-v84 ready after 191.17855072021484s
Pipeline stage MKMLDeployer completed in 191.70s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.8857872486114502s
Received healthy response to inference request in 1.480635643005371s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Connection pool is full, discarding connection: %s. Connection pool size: %s
Received healthy response to inference request in 1.0397188663482666s
Received healthy response to inference request in 1.1034190654754639s
Received healthy response to inference request in 2.2121059894561768s
5 requests
0 failed requests
5th percentile: 1.052458906173706
10th percentile: 1.0651989459991456
20th percentile: 1.0906790256500245
30th percentile: 1.1788623809814454
40th percentile: 1.3297490119934081
50th percentile: 1.480635643005371
60th percentile: 1.6426962852478026
70th percentile: 1.8047569274902342
80th percentile: 1.9510509967803955
90th percentile: 2.081578493118286
95th percentile: 2.1468422412872314
99th percentile: 2.199053239822388
mean time: 1.5443333625793456
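
The percentile summary above can be reproduced from the five "healthy response" timings logged by the StressChecker stage. The log does not say how the stage computes these figures; the sketch below assumes linear interpolation between order statistics (the default behaviour of `numpy.percentile`), which happens to match the logged values. The helper name `percentile` and the `latencies` list are illustrative, not taken from the pipeline's code.

```python
from statistics import mean


def percentile(samples, q):
    """Return the q-th percentile (0 <= q <= 100) using linear interpolation.

    Assumption: this mirrors numpy.percentile's default method; the
    pipeline's actual implementation is not shown in the log.
    """
    ordered = sorted(samples)
    rank = (len(ordered) - 1) * q / 100.0   # fractional index into the sorted list
    lower = int(rank)                        # index of the sample at or below the rank
    upper = min(lower + 1, len(ordered) - 1) # neighbouring sample, clamped at the end
    frac = rank - lower                      # distance between the two neighbours
    return ordered[lower] + (ordered[upper] - ordered[lower]) * frac


# The five response times (seconds) reported by the StressChecker stage.
latencies = [
    1.8857872486114502,
    1.480635643005371,
    1.0397188663482666,
    1.1034190654754639,
    2.2121059894561768,
]

for q in (5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 95, 99):
    print(f"{q}th percentile: {percentile(latencies, q)}")
print(f"mean time: {mean(latencies)}")
```

With only five samples, every percentile above the 75th is interpolated between the two slowest requests, so the tail figures say little about real tail latency.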
Pipeline stage StressChecker completed in 8.67s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 5.91s
Shutdown handler de-registered
mistralai-mistral-nemo-_9330_v84 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.11s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service mistralai-mistral-nemo-9330-v84-profiler
Waiting for inference service mistralai-mistral-nemo-9330-v84-profiler to be ready
Inference service mistralai-mistral-nemo-9330-v84-profiler ready after 190.6437954902649s
Pipeline stage MKMLProfilerDeployer completed in 191.07s
run pipeline stage %s
Running pipeline stage MKMLProfilerRunner
kubectl cp /code/guanaco/guanaco_inference_services/src/inference_scripts tenant-chaiml-guanaco/mistralai-mistral-ne119b6d78ebf15c75060a4e92c2cc11f8-deplo5fwzs:/code/chaiverse_profiler_1726646619 --namespace tenant-chaiml-guanaco
kubectl exec -it mistralai-mistral-ne119b6d78ebf15c75060a4e92c2cc11f8-deplo5fwzs --namespace tenant-chaiml-guanaco -- sh -c 'cd /code/chaiverse_profiler_1726646619 && python profiles.py profile --best_of_n 4 --auto_batch 5 --batches 1,5,10,15,20,25,30,35,40,45,50,55,60,65,70,75,80,85,90,95,100,105,110,115,120,125,130,135,140,145,150,155,160,165,170,175,180,185,190,195 --samples 200 --input_tokens 1024 --output_tokens 64 --summary /code/chaiverse_profiler_1726646619/summary.json'
kubectl exec -it mistralai-mistral-ne119b6d78ebf15c75060a4e92c2cc11f8-deplo5fwzs --namespace tenant-chaiml-guanaco -- bash -c 'cat /code/chaiverse_profiler_1726646619/summary.json'
Pipeline stage MKMLProfilerRunner completed in 960.69s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Checking if service mistralai-mistral-nemo-9330-v84-profiler is running
Tearing down inference service mistralai-mistral-nemo-9330-v84-profiler
Service mistralai-mistral-nemo-9330-v84-profiler has been torndown
Pipeline stage MKMLProfilerDeleter completed in 2.46s
Shutdown handler de-registered
mistralai-mistral-nemo-_9330_v84 status is now inactive due to auto deactivation removed underperforming models
mistralai-mistral-nemo-_9330_v84 status is now torndown due to DeploymentManager action