Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-cai-synth-v4-cosine-v1-mkmlizer
Waiting for job on chaiml-cai-synth-v4-cosine-v1-mkmlizer to finish
chaiml-cai-synth-v4-cosine-v1-mkmlizer: bash: cannot set terminal process group (-1): Inappropriate ioctl for device
chaiml-cai-synth-v4-cosine-v1-mkmlizer: bash: no job control in this shell
chaiml-cai-synth-v4-cosine-v1-mkmlizer: /root/miniconda3/envs/nvidia/lib/python3.11/site-packages/mk1/__init__.py:1: UserWarning: pkg_resources is deprecated as an API. See https://setuptools.pypa.io/en/latest/pkg_resources.html. The pkg_resources package is slated for removal as early as 2025-11-30. Refrain from using this package or pin to Setuptools<81.
chaiml-cai-synth-v4-cosine-v1-mkmlizer: __import__('pkg_resources').declare_namespace(__name__)
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ██████ ██████ █████ ████ ████ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ░░██████ ██████ ░░███ ███░ ░░███ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ░███░█████░███ ░███ ███ ░███ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ░███░░███ ░███ ░███████ ░███ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ░███ ░░░ ░███ ░███░░███ ░███ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ░███ ░███ ░███ ░░███ ░███ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ █████ █████ █████ ░░████ █████ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ░░░░░ ░░░░░ ░░░░░ ░░░░ ░░░░░ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ Version: 0.30.6+torch280 ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ Features: FLYWHEEL, CUDA ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ Copyright 2023-2025 MK ONE TECHNOLOGIES Inc. ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ https://mk1.ai ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ belonging to: ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ Chai Research Corp. ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ Expiration: 2028-03-31 23:59:59 ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ║ ║
chaiml-cai-synth-v4-cosine-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
chaiml-cai-synth-v4-cosine-v1-mkmlizer: Downloaded to shared memory in 97.854s
chaiml-cai-synth-v4-cosine-v1-mkmlizer: Checking if ChaiML/cai-synth-v4_cosine already exists in ChaiML
chaiml-cai-synth-v4-cosine-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpd9hswqkq, device:0
chaiml-cai-synth-v4-cosine-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-cai-synth-v4-cosine-v1-mkmlizer:
Loading 0: 0%| | 0.00/363 [00:00<?, ?it/s]
Loading 0: 9%|▉ | 32.0/363 [00:01<00:10, 31.1it/s]
Loading 0: 16%|█▌ | 57.0/363 [00:02<00:11, 26.8it/s]
Loading 0: 22%|██▏ | 80.0/363 [00:03<00:11, 24.6it/s]
Loading 0: 29%|██▉ | 107/363 [00:04<00:10, 24.9it/s]
Loading 0: 37%|███▋ | 136/363 [00:05<00:08, 26.2it/s]
Loading 0: 44%|████▍ | 161/363 [00:06<00:07, 25.7it/s]
Loading 0: 52%|█████▏ | 188/363 [00:07<00:06, 25.8it/s]
Loading 0: 52%|█████▏ | 188/363 [00:20<00:06, 25.8it/s]
Loading 0: 55%|█████▌ | 201/363 [00:20<00:36, 4.43it/s]
Loading 0: 62%|██████▏ | 224/363 [00:21<00:23, 5.90it/s]
Loading 0: 70%|███████ | 255/363 [00:22<00:12, 8.53it/s]
Loading 0: 77%|███████▋ | 280/363 [00:24<00:07, 10.6it/s]
Loading 0: 84%|████████▍ | 306/363 [00:25<00:04, 12.9it/s]
Loading 0: 93%|█████████▎| 336/363 [00:26<00:01, 15.6it/s]
Loading 0: 98%|█████████▊| 357/363 [00:27<00:00, 16.4it/s]
Loading 0: 100%|██████████| 363/363 [00:27<00:00, 17.3it/s]
Loading 0: 100%|██████████| 363/363 [00:27<00:00, 13.2it/s]
chaiml-cai-synth-v4-cosine-v1-mkmlizer: The tokenizer you are loading from '/tmp/tmpd9hswqkq' with an incorrect regex pattern: https://huggingface.co/mistralai/Mistral-Small-3.1-24B-Instruct-2503/discussions/84#69121093e8b480e709447d5e. This will lead to incorrect tokenization. You should set the `fix_mistral_regex=True` flag when loading this tokenizer to fix this issue.
chaiml-cai-synth-v4-cosine-v1-mkmlizer: quantized model in 44.298s
chaiml-cai-synth-v4-cosine-v1-mkmlizer: Processed model ChaiML/cai-synth-v4_cosine in 142.153s
chaiml-cai-synth-v4-cosine-v1-mkmlizer: creating bucket guanaco-mkml-models
chaiml-cai-synth-v4-cosine-v1-mkmlizer: Bucket 's3://guanaco-mkml-models/' created
chaiml-cai-synth-v4-cosine-v1-mkmlizer: uploading /dev/shm/model_cache to s3://guanaco-mkml-models/chaiml-cai-synth-v4-cosine-v1/nvidia
chaiml-cai-synth-v4-cosine-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-cai-synth-v4-cosine-v1/nvidia/config.json
chaiml-cai-synth-v4-cosine-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-cai-synth-v4-cosine-v1/nvidia/special_tokens_map.json
chaiml-cai-synth-v4-cosine-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-cai-synth-v4-cosine-v1/nvidia/tokenizer_config.json
chaiml-cai-synth-v4-cosine-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-cai-synth-v4-cosine-v1/nvidia/tokenizer.json
chaiml-cai-synth-v4-cosine-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.1.safetensors s3://guanaco-mkml-models/chaiml-cai-synth-v4-cosine-v1/nvidia/flywheel_model.1.safetensors
chaiml-cai-synth-v4-cosine-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-cai-synth-v4-cosine-v1/nvidia/flywheel_model.0.safetensors
Job chaiml-cai-synth-v4-cosine-v1-mkmlizer completed after 236.16s with status: succeeded
Stopping job with name chaiml-cai-synth-v4-cosine-v1-mkmlizer
Pipeline stage MKMLizer completed in 236.70s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.15s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-cai-synth-v4-cosine-v1
Waiting for inference service chaiml-cai-synth-v4-cosine-v1 to be ready
Unable to record family friendly update due to error: ('http://chaiml-nemo-guard-merged-v3-predictor.tenant-chaiml-guanaco.k2.chaiverse.com/v1/models/GPT-J-6B-lit-v2:predict', 'read tcp 127.0.0.1:58692->127.0.0.1:8080: read: connection reset by peer\n')
Inference service chaiml-cai-synth-v4-cosine-v1 ready after 170.8562092781067s
Pipeline stage MKMLDeployer completed in 171.47s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 4.3316309452056885s
Received healthy response to inference request in 4.344685077667236s
Received healthy response to inference request in 4.351128339767456s
Received healthy response to inference request in 4.2287445068359375s
Received healthy response to inference request in 4.523427963256836s
5 requests
0 failed requests
5th percentile: 4.249321794509887
10th percentile: 4.269899082183838
20th percentile: 4.311053657531739
30th percentile: 4.334241771697998
40th percentile: 4.339463424682617
50th percentile: 4.344685077667236
60th percentile: 4.347262382507324
70th percentile: 4.349839687347412
80th percentile: 4.385588264465332
90th percentile: 4.454508113861084
95th percentile: 4.48896803855896
99th percentile: 4.516535978317261
mean time: 4.355923366546631
%s, retrying in %s seconds...
Received healthy response to inference request in 4.053181886672974s
Received healthy response to inference request in 4.324198961257935s
Received healthy response to inference request in 4.108707427978516s
Received healthy response to inference request in 4.347848415374756s
Received healthy response to inference request in 4.035433292388916s
5 requests
0 failed requests
5th percentile: 4.038983011245728
10th percentile: 4.042532730102539
20th percentile: 4.049632167816162
30th percentile: 4.064286994934082
40th percentile: 4.086497211456299
50th percentile: 4.108707427978516
60th percentile: 4.194904041290283
70th percentile: 4.281100654602051
80th percentile: 4.328928852081299
90th percentile: 4.3383886337280275
95th percentile: 4.343118524551391
99th percentile: 4.346902437210083
mean time: 4.173873996734619
%s, retrying in %s seconds...
Received healthy response to inference request in 4.472404956817627s
Received healthy response to inference request in 4.760805130004883s
Received healthy response to inference request in 4.452908754348755s
Received healthy response to inference request in 4.407743692398071s
Received healthy response to inference request in 4.024809122085571s
5 requests
0 failed requests
5th percentile: 4.1013960361480715
10th percentile: 4.177982950210572
20th percentile: 4.331156778335571
30th percentile: 4.416776704788208
40th percentile: 4.434842729568482
50th percentile: 4.452908754348755
60th percentile: 4.4607072353363035
70th percentile: 4.468505716323852
80th percentile: 4.5300849914550785
90th percentile: 4.645445060729981
95th percentile: 4.703125095367431
99th percentile: 4.749269123077393
mean time: 4.423734331130982
clean up pipeline due to error=DeploymentChecksError('Unacceptable 70th percentile latency 4.468505716323852s')
Shutdown handler de-registered
chaiml-cai-synth-v4-cosine_v1 status is now failed due to DeploymentManager action