Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer
Waiting for job on chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer to finish
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLizer
Starting job with name arushimgupta-peft-save-1-v4-mkmlizer
Waiting for job on arushimgupta-peft-save-1-v4-mkmlizer to finish
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ _____ __ __ ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ /___/ ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ Version: 0.11.12 ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ https://mk1.ai ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ The license key for the current software has been verified as ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ belonging to: ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ Chai Research Corp. ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ║ ║
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
arushimgupta-peft-save-1-v4-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
arushimgupta-peft-save-1-v4-mkmlizer: ║ _____ __ __ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ /___/ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Version: 0.11.12 ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ https://mk1.ai ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ The license key for the current software has been verified as ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ belonging to: ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Chai Research Corp. ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
arushimgupta-peft-save-1-v4-mkmlizer: Downloaded to shared memory in 16.028s
arushimgupta-peft-save-1-v4-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpbziar55a, device:0
arushimgupta-peft-save-1-v4-mkmlizer: Saving flywheel model at /dev/shm/model_cache
arushimgupta-peft-save-1-v4-mkmlizer: Loading 0: 0%| | 0/1203 [00:00<?, ?it/s]
arushimgupta-peft-save-1-v4-mkmlizer: Traceback (most recent call last):
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 151, in <module>
arushimgupta-peft-save-1-v4-mkmlizer: cli()
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
arushimgupta-peft-save-1-v4-mkmlizer: return self.main(*args, **kwargs)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
arushimgupta-peft-save-1-v4-mkmlizer: rv = self.invoke(ctx)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return _process_result(sub_ctx.command.invoke(sub_ctx))
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return ctx.invoke(self.callback, **ctx.params)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return __callback(*args, **kwargs)
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 42, in quantize
arushimgupta-peft-save-1-v4-mkmlizer: quantize_model(temp_folder, output_path, profile, device)
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 135, in quantize_model
arushimgupta-peft-save-1-v4-mkmlizer: flywheel.instrument(
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/instrument.py", line 93, in instrument
arushimgupta-peft-save-1-v4-mkmlizer: compiler.save_pretrained(input_model_path, output_model_path, storage_format)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/compiler.py", line 23, in save_pretrained
arushimgupta-peft-save-1-v4-mkmlizer: self.save_st_pretrained(input_model_path, output_model_path)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/compiler.py", line 38, in save_st_pretrained
arushimgupta-peft-save-1-v4-mkmlizer: for name, tensor in model_iterator:
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/models/mistral.py", line 241, in tensor_merger
arushimgupta-peft-save-1-v4-mkmlizer: for name, tensor in tensor_iterator:
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/loader.py", line 217, in tensor_compiler
arushimgupta-peft-save-1-v4-mkmlizer: compiled_tensor = runtime.instrument(tensor, profile.value)
arushimgupta-peft-save-1-v4-mkmlizer: RuntimeError: CUDA error: invalid configuration argument
arushimgupta-peft-save-1-v4-mkmlizer: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
arushimgupta-peft-save-1-v4-mkmlizer: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
arushimgupta-peft-save-1-v4-mkmlizer: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
arushimgupta-peft-save-1-v4-mkmlizer: Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:43 (most recent call first):
arushimgupta-peft-save-1-v4-mkmlizer: frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7296984cbf86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x72969847ad10 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7296985a6f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #3: void at::native::gpu_kernel_impl<__nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> >(at::TensorIteratorBase&, __nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> const&) + 0x4de (0x72964af0aa2e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #4: void at::native::gpu_kernel<__nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> >(at::TensorIteratorBase&, __nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> const&) + 0x34b (0x72964af0b01b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #5: at::native::direct_copy_kernel_cuda(at::TensorIteratorBase&) + 0x38c (0x72964aec9a0c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #6: at::native::copy_device_to_device(at::TensorIterator&, bool, bool) + 0xb25 (0x72964aeca715 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #7: <unknown function> + 0x1910312 (0x72964aecc312 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #8: <unknown function> + 0x1cbebff (0x729680499bff in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #9: at::native::copy_(at::Tensor&, at::Tensor const&, bool) + 0x62 (0x72968049b5a2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #10: at::_ops::copy_::call(at::Tensor&, at::Tensor const&, bool) + 0x15c (0x72968125635c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #11: at::native::_to_copy(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0x1e01 (0x7296807b96b1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #12: <unknown function> + 0x2e19f8b (0x7296815f4f8b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0xf5 (0x729680cfdc25 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #14: <unknown function> + 0x2c58a33 (0x729681433a33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0xf5 (0x729680cfdc25 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #16: <unknown function> + 0x470df1f (0x729682ee8f1f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #17: <unknown function> + 0x470e35e (0x729682ee935e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #18: at::_ops::_to_copy::call(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0x1eb (0x729680d8d68b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #19: at::native::to(at::Tensor const&, c10::ScalarType, bool, bool, std::optional<c10::MemoryFormat>) + 0xa2 (0x7296807b6182 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #20: <unknown function> + 0x301e3b0 (0x7296817f93b0 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #21: at::_ops::to_dtype::call(at::Tensor const&, c10::ScalarType, bool, bool, std::optional<c10::MemoryFormat>) + 0x178 (0x729680f3d258 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #22: mkodec::instrument(at::Tensor, int) + 0x4b (0x7295bcdc81ab in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #23: <unknown function> + 0x981e2 (0x7295bcdb01e2 in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #24: <unknown function> + 0xa7b9b (0x7295bcdbfb9b in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #25: python3() [0x4fd907]
arushimgupta-peft-save-1-v4-mkmlizer: <omitting python frames>
arushimgupta-peft-save-1-v4-mkmlizer: frame #28: python3() [0x5112cf]
arushimgupta-peft-save-1-v4-mkmlizer: frame #30: python3() [0x5112cf]
arushimgupta-peft-save-1-v4-mkmlizer: frame #43: python3() [0x5095ce]
arushimgupta-peft-save-1-v4-mkmlizer: frame #50: python3() [0x509857]
arushimgupta-peft-save-1-v4-mkmlizer: frame #54: python3() [0x5cf913]
arushimgupta-peft-save-1-v4-mkmlizer: frame #57: python3() [0x5951c2]
arushimgupta-peft-save-1-v4-mkmlizer: frame #59: python3() [0x5c5ef7]
arushimgupta-peft-save-1-v4-mkmlizer: frame #60: python3() [0x5c1030]
arushimgupta-peft-save-1-v4-mkmlizer: frame #61: python3() [0x459781]
arushimgupta-peft-save-1-v4-mkmlizer:
arushimgupta-peft-save-1-v4-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
arushimgupta-peft-save-1-v4-mkmlizer: ║ _____ __ __ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ /___/ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Version: 0.11.12 ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ https://mk1.ai ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ The license key for the current software has been verified as ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ belonging to: ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Chai Research Corp. ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
Job arushimgupta-peft-save-1-v4-mkmlizer completed after 122.64s with status: failed
Stopping job with name arushimgupta-peft-save-1-v4-mkmlizer
%s, retrying in %s seconds...
Starting job with name arushimgupta-peft-save-1-v4-mkmlizer
Waiting for job on arushimgupta-peft-save-1-v4-mkmlizer to finish
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: Downloaded to shared memory in 109.492s
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpssrwu5mq, device:0
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: Saving flywheel model at /dev/shm/model_cache
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: quantized model in 43.236s
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: Processed model ChaiML/0926-nemo-virgo-top-safe-bot-1edit-long in 152.729s
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: creating bucket guanaco-mkml-models
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: cp /dev/shm/model_cache/special_tokens_map.json s3://guanaco-mkml-models/chaiml-0926-nemo-virgo-t-5421-v1/special_tokens_map.json
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: cp /dev/shm/model_cache/config.json s3://guanaco-mkml-models/chaiml-0926-nemo-virgo-t-5421-v1/config.json
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer_config.json s3://guanaco-mkml-models/chaiml-0926-nemo-virgo-t-5421-v1/tokenizer_config.json
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: cp /dev/shm/model_cache/tokenizer.json s3://guanaco-mkml-models/chaiml-0926-nemo-virgo-t-5421-v1/tokenizer.json
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer: cp /dev/shm/model_cache/flywheel_model.0.safetensors s3://guanaco-mkml-models/chaiml-0926-nemo-virgo-t-5421-v1/flywheel_model.0.safetensors
chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer:
Loading 0: 0%| | 0/363 [00:00<?, ?it/s]
Loading 0: 1%|▏ | 5/363 [00:00<00:15, 22.69it/s]
Loading 0: 3%|▎ | 10/363 [00:00<00:11, 29.43it/s]
Loading 0: 4%|▍ | 14/363 [00:00<00:13, 26.51it/s]
Loading 0: 6%|▌ | 21/363 [00:00<00:08, 38.34it/s]
Loading 0: 7%|▋ | 26/363 [00:01<00:16, 21.04it/s]
Loading 0: 9%|▊ | 31/363 [00:01<00:12, 25.74it/s]
Loading 0: 10%|▉ | 35/363 [00:01<00:12, 26.00it/s]
Loading 0: 11%|█ | 39/363 [00:01<00:12, 26.89it/s]
Loading 0: 12%|█▏ | 43/363 [00:01<00:12, 25.79it/s]
Loading 0: 13%|█▎ | 46/363 [00:01<00:11, 26.52it/s]
Loading 0: 13%|█▎ | 49/363 [00:01<00:11, 27.08it/s]
Loading 0: 14%|█▍ | 52/363 [00:01<00:12, 25.73it/s]
Loading 0: 15%|█▌ | 55/363 [00:02<00:11, 26.63it/s]
Loading 0: 17%|█▋ | 60/363 [00:02<00:10, 28.04it/s]
Loading 0: 17%|█▋ | 63/363 [00:02<00:14, 20.07it/s]
Loading 0: 18%|█▊ | 66/363 [00:02<00:14, 20.60it/s]
Loading 0: 19%|█▉ | 69/363 [00:02<00:13, 22.12it/s]
Loading 0: 20%|█▉ | 72/363 [00:02<00:14, 20.72it/s]
Loading 0: 21%|██ | 77/363 [00:03<00:13, 21.63it/s]
Loading 0: 23%|██▎ | 82/363 [00:03<00:10, 26.16it/s]
Loading 0: 24%|██▎ | 86/363 [00:03<00:11, 23.16it/s]
Loading 0: 25%|██▌ | 91/363 [00:03<00:09, 27.39it/s]
Loading 0: 26%|██▌ | 95/363 [00:03<00:11, 24.17it/s]
Loading 0: 28%|██▊ | 100/363 [00:03<00:09, 28.45it/s]
Loading 0: 29%|██▊ | 104/363 [00:04<00:13, 19.12it/s]
Loading 0: 31%|███ | 111/363 [00:04<00:09, 25.74it/s]
Loading 0: 32%|███▏ | 115/363 [00:04<00:09, 26.08it/s]
Loading 0: 33%|███▎ | 120/363 [00:04<00:08, 29.01it/s]
Loading 0: 34%|███▍ | 124/363 [00:04<00:08, 28.49it/s]
Loading 0: 36%|███▌ | 129/363 [00:05<00:07, 31.05it/s]
Loading 0: 37%|███▋ | 133/363 [00:05<00:07, 29.99it/s]
Loading 0: 38%|███▊ | 137/363 [00:05<00:07, 30.68it/s]
Loading 0: 39%|███▉ | 142/363 [00:05<00:08, 26.65it/s]
Loading 0: 40%|███▉ | 145/363 [00:05<00:08, 24.66it/s]
Loading 0: 41%|████ | 149/363 [00:05<00:09, 22.55it/s]
Loading 0: 42%|████▏ | 154/363 [00:06<00:07, 26.53it/s]
Loading 0: 44%|████▎ | 158/363 [00:06<00:08, 23.96it/s]
Loading 0: 45%|████▍ | 163/363 [00:06<00:06, 28.70it/s]
Loading 0: 46%|████▌ | 167/363 [00:06<00:07, 25.24it/s]
Loading 0: 47%|████▋ | 172/363 [00:06<00:06, 28.68it/s]
Loading 0: 48%|████▊ | 176/363 [00:06<00:07, 24.98it/s]
Loading 0: 50%|████▉ | 181/363 [00:06<00:06, 29.17it/s]
Loading 0: 51%|█████ | 185/363 [00:07<00:09, 19.63it/s]
Loading 0: 52%|█████▏ | 190/363 [00:07<00:07, 23.80it/s]
Loading 0: 53%|█████▎ | 194/363 [00:07<00:07, 21.80it/s]
Loading 0: 55%|█████▍ | 199/363 [00:07<00:06, 26.14it/s]
Loading 0: 56%|█████▌ | 203/363 [00:08<00:06, 23.24it/s]
Loading 0: 57%|█████▋ | 208/363 [00:08<00:05, 27.32it/s]
Loading 0: 58%|█████▊ | 212/363 [00:08<00:06, 24.18it/s]
Loading 0: 60%|█████▉ | 217/363 [00:08<00:05, 28.41it/s]
Loading 0: 61%|██████ | 222/363 [00:08<00:04, 29.46it/s]
Loading 0: 62%|██████▏ | 226/363 [00:09<00:06, 20.76it/s]
Loading 0: 63%|██████▎ | 230/363 [00:09<00:06, 20.53it/s]
Loading 0: 65%|██████▌ | 237/363 [00:09<00:04, 26.61it/s]
Loading 0: 66%|██████▋ | 241/363 [00:09<00:04, 26.24it/s]
Loading 0: 68%|██████▊ | 246/363 [00:09<00:04, 28.80it/s]
Loading 0: 69%|██████▉ | 250/363 [00:09<00:04, 28.06it/s]
Loading 0: 70%|███████ | 255/363 [00:09<00:03, 30.86it/s]
Loading 0: 71%|███████▏ | 259/363 [00:10<00:03, 29.42it/s]
Loading 0: 72%|███████▏ | 263/363 [00:10<00:04, 23.92it/s]
Loading 0: 73%|███████▎ | 266/363 [00:10<00:04, 21.17it/s]
Loading 0: 75%|███████▍ | 271/363 [00:10<00:03, 26.18it/s]
Loading 0: 76%|███████▌ | 275/363 [00:10<00:03, 22.97it/s]
Loading 0: 77%|███████▋ | 280/363 [00:10<00:02, 27.86it/s]
Loading 0: 78%|███████▊ | 284/363 [00:11<00:03, 24.76it/s]
Loading 0: 80%|███████▉ | 289/363 [00:11<00:02, 29.05it/s]
Loading 0: 81%|████████ | 293/363 [00:11<00:02, 25.39it/s]
Loading 0: 82%|████████▏ | 298/363 [00:11<00:02, 29.00it/s]
Loading 0: 83%|████████▎ | 303/363 [00:11<00:02, 29.36it/s]
Loading 0: 85%|████████▍ | 307/363 [00:12<00:02, 20.74it/s]
Loading 0: 85%|████████▌ | 310/363 [00:12<00:02, 22.21it/s]
Loading 0: 86%|████████▌ | 313/363 [00:12<00:02, 22.15it/s]
Loading 0: 87%|████████▋ | 316/363 [00:12<00:02, 23.22it/s]
Loading 0: 88%|████████▊ | 320/363 [00:12<00:02, 21.46it/s]
Loading 0: 90%|████████▉ | 325/363 [00:12<00:01, 26.42it/s]
Loading 0: 91%|█████████ | 329/363 [00:13<00:01, 23.58it/s]
Loading 0: 92%|█████████▏| 334/363 [00:13<00:01, 27.81it/s]
Loading 0: 93%|█████████▎| 338/363 [00:13<00:01, 24.22it/s]
Loading 0: 94%|█████████▍| 343/363 [00:13<00:00, 28.47it/s]
Loading 0: 96%|█████████▌| 347/363 [00:20<00:08, 1.97it/s]
Loading 0: 96%|█████████▋| 350/363 [00:20<00:05, 2.49it/s]
Loading 0: 97%|█████████▋| 353/363 [00:20<00:03, 3.18it/s]
Loading 0: 98%|█████████▊| 357/363 [00:20<00:01, 4.35it/s]
Loading 0: 100%|█████████▉| 362/363 [00:21<00:00, 6.49it/s]
Job chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer completed after 205.21s with status: succeeded
Stopping job with name chaiml-0926-nemo-virgo-t-5421-v1-mkmlizer
Pipeline stage MKMLizer completed in 206.03s
run pipeline stage %s
Running pipeline stage MKMLTemplater
Pipeline stage MKMLTemplater completed in 0.13s
run pipeline stage %s
Running pipeline stage MKMLDeployer
Creating inference service chaiml-0926-nemo-virgo-t-5421-v1
Waiting for inference service chaiml-0926-nemo-virgo-t-5421-v1 to be ready
arushimgupta-peft-save-1-v4-mkmlizer: ╔═════════════════════════════════════════════════════════════════════╗
arushimgupta-peft-save-1-v4-mkmlizer: ║ _____ __ __ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ / _/ /_ ___ __/ / ___ ___ / / ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ / _/ / // / |/|/ / _ \/ -_) -_) / ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ /_//_/\_, /|__,__/_//_/\__/\__/_/ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ /___/ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Version: 0.11.12 ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Copyright 2023 MK ONE TECHNOLOGIES Inc. ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ https://mk1.ai ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ The license key for the current software has been verified as ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ belonging to: ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Chai Research Corp. ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Account ID: 7997a29f-0ceb-4cc7-9adf-840c57b4ae6f ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ Expiration: 2024-10-15 23:59:59 ║
arushimgupta-peft-save-1-v4-mkmlizer: ║ ║
arushimgupta-peft-save-1-v4-mkmlizer: ╚═════════════════════════════════════════════════════════════════════╝
arushimgupta-peft-save-1-v4-mkmlizer: Downloaded to shared memory in 16.076s
arushimgupta-peft-save-1-v4-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmplc93rojo, device:0
arushimgupta-peft-save-1-v4-mkmlizer: Saving flywheel model at /dev/shm/model_cache
arushimgupta-peft-save-1-v4-mkmlizer: Loading 0: 0%| | 0/1203 [00:00<?, ?it/s]
arushimgupta-peft-save-1-v4-mkmlizer: Traceback (most recent call last):
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 151, in <module>
arushimgupta-peft-save-1-v4-mkmlizer: cli()
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
arushimgupta-peft-save-1-v4-mkmlizer: return self.main(*args, **kwargs)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
arushimgupta-peft-save-1-v4-mkmlizer: rv = self.invoke(ctx)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return _process_result(sub_ctx.command.invoke(sub_ctx))
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return ctx.invoke(self.callback, **ctx.params)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return __callback(*args, **kwargs)
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 42, in quantize
arushimgupta-peft-save-1-v4-mkmlizer: quantize_model(temp_folder, output_path, profile, device)
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 135, in quantize_model
arushimgupta-peft-save-1-v4-mkmlizer: flywheel.instrument(
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/instrument.py", line 93, in instrument
arushimgupta-peft-save-1-v4-mkmlizer: compiler.save_pretrained(input_model_path, output_model_path, storage_format)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/compiler.py", line 23, in save_pretrained
arushimgupta-peft-save-1-v4-mkmlizer: self.save_st_pretrained(input_model_path, output_model_path)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/compiler.py", line 38, in save_st_pretrained
arushimgupta-peft-save-1-v4-mkmlizer: for name, tensor in model_iterator:
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/models/mistral.py", line 241, in tensor_merger
arushimgupta-peft-save-1-v4-mkmlizer: for name, tensor in tensor_iterator:
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/loader.py", line 217, in tensor_compiler
arushimgupta-peft-save-1-v4-mkmlizer: compiled_tensor = runtime.instrument(tensor, profile.value)
arushimgupta-peft-save-1-v4-mkmlizer: RuntimeError: CUDA error: invalid configuration argument
arushimgupta-peft-save-1-v4-mkmlizer: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
arushimgupta-peft-save-1-v4-mkmlizer: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
arushimgupta-peft-save-1-v4-mkmlizer: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
arushimgupta-peft-save-1-v4-mkmlizer: Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:43 (most recent call first):
arushimgupta-peft-save-1-v4-mkmlizer: frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x755352f77f86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x755352f26d10 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7553533d4f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #3: void at::native::gpu_kernel_impl<__nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> >(at::TensorIteratorBase&, __nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> const&) + 0x4de (0x75530590aa2e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #4: void at::native::gpu_kernel<__nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> >(at::TensorIteratorBase&, __nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> const&) + 0x34b (0x75530590b01b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #5: at::native::direct_copy_kernel_cuda(at::TensorIteratorBase&) + 0x38c (0x7553058c9a0c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #6: at::native::copy_device_to_device(at::TensorIterator&, bool, bool) + 0xb25 (0x7553058ca715 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #7: <unknown function> + 0x1910312 (0x7553058cc312 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #8: <unknown function> + 0x1cbebff (0x75533ae99bff in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #9: at::native::copy_(at::Tensor&, at::Tensor const&, bool) + 0x62 (0x75533ae9b5a2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #10: at::_ops::copy_::call(at::Tensor&, at::Tensor const&, bool) + 0x15c (0x75533bc5635c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #11: at::native::_to_copy(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0x1e01 (0x75533b1b96b1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #12: <unknown function> + 0x2e19f8b (0x75533bff4f8b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0xf5 (0x75533b6fdc25 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #14: <unknown function> + 0x2c58a33 (0x75533be33a33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0xf5 (0x75533b6fdc25 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #16: <unknown function> + 0x470df1f (0x75533d8e8f1f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #17: <unknown function> + 0x470e35e (0x75533d8e935e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #18: at::_ops::_to_copy::call(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0x1eb (0x75533b78d68b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #19: at::native::to(at::Tensor const&, c10::ScalarType, bool, bool, std::optional<c10::MemoryFormat>) + 0xa2 (0x75533b1b6182 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #20: <unknown function> + 0x301e3b0 (0x75533c1f93b0 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #21: at::_ops::to_dtype::call(at::Tensor const&, c10::ScalarType, bool, bool, std::optional<c10::MemoryFormat>) + 0x178 (0x75533b93d258 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #22: mkodec::instrument(at::Tensor, int) + 0x4b (0x7552777c81ab in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #23: <unknown function> + 0x981e2 (0x7552777b01e2 in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #24: <unknown function> + 0xa7b9b (0x7552777bfb9b in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #25: python3() [0x4fd907]
arushimgupta-peft-save-1-v4-mkmlizer: <omitting python frames>
arushimgupta-peft-save-1-v4-mkmlizer: frame #28: python3() [0x5112cf]
arushimgupta-peft-save-1-v4-mkmlizer: frame #30: python3() [0x5112cf]
arushimgupta-peft-save-1-v4-mkmlizer: frame #43: python3() [0x5095ce]
arushimgupta-peft-save-1-v4-mkmlizer: frame #50: python3() [0x509857]
arushimgupta-peft-save-1-v4-mkmlizer: frame #54: python3() [0x5cf913]
arushimgupta-peft-save-1-v4-mkmlizer: frame #57: python3() [0x5951c2]
arushimgupta-peft-save-1-v4-mkmlizer: frame #59: python3() [0x5c5ef7]
arushimgupta-peft-save-1-v4-mkmlizer: frame #60: python3() [0x5c1030]
arushimgupta-peft-save-1-v4-mkmlizer: frame #61: python3() [0x459781]
arushimgupta-peft-save-1-v4-mkmlizer:
Job arushimgupta-peft-save-1-v4-mkmlizer completed after 109.63s with status: failed
Stopping job with name arushimgupta-peft-save-1-v4-mkmlizer
%s, retrying in %s seconds...
Starting job with name arushimgupta-peft-save-1-v4-mkmlizer
Waiting for job on arushimgupta-peft-save-1-v4-mkmlizer to finish
arushimgupta-peft-save-1-v4-mkmlizer: Downloaded to shared memory in 16.150s
arushimgupta-peft-save-1-v4-mkmlizer: quantizing model to /dev/shm/model_cache, profile:s0, folder:/tmp/tmpwwu1nxsj, device:0
arushimgupta-peft-save-1-v4-mkmlizer: Saving flywheel model at /dev/shm/model_cache
arushimgupta-peft-save-1-v4-mkmlizer: Loading 0: 0%| | 0/1203 [00:00<?, ?it/s]
arushimgupta-peft-save-1-v4-mkmlizer: Traceback (most recent call last):
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 151, in <module>
arushimgupta-peft-save-1-v4-mkmlizer: cli()
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
arushimgupta-peft-save-1-v4-mkmlizer: return self.main(*args, **kwargs)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1078, in main
arushimgupta-peft-save-1-v4-mkmlizer: rv = self.invoke(ctx)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return _process_result(sub_ctx.command.invoke(sub_ctx))
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return ctx.invoke(self.callback, **ctx.params)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/click/core.py", line 783, in invoke
arushimgupta-peft-save-1-v4-mkmlizer: return __callback(*args, **kwargs)
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 42, in quantize
arushimgupta-peft-save-1-v4-mkmlizer: quantize_model(temp_folder, output_path, profile, device)
arushimgupta-peft-save-1-v4-mkmlizer: File "/code/uploading/mkmlize.py", line 135, in quantize_model
arushimgupta-peft-save-1-v4-mkmlizer: flywheel.instrument(
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/instrument.py", line 93, in instrument
arushimgupta-peft-save-1-v4-mkmlizer: compiler.save_pretrained(input_model_path, output_model_path, storage_format)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/compiler.py", line 23, in save_pretrained
arushimgupta-peft-save-1-v4-mkmlizer: self.save_st_pretrained(input_model_path, output_model_path)
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/compiler.py", line 38, in save_st_pretrained
arushimgupta-peft-save-1-v4-mkmlizer: for name, tensor in model_iterator:
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/models/mistral.py", line 241, in tensor_merger
arushimgupta-peft-save-1-v4-mkmlizer: for name, tensor in tensor_iterator:
arushimgupta-peft-save-1-v4-mkmlizer: File "/opt/conda/lib/python3.10/site-packages/mk1/flywheel/functional/loader.py", line 217, in tensor_compiler
arushimgupta-peft-save-1-v4-mkmlizer: compiled_tensor = runtime.instrument(tensor, profile.value)
arushimgupta-peft-save-1-v4-mkmlizer: RuntimeError: CUDA error: invalid configuration argument
arushimgupta-peft-save-1-v4-mkmlizer: CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
arushimgupta-peft-save-1-v4-mkmlizer: For debugging consider passing CUDA_LAUNCH_BLOCKING=1
arushimgupta-peft-save-1-v4-mkmlizer: Compile with `TORCH_USE_CUDA_DSA` to enable device-side assertions.
arushimgupta-peft-save-1-v4-mkmlizer: Exception raised from c10_cuda_check_implementation at ../c10/cuda/CUDAException.cpp:43 (most recent call first):
arushimgupta-peft-save-1-v4-mkmlizer: frame #0: c10::Error::Error(c10::SourceLocation, std::string) + 0x96 (0x7c8b0e977f86 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #1: c10::detail::torchCheckFail(char const*, char const*, unsigned int, std::string const&) + 0x64 (0x7c8b0e926d10 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #2: c10::cuda::c10_cuda_check_implementation(int, char const*, char const*, int, bool) + 0x118 (0x7c8b0ed19f08 in /opt/conda/lib/python3.10/site-packages/torch/lib/libc10_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #3: void at::native::gpu_kernel_impl<__nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> >(at::TensorIteratorBase&, __nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> const&) + 0x4de (0x7c8ac130aa2e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #4: void at::native::gpu_kernel<__nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> >(at::TensorIteratorBase&, __nv_hdl_wrapper_t<false, true, false, __nv_dl_tag<void (*)(at::TensorIteratorBase&), &at::native::direct_copy_kernel_cuda, 18u>, c10::Half (c10::Half)> const&) + 0x34b (0x7c8ac130b01b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #5: at::native::direct_copy_kernel_cuda(at::TensorIteratorBase&) + 0x38c (0x7c8ac12c9a0c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #6: at::native::copy_device_to_device(at::TensorIterator&, bool, bool) + 0xb25 (0x7c8ac12ca715 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #7: <unknown function> + 0x1910312 (0x7c8ac12cc312 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cuda.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #8: <unknown function> + 0x1cbebff (0x7c8af6899bff in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #9: at::native::copy_(at::Tensor&, at::Tensor const&, bool) + 0x62 (0x7c8af689b5a2 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #10: at::_ops::copy_::call(at::Tensor&, at::Tensor const&, bool) + 0x15c (0x7c8af765635c in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #11: at::native::_to_copy(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0x1e01 (0x7c8af6bb96b1 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #12: <unknown function> + 0x2e19f8b (0x7c8af79f4f8b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #13: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0xf5 (0x7c8af70fdc25 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #14: <unknown function> + 0x2c58a33 (0x7c8af7833a33 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #15: at::_ops::_to_copy::redispatch(c10::DispatchKeySet, at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0xf5 (0x7c8af70fdc25 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #16: <unknown function> + 0x470df1f (0x7c8af92e8f1f in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #17: <unknown function> + 0x470e35e (0x7c8af92e935e in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #18: at::_ops::_to_copy::call(at::Tensor const&, std::optional<c10::ScalarType>, std::optional<c10::Layout>, std::optional<c10::Device>, std::optional<bool>, bool, std::optional<c10::MemoryFormat>) + 0x1eb (0x7c8af718d68b in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #19: at::native::to(at::Tensor const&, c10::ScalarType, bool, bool, std::optional<c10::MemoryFormat>) + 0xa2 (0x7c8af6bb6182 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #20: <unknown function> + 0x301e3b0 (0x7c8af7bf93b0 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #21: at::_ops::to_dtype::call(at::Tensor const&, c10::ScalarType, bool, bool, std::optional<c10::MemoryFormat>) + 0x178 (0x7c8af733d258 in /opt/conda/lib/python3.10/site-packages/torch/lib/libtorch_cpu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #22: mkodec::instrument(at::Tensor, int) + 0x4b (0x7c8a331c81ab in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #23: <unknown function> + 0x981e2 (0x7c8a331b01e2 in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #24: <unknown function> + 0xa7b9b (0x7c8a331bfb9b in /opt/conda/lib/python3.10/site-packages/mk1/flywheel/runtime.cpython-310-x86_64-linux-gnu.so)
arushimgupta-peft-save-1-v4-mkmlizer: frame #25: python3() [0x4fd907]
arushimgupta-peft-save-1-v4-mkmlizer: <omitting python frames>
arushimgupta-peft-save-1-v4-mkmlizer: frame #28: python3() [0x5112cf]
arushimgupta-peft-save-1-v4-mkmlizer: frame #30: python3() [0x5112cf]
arushimgupta-peft-save-1-v4-mkmlizer: frame #43: python3() [0x5095ce]
arushimgupta-peft-save-1-v4-mkmlizer: frame #50: python3() [0x509857]
arushimgupta-peft-save-1-v4-mkmlizer: frame #54: python3() [0x5cf913]
arushimgupta-peft-save-1-v4-mkmlizer: frame #57: python3() [0x5951c2]
arushimgupta-peft-save-1-v4-mkmlizer: frame #59: python3() [0x5c5ef7]
arushimgupta-peft-save-1-v4-mkmlizer: frame #60: python3() [0x5c1030]
arushimgupta-peft-save-1-v4-mkmlizer: frame #61: python3() [0x459781]
arushimgupta-peft-save-1-v4-mkmlizer:
Job arushimgupta-peft-save-1-v4-mkmlizer completed after 50.36s with status: failed
Stopping job with name arushimgupta-peft-save-1-v4-mkmlizer
clean up pipeline due to error=MKMLizerError('')
Shutdown handler de-registered
MKMLizerError('')
arushimgupta-peft-save-1_v4 status is now failed due to DeploymentManager action
admin requested tearing down of arushimgupta-peft-save-1_v4
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLDeleter completed in 0.34s
run pipeline stage %s
Running pipeline stage MKMLModelDeleter
Skipping deletion as no model was successfully uploaded
Pipeline stage MKMLModelDeleter completed in 0.34s
Shutdown handler de-registered
arushimgupta-peft-save-1_v4 status is now torndown due to DeploymentManager action
Inference service chaiml-0926-nemo-virgo-t-5421-v1 ready after 220.48139023780823s
Pipeline stage MKMLDeployer completed in 220.96s
run pipeline stage %s
Running pipeline stage StressChecker
Received healthy response to inference request in 1.8620631694793701s
Received healthy response to inference request in 1.1796214580535889s
Received healthy response to inference request in 1.682734489440918s
Received healthy response to inference request in 1.5158770084381104s
Received healthy response to inference request in 1.1990594863891602s
5 requests
0 failed requests
5th percentile: 1.183509063720703
10th percentile: 1.1873966693878173
20th percentile: 1.195171880722046
30th percentile: 1.2624229907989502
40th percentile: 1.3891499996185304
50th percentile: 1.5158770084381104
60th percentile: 1.5826200008392335
70th percentile: 1.6493629932403564
80th percentile: 1.7186002254486084
90th percentile: 1.7903316974639893
95th percentile: 1.8261974334716797
99th percentile: 1.854890022277832
mean time: 1.4878711223602294
Pipeline stage StressChecker completed in 8.71s
run pipeline stage %s
Running pipeline stage TriggerMKMLProfilingPipeline
run_pipeline:run_in_cloud %s
starting trigger_guanaco_pipeline args=%s
Pipeline stage TriggerMKMLProfilingPipeline completed in 4.61s
Shutdown handler de-registered
chaiml-0926-nemo-virgo-t_5421_v1 status is now deployed due to DeploymentManager action
Shutdown handler registered
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.26s
run pipeline stage %s
Running pipeline stage MKMLProfilerTemplater
Pipeline stage MKMLProfilerTemplater completed in 0.19s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeployer
Creating inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler
Waiting for inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler to be ready
Tearing down inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler
Waiting for inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler to be ready
Tearing down inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler
%s, retrying in %s seconds...
Creating inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler
Waiting for inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler to be ready
Tearing down inference service chaiml-0926-nemo-virgo-t-5421-v1-profiler
clean up pipeline due to error=%s
run pipeline stage %s
Running pipeline stage MKMLProfilerDeleter
Skipping teardown as no inference service was successfully deployed
Pipeline stage MKMLProfilerDeleter completed in 0.26s
Shutdown handler de-registered
chaiml-0926-nemo-virgo-t_5421_v1 status is now inactive due to auto deactivation removed underperforming models
Running pipeline stage MKMLDeleter
run pipeline stage %s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of chaiml-0926-nemo-virgo-t_5421_v1
Checking if service chaiml-0916-intent-suppo-6584-v5 is running
Running pipeline stage MKMLDeleter
run pipeline stage %s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_degak_2024-09-26
Tearing down inference service chaiml-0916-intent-suppo-6584-v5
Checking if service chaiml-0926-nemo-virgo-t-1582-v1 is running
Running pipeline stage MKMLDeleter
run pipeline stage %s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_fafot_2024-09-26
Service chaiml-0916-intent-suppo-6584-v5 has been torndown
Checking if service chaiml-0926-nemo-virgo-t-3956-v6 is running
Tearing down inference service chaiml-0926-nemo-virgo-t-1582-v1
Running pipeline stage MKMLDeleter
run pipeline stage %s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_femik_2024-09-26
Pipeline stage MKMLDeleter completed in 36.42s
Service chaiml-0926-nemo-virgo-t-1582-v1 has been torndown
Tearing down inference service chaiml-0926-nemo-virgo-t-3956-v6
Checking if service chaiml-0926-nemo-virgo-t-3956-v7 is running
Running pipeline stage MKMLDeleter
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_jisob_2024-09-26
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 43.68s
Service chaiml-0926-nemo-virgo-t-3956-v6 has been torndown
Tearing down inference service chaiml-0926-nemo-virgo-t-3956-v7
Checking if service chaiml-0926-nemo-virgo-t-5421-v1 is running
function_degak_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_keneb_2024-09-26
Running pipeline stage MKMLModelDeleter
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 50.31s
Service chaiml-0926-nemo-virgo-t-3956-v7 has been torndown
Tearing down inference service chaiml-0926-nemo-virgo-t-5421-v1
function_fafot_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_sulib_2024-09-26
Cleaning model data from S3
Running pipeline stage MKMLModelDeleter
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 58.94s
Service chaiml-0926-nemo-virgo-t-5421-v1 has been torndown
function_femik_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_tudub_2024-09-26
Cleaning model data from model cache
Cleaning model data from S3
Running pipeline stage MKMLModelDeleter
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 63.34s
function_jisob_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of meta-llama-llama-3-1-8b-_7331_v1
Cleaning model data from model cache
Deleting key chaiml-0916-intent-suppo-6584-v5/config.json from bucket guanaco-mkml-models
Cleaning model data from S3
Running pipeline stage MKMLModelDeleter
run pipeline stage %s
function_keneb_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v110
Deleting key chaiml-0926-nemo-virgo-t-1582-v1/config.json from bucket guanaco-mkml-models
Deleting key chaiml-0916-intent-suppo-6584-v5/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Cleaning model data from model cache
Cleaning model data from S3
Running pipeline stage MKMLModelDeleter
function_sulib_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v111
Deleting key chaiml-0926-nemo-virgo-t-1582-v1/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key chaiml-0916-intent-suppo-6584-v5/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key chaiml-0926-nemo-virgo-t-3956-v6/config.json from bucket guanaco-mkml-models
Cleaning model data from model cache
Cleaning model data from S3
function_tudub_2024-09-26 status is now torndown due to DeploymentManager action
run pipeline stage %s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v112
Deleting key chaiml-0926-nemo-virgo-t-1582-v1/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key chaiml-0916-intent-suppo-6584-v5/tokenizer.json from bucket guanaco-mkml-models
Deleting key chaiml-0926-nemo-virgo-t-3956-v6/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Deleting key chaiml-0926-nemo-virgo-t-3956-v7/config.json from bucket guanaco-mkml-models
admin requested tearing down of blend_dones_2024-09-27
Cleaning model data from model cache
admin requested tearing down of blend_rofur_2024-10-03
Running pipeline stage MKMLDeleter
run pipeline stage %s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v114
Deleting key chaiml-0926-nemo-virgo-t-1582-v1/tokenizer.json from bucket guanaco-mkml-models
Deleting key chaiml-0916-intent-suppo-6584-v5/tokenizer_config.json from bucket guanaco-mkml-models
Deleting key chaiml-0926-nemo-virgo-t-3956-v6/special_tokens_map.json from bucket guanaco-mkml-models
Deleting key chaiml-0926-nemo-virgo-t-3956-v7/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of blend_fenik_2024-09-27
Deleting key chaiml-0926-nemo-virgo-t-5421-v1/config.json from bucket guanaco-mkml-models
Shutdown handler not registered because Python interpreter is not running in the main thread
Checking if service meta-llama-llama-3-1-8b-7331-v1 is running
Running pipeline stage MKMLDeleter
run pipeline %s
run pipeline stage %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v115
Deleting key chaiml-0926-nemo-virgo-t-1582-v1/tokenizer_config.json from bucket guanaco-mkml-models
Deleting key chaiml-0926-nemo-virgo-t-3956-v6/tokenizer.json from bucket guanaco-mkml-models
Pipeline stage MKMLModelDeleter completed in 179.55s
Deleting key chaiml-0926-nemo-virgo-t-3956-v7/special_tokens_map.json from bucket guanaco-mkml-models
run pipeline %s
Deleting key chaiml-0926-nemo-virgo-t-5421-v1/flywheel_model.0.safetensors from bucket guanaco-mkml-models
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of blend_fuhof_2024-09-27
run pipeline %s
Checking if service mistralai-mistral-nemo-9330-v110 is running
Connection pool is full, discarding connection: %s. Connection pool size: %s
admin requested tearing down of blend_gelom_2024-09-27
Shutdown handler not registered because Python interpreter is not running in the main thread
Running pipeline stage MKMLDeleter
Running pipeline stage ProductionBlendMKMLTemplater
Checking if service mistralai-mistral-nemo-9330-v111 is running
run pipeline stage %s
Tearing down inference service mistralai-mistral-nemo-9330-v110
Service meta-llama-llama-3-1-8b-7331-v1 has been torndown
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v117
blend_fenik_2024-09-27 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
chaiml-0926-nemo-virgo-t_5421_v1 status is now torndown due to DeploymentManager action
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Running pipeline stage MKMLDeployer
function_tudub_2024-09-26 status is now torndown due to DeploymentManager action
blend_gelom_2024-09-27 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler de-registered
run pipeline stage %s
admin requested tearing down of blend_dones_2024-09-27
run pipeline %s
Running pipeline stage MKMLDeleter
run pipeline stage %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of chaiml-0926-nemo-virgo-t_5421_v1
run pipeline stage %s
Pipeline stage %s skipped, reason=%s
Running pipeline stage MKMLDeleter
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_degak_2024-09-26
Running pipeline stage MKMLDeleter
Pipeline stage MKMLDeleter completed in 30.27s
Pipeline stage %s skipped, reason=%s
run pipeline stage %s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_fafot_2024-09-26
admin requested tearing down of blend_dones_2024-09-27
admin requested tearing down of blend_rofur_2024-10-03
Pipeline stage %s skipped, reason=%s
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 42.17s
Running pipeline stage MKMLDeleter
run pipeline stage %s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_femik_2024-09-26
Shutdown handler not registered because Python interpreter is not running in the main thread
Tearing down inference service blend-rofur-2024-10-03
admin requested tearing down of blend_fenik_2024-09-27
Shutdown handler not registered because Python interpreter is not running in the main thread
Pipeline stage MKMLDeleter completed in 66.76s
run pipeline stage %s
Running pipeline stage MKMLModelDeleter
Pipeline stage %s skipped, reason=%s
Running pipeline stage MKMLDeleter
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_jisob_2024-09-26
run pipeline %s
%s, retrying in %s seconds...
Tearing down inference service blend-rofur-2024-10-03
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of blend_fuhof_2024-09-27
run pipeline %s
run pipeline stage %s
Running pipeline stage MKMLModelDeleter
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLDeleter completed in 94.59s
Pipeline stage %s skipped, reason=%s
function_degak_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
Shutdown handler de-registered
admin requested tearing down of function_keneb_2024-09-26
Creating inference service blend-rofur-2024-10-03
%s, retrying in %s seconds...
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of blend_fulat_2024-09-27
run pipeline stage %s
Running pipeline stage MKMLModelDeleter
Pipeline stage %s skipped, reason=%s
Pipeline stage MKMLModelDeleter completed in 104.24s
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 98.75s
Shutdown handler de-registered
function_fafot_2024-09-26 status is now torndown due to DeploymentManager action
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
blend_dones_2024-09-27 status is now torndown due to DeploymentManager action
admin requested tearing down of function_sulib_2024-09-26
Pipeline stage %s skipped, reason=%s
Waiting for inference service blend-rofur-2024-10-03 to be ready
Shutdown handler de-registered
Creating inference service blend-rofur-2024-10-03
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of blend_gelom_2024-09-27
Running pipeline stage ProductionBlendMKMLTemplater
Pipeline stage MKMLModelDeleter completed in 111.38s
Shutdown handler de-registered
Running pipeline stage MKMLModelDeleter
run pipeline stage %s
function_femik_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_tudub_2024-09-26
Pipeline stage MKMLModelDeleter completed in 96.03s
blend_fenik_2024-09-27 status is now torndown due to DeploymentManager action
Ignoring service blend-rofur-2024-10-03 already deployed
Shutdown handler de-registered
run pipeline %s
admin requested tearing down of blend_rofur_2024-10-03
Shutdown handler not registered because Python interpreter is not running in the main thread
Pipeline stage %s skipped, reason=%s
Shutdown handler de-registered
admin requested tearing down of blend_dones_2024-09-27
chaiml-0916-intent-suppo_6584_v5 status is now torndown due to DeploymentManager action
Pipeline stage %s skipped, reason=%s
admin requested tearing down of blend_rofur_2024-10-03
admin requested tearing down of blend_rofur_2024-10-03
Running pipeline stage MKMLModelDeleter
run pipeline %s
function_jisob_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
Shutdown handler not registered because Python interpreter is not running in the main thread
Shutdown handler de-registered
admin requested tearing down of blend_rofur_2024-10-03
Waiting for inference service blend-rofur-2024-10-03 to be ready
blend_fuhof_2024-09-27 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Pipeline stage ProductionBlendMKMLTemplater completed in 226.35s
chaiml-0926-nemo-virgo-t_1582_v1 status is now torndown due to DeploymentManager action
Shutdown handler not registered because Python interpreter is not running in the main thread
Pipeline stage MKMLModelDeleter completed in 237.49s
Shutdown handler not registered because Python interpreter is not running in the main thread
Shutdown handler not registered because Python interpreter is not running in the main thread
Pipeline stage %s skipped, reason=%s
Shutdown handler de-registered
function_keneb_2024-09-26 status is now torndown due to DeploymentManager action
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
chaiml-0926-nemo-virgo-t_3956_v6 status is now torndown due to DeploymentManager action
run pipeline %s
Shutdown handler de-registered
blend_fulat_2024-09-27 status is now torndown due to DeploymentManager action
run pipeline stage %s
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline stage %s
admin requested tearing down of blend_fenik_2024-09-27
run pipeline %s
admin requested tearing down of mistralai-mistral-nemo_9330_v112
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
chaiml-0926-nemo-virgo-t_5421_v1 status is now torndown due to DeploymentManager action
run pipeline %s
Running pipeline stage MKMLDeleter
Shutdown handler not registered because Python interpreter is not running in the main thread
run pipeline %s
Pipeline stage %s skipped, reason=%s
Shutdown handler not registered because Python interpreter is not running in the main thread
Running pipeline stage MKMLDeleter
admin requested tearing down of chaiml-0926-nemo-virgo-t_5421_v1
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 2.97s
run pipeline %s
Pipeline stage %s skipped, reason=%s
Shutdown handler not registered because Python interpreter is not running in the main thread
Running pipeline stage MKMLDeleter
admin requested tearing down of function_degak_2024-09-26
run pipeline stage %s
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 2.88s
run pipeline %s
Pipeline stage %s skipped, reason=%s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_fafot_2024-09-26
Running pipeline stage MKMLModelDeleter
Running pipeline stage MKMLDeleter
run pipeline stage %s
run pipeline stage %s
Pipeline stage MKMLDeleter completed in 2.76s
run pipeline %s
Pipeline stage %s skipped, reason=%s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_femik_2024-09-26
Pipeline stage %s skipped, reason=%s
Running pipeline stage MKMLModelDeleter
Running pipeline stage MKMLDeleter
run pipeline stage %s
Shutdown handler de-registered
Pipeline stage MKMLModelDeleter completed in 3.32s
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_jisob_2024-09-26
Pipeline stage MKMLDeleter completed in 4.53s
Pipeline stage %s skipped, reason=%s
Pipeline stage %s skipped, reason=%s
Running pipeline stage MKMLModelDeleter
function_degak_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of function_keneb_2024-09-26
Pipeline stage MKMLDeleter completed in 4.81s
Pipeline stage %s skipped, reason=%s
chaiml-0916-intent-suppo_6584_v5 status is now torndown due to DeploymentManager action
function_fafot_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
Running pipeline stage MKMLModelDeleter
admin requested tearing down of function_sulib_2024-09-26
Shutdown handler de-registered
run pipeline stage %s
Pipeline stage MKMLModelDeleter completed in 4.95s
function_femik_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
run pipeline %s
chaiml-0926-nemo-virgo-t_1582_v1 status is now torndown due to DeploymentManager action
Running pipeline stage MKMLModelDeleter
Shutdown handler de-registered
Pipeline stage MKMLModelDeleter completed in 3.40s
function_jisob_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
Pipeline stage %s skipped, reason=%s
admin requested tearing down of meta-llama-llama-3-1-8b-_7331_v1
chaiml-0926-nemo-virgo-t_3956_v6 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
Shutdown handler de-registered
function_keneb_2024-09-26 status is now torndown due to DeploymentManager action
run pipeline %s
Pipeline stage MKMLModelDeleter completed in 3.51s
Shutdown handler not registered because Python interpreter is not running in the main thread
chaiml-0926-nemo-virgo-t_3956_v7 status is now torndown due to DeploymentManager action
function_sulib_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler de-registered
Shutdown handler de-registered
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v111
admin requested tearing down of mistralai-mistral-nemo_9330_v111
chaiml-0926-nemo-virgo-t_5421_v1 status is now torndown due to DeploymentManager action
function_tudub_2024-09-26 status is now torndown due to DeploymentManager action
run pipeline %s
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v111
run pipeline stage %s
run pipeline %s
chaiml-0926-nemo-virgo-t_5421_v1 status is now torndown due to DeploymentManager action
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v112
admin requested tearing down of mistralai-mistral-nemo_9330_v110
Shutdown handler de-registered
chaiml-0926-nemo-virgo-t_5421_v1 status is now torndown due to DeploymentManager action
function_sulib_2024-09-26 status is now torndown due to DeploymentManager action
Shutdown handler not registered because Python interpreter is not running in the main thread
admin requested tearing down of mistralai-mistral-nemo_9330_v111
run pipeline stage %s