Compressing chat-based models such as Llama-2-7B-chat and Mistral-7B-Instruct-v0.2

Hi @tuidan ,

Thanks for open-sourcing this work. Really appreciate it.

I am wondering whether I can compress chat-based models using this approach as is, or whether I need to take any additional steps.

Looking forward to your reply, thanks.