Description
It would be great if the NF4 version came with an accuracy recovery adapter. It might bring the quality up to the level of the INT8 model without adding much RAM/VRAM usage.
Here is some info about the idea:
https://huggingface.co/ostris/accuracy_recovery_adapters
https://www.reddit.com/r/LocalLLaMA/comments/1mytbfz/accuracy_recovery_adapter_with_selfgenerated_data/
I think the idea originally comes from Apple's work on on-device models. It's similar to quantization-aware training.
I wonder how expensive it would be to make one for this model. What do you think?