Memory-efficient backprop
This release introduces memory-efficient backprop through frozen weights: the gradient is calculated from the 8-bit weights, with the computation carried out in fp16. This is useful for creating Low-Rank Adapters (LoRA) for fine-tuning large models.
This feature was contributed by @dbaranchuk and @justheuristic.
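Below is a minimal sketch of how this could be used, assuming a typical int8 setup. The layer sizes, the adapter construction, and the `has_fp16_weights=False` and `threshold=6.0` arguments are illustrative assumptions, not taken from this release; only `Linear8bitLt` and `memory_efficient_backward=True` come from the notes above.

```python
import torch
import torch.nn as nn
import bitsandbytes as bnb

def frozen_int8_linear(in_f, out_f):
    # has_fp16_weights=False and threshold=6.0 are assumed, typical int8 settings.
    layer = bnb.nn.Linear8bitLt(
        in_f, out_f,
        bias=False,
        has_fp16_weights=False,
        memory_efficient_backward=True,
        threshold=6.0,
    ).cuda()  # weights are quantized to int8 when the layer is moved to the GPU
    for p in layer.parameters():
        p.requires_grad = False
    return layer

frozen1 = frozen_int8_linear(1024, 1024)
frozen2 = frozen_int8_linear(1024, 1024)

# Low-rank (LoRA-style) adapter on the first layer: only these weights are trained.
adapter = nn.Sequential(
    nn.Linear(1024, 8, bias=False),
    nn.Linear(8, 1024, bias=False),
).half().cuda()

x = torch.randn(4, 1024, dtype=torch.float16, device="cuda")
h = frozen1(x) + adapter(x)   # adapter runs in parallel with the first frozen layer
out = frozen2(h)              # gradient for the adapter must flow back through frozen2
out.float().sum().backward()

print(adapter[0].weight.grad is not None)  # True: gradients reached the adapter
```

Because the backward pass works directly from the 8-bit weights, no fp16 copy of the frozen base weights is needed just to let gradients pass through them to the adapter parameters.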
0.34.0
Bug fixes and memory-efficient backprop
Features:
- Linear8bitLt layer now supports `memory_efficient_backward=True`, which enables backprop of gradients through frozen weights.
Bug fixes:
- fixed an issue where too many threads were created in blockwise quantization on the CPU for large tensors
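For reference, a hedged sketch of the code path this fix affects: blockwise quantization of a large fp32 tensor on the CPU. The tensor size and the round-trip check are illustrative only.

```python
import torch
import bitsandbytes.functional as F

# Large fp32 tensor on the CPU; the fix above concerns the number of threads
# created for this blockwise quantization path.
t = torch.randn(8192, 8192, dtype=torch.float32)
q, state = F.quantize_blockwise(t)      # 8-bit blockwise quantization on the CPU
deq = F.dequantize_blockwise(q, state)  # round-trip back to fp32
print((t - deq).abs().max())            # small quantization error
```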