pytorch-optimizer v2.2.1
Change Log
Feature
- Support `max_grad_norm` (Adan optimizer)
- Support gradient averaging (Lamb optimizer)
- Support `dampening`, `nesterov` parameters (Lars optimizer) (usage sketch below)
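As a usage sketch only: the class names and the `max_grad_norm`, `grad_averaging`, `dampening`, and `nesterov` keywords below are assumed from the entries above, so check the released signatures before relying on them.

```python
# Usage sketch; keyword names are taken from the changelog entries above
# and may differ from the released signatures.
import torch
from pytorch_optimizer import Adan, Lamb, LARS

model = torch.nn.Linear(10, 2)

# Adan: clip the global gradient norm before the update (assumed keyword).
adan = Adan(model.parameters(), lr=1e-3, max_grad_norm=1.0)

# Lamb: enable gradient averaging (assumed keyword).
lamb = Lamb(model.parameters(), lr=1e-3, grad_averaging=True)

# Lars: dampening and Nesterov momentum (assumed keywords).
lars = LARS(model.parameters(), lr=1e-2, momentum=0.9, dampening=0.0, nesterov=True)
```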
Refactor
- move `step` parameter from `state` to `group` (to reduce computation cost & memory)
- load `betas` by group, not as a parameter
- change to in-place operations (see the sketch after this list)
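The following is an illustrative sketch, not the library's actual code, of what these refactors mean inside a custom `torch.optim.Optimizer`: the step counter kept in the param group instead of per-parameter state, `betas` read from the group, and buffers updated in place.

```python
# Illustrative sketch of the refactor ideas only, not the library's implementation.
import torch


class SketchOptimizer(torch.optim.Optimizer):
    def __init__(self, params, lr: float = 1e-3, betas=(0.9, 0.999)):
        defaults = {'lr': lr, 'betas': betas, 'step': 0}
        super().__init__(params, defaults)

    @torch.no_grad()
    def step(self, closure=None):
        for group in self.param_groups:
            group['step'] += 1         # one counter per group, not one per parameter
            beta1, _ = group['betas']  # betas read from the group
            for p in group['params']:
                if p.grad is None:
                    continue
                state = self.state[p]
                if len(state) == 0:
                    state['exp_avg'] = torch.zeros_like(p)
                # in-place updates avoid allocating temporary tensors
                state['exp_avg'].mul_(beta1).add_(p.grad, alpha=1.0 - beta1)
                p.add_(state['exp_avg'], alpha=-group['lr'])
```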
Fix
- fix when `momentum` is 0 (Lars optimizer) (see the sketch below)
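A minimal sketch of the code path this fix covers, assuming the `LARS` class name and `momentum` keyword; the fix itself is internal to the optimizer.

```python
# Sketch of the momentum=0 path addressed by this fix; class/keyword names assumed.
import torch
from pytorch_optimizer import LARS

model = torch.nn.Linear(10, 2)
optimizer = LARS(model.parameters(), lr=1e-2, momentum=0.0)

model(torch.randn(4, 10)).sum().backward()
optimizer.step()
optimizer.zero_grad()
```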