pytorch-optimizer v3.4.0
Change Log
Feature
- Implement `FOCUS` optimizer. (#330, #331)
- Implement `PSGD Kron` optimizer. (#336, #337)
- Implement `EXAdam` optimizer. (#338, #339) (see the usage sketch after this list)
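A minimal sketch of trying one of the newly added optimizers. The export name (`EXAdam`) and the `lr` keyword are assumptions based on this change log and the library's usual conventions; check the installed version for the exact signatures and default hyper-parameters.

```python
import torch

# Assumed export name for one of the optimizers introduced in this release;
# FOCUS and Kron (PSGD Kron) would be imported the same way.
from pytorch_optimizer import EXAdam

model = torch.nn.Linear(10, 1)
optimizer = EXAdam(model.parameters(), lr=1e-3)  # `lr` keyword assumed

for _ in range(10):
    optimizer.zero_grad()
    loss = model(torch.randn(8, 10)).pow(2).mean()
    loss.backward()
    optimizer.step()
```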
Update
- Support the `OrthoGrad` variant in `Ranger25`. (#332) (see the sketch after this list)
  - `Ranger25` is my experimentally crafted optimizer, which mixes many optimizer variants such as `ADOPT` + `AdEMAMix` + `Cautious` + `StableAdamW` + `Adam-Atan2` + `OrthoGrad`.
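A sketch of enabling the `OrthoGrad` behaviour together with `Ranger25`. The `orthograd=True` keyword is a hypothetical parameter name used for illustration only; the actual constructor argument may differ, so check the `Ranger25` signature in your installed version.

```python
import torch
from pytorch_optimizer import Ranger25  # assumed export name

model = torch.nn.Linear(10, 1)

# `orthograd=True` is a hypothetical keyword standing in for however the
# OrthoGrad variant is actually toggled; verify against the real signature.
optimizer = Ranger25(model.parameters(), lr=1e-3, orthograd=True)
```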
Fix
- Add the missing `state` property in the `OrthoGrad` optimizer. (#326, #327)
- Add the missing `state_dict` and `load_state_dict` methods to the `TRAC` and `OrthoGrad` optimizers. (#332) (see the save/restore sketch after this list)
- Skip sparse gradients in the `OrthoGrad` optimizer. (#332)
- Support alternative-precision training in the `SOAP` optimizer. (#333)
- Store `SOAP` condition matrices in the dtype of their parameters. (#335)
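A sketch of the state round-trip that the added `state_dict` / `load_state_dict` methods make possible for the wrapper optimizers. Constructing `OrthoGrad` around an already-built inner optimizer is an assumption about its constructor; the real signature may take the parameters or the optimizer class instead.

```python
import torch
from pytorch_optimizer import OrthoGrad  # wrapper optimizer

model = torch.nn.Linear(10, 1)

# Assumed constructor form: wrapping an existing optimizer instance.
optimizer = OrthoGrad(torch.optim.AdamW(model.parameters(), lr=1e-3))

# One training step so there is state to serialize.
optimizer.zero_grad()
model(torch.randn(4, 10)).pow(2).mean().backward()
optimizer.step()

# Save and restore the optimizer state (now supported per #332).
snapshot = optimizer.state_dict()
optimizer.load_state_dict(snapshot)
```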
Contributions
Thanks to @Vectorrent and @kylevedder.