pytorch-optimizer v2.4.0
Change Log
Feature
- Implement D-Adaptation optimizers (`DAdaptAdaGrad`, `DAdaptAdam`, `DAdaptSGD`), #101 (usage sketch below this list)
  - Learning-rate-free learning for SGD, AdaGrad, and Adam
  - original implementation: https://github.com/facebookresearch/dadaptation
- Shampoo optimizer (see the sketch below this list)
  - Support `no_preconditioning_for_layers_with_dim_gt` (default 8192)
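For reference, a minimal usage sketch for the new D-Adaptation optimizers. The model, data, and training loop are illustrative only; `DAdaptAdam` is imported from the package root like the other optimizers, and `lr` is typically left at 1.0 since D-Adaptation estimates the step size itself:

```python
import torch
from torch import nn
from torch.nn import functional as F

from pytorch_optimizer import DAdaptAdam

model = nn.Linear(10, 1)

# lr acts as a scale on the adaptively estimated step size,
# so 1.0 is the usual starting point.
optimizer = DAdaptAdam(model.parameters(), lr=1.0)

x, y = torch.randn(64, 10), torch.randn(64, 1)
for _ in range(10):
    optimizer.zero_grad()
    loss = F.mse_loss(model(x), y)
    loss.backward()
    optimizer.step()
```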
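And a sketch for the new Shampoo threshold. Passing it as a constructor keyword is an assumption based on the parameter name; the 8192 default comes from the note above. A layer with a dimension above the threshold would be excluded from preconditioning:

```python
from torch import nn

from pytorch_optimizer import Shampoo

# The 10,000-wide hidden dimension exceeds the stated default
# threshold of 8192, so that layer would skip preconditioning.
model = nn.Sequential(nn.Linear(16, 10_000), nn.Linear(10_000, 1))

# Assumed keyword usage of the flag named in the change log.
optimizer = Shampoo(
    model.parameters(),
    lr=1e-3,
    no_preconditioning_for_layers_with_dim_gt=8192,
)
```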
Improvement
- refactor/improve `matrix_power()`: unroll the loop for performance, #101
- speed-up/fix `power_iter()`: avoid deep-copying `mat_v`, #101 (both changes are sketched below)
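As an illustration of both changes (a standalone sketch of the techniques, not the library's actual Shampoo-utils code): a power iteration that re-binds and normalizes `mat_v` on every step instead of deep-copying it, and an integer matrix power with the hot, small exponents unrolled:

```python
import torch


def power_iter(mat_g: torch.Tensor, num_iters: int = 100) -> torch.Tensor:
    """Estimate the dominant eigenvector of a symmetric matrix `mat_g`.

    `mat_v` is re-bound and normalized in place on every step; no
    defensive deep copy is taken, which is the gist of the speed-up.
    """
    mat_v = torch.randn(mat_g.shape[0], dtype=mat_g.dtype)
    for _ in range(num_iters):
        mat_v = mat_g @ mat_v
        mat_v /= mat_v.norm()
    return mat_v


def matrix_power(mat: torch.Tensor, power: int) -> torch.Tensor:
    """Integer matrix power with the common small exponents unrolled.

    Unrolling avoids per-iteration Python loop overhead for the hot
    cases; other exponents fall back to a plain loop.
    """
    if power == 1:
        return mat
    if power == 2:
        return mat @ mat
    if power == 4:
        mat_2 = mat @ mat
        return mat_2 @ mat_2
    result = mat
    for _ in range(power - 1):
        result = result @ mat
    return result
```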
Docs
- D-Adaptation optimizers & Shampoo utils