 
 ## The reasons why you use `pytorch-optimizer`.
 
-* Wide range of supported optimizers. Currently, **108 optimizers (+ `bitsandbytes`, `qgalore`, `torchao`)**, **16 lr schedulers**, and **13 loss functions** are supported!
+* Wide range of supported optimizers. Currently, **110 optimizers (+ `bitsandbytes`, `qgalore`, `torchao`)**, **16 lr schedulers**, and **13 loss functions** are supported!
   * Including many variants such as `ADOPT`, `Cautious`, `AdamD`, `StableAdamW`, and `Gradient Centralization`
 * Easy to use, clean, and tested code
 * Active maintenance
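Taken together with the `get_supported_optimizers(['adam*', 'ranger*'])` call visible in the next hunk's context, the bullets above imply a registry-style API for discovering and constructing optimizers. Below is a minimal sketch of that workflow, assuming the `get_supported_optimizers` and `load_optimizer` helpers behave as in the project's usage examples; the lowercase registry name `'adamuon'` is an assumption and is not confirmed by this diff.

```python
# Minimal usage sketch for pytorch-optimizer (assumptions noted inline).
import torch
from pytorch_optimizer import get_supported_optimizers, load_optimizer

# List registered optimizers; a wildcard filter narrows the result
# (this call signature appears in the hunk context below).
print(get_supported_optimizers(['adam*', 'ranger*']))

model = torch.nn.Linear(10, 2)

# Look up an optimizer class by name and use it like any torch.optim optimizer.
# 'adamuon' is an assumed registry name for the newly added AdaMuon entry.
optimizer_cls = load_optimizer('adamuon')
optimizer = optimizer_cls(model.parameters(), lr=1e-3)

loss = model(torch.randn(4, 10)).sum()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```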
@@ -215,7 +215,9 @@ get_supported_optimizers(['adam*', 'ranger*']) |
 | RACS & Alice | *Towards Efficient Optimizer Design for LLM via Structured Fisher Approximation with a Low-Rank Extension* | | <https://arxiv.org/pdf/2502.07752> | [cite](https://ui.adsabs.harvard.edu/abs/2025arXiv250207752G/exportcitation) |
 | VSGD | *Variational Stochastic Gradient Descent for Deep Neural Networks* | [github](https://github.com/generativeai-tue/vsgd) | <https://openreview.net/forum?id=xu4ATNjcdy> | [cite](https://github.com/generativeai-tue/vsgd/tree/main?tab=readme-ov-file#cite) |
 | SNSM | *Subset-Norm and Subspace-Momentum: Faster Memory-Efficient Adaptive Optimization with Convergence Guarantees* | [github](https://github.com/timmytonga/sn-sm) | <https://arxiv.org/abs/2411.07120> | [cite](https://ui.adsabs.harvard.edu/abs/2024arXiv241107120N/exportcitation) |
-| AdamC | Why Gradients Rapidly Increase Near the End of Training* | | <https://arxiv.org/abs/2506.02285> | [cite](https://ui.adsabs.harvard.edu/abs/2025arXiv250602285D/exportcitation) |
+| AdamC | *Why Gradients Rapidly Increase Near the End of Training* | | <https://arxiv.org/abs/2506.02285> | [cite](https://ui.adsabs.harvard.edu/abs/2025arXiv250602285D/exportcitation) |
+| AdaMuon | *Adaptive Muon Optimizer* | | <https://arxiv.org/abs/2507.11005v1> | [cite](https://ui.adsabs.harvard.edu/abs/2025arXiv250711005S/exportcitation) |
+| SPlus | *A Stable Whitening Optimizer for Efficient Neural Network Training* | [github](https://github.com/kvfrans/splus) | <https://arxiv.org/abs/2506.07254> | [cite](https://ui.adsabs.harvard.edu/abs/2025arXiv250607254F/exportcitation) |
 
 ## Supported LR Scheduler
 