README.md (+3 −1)
@@ -10,7 +10,7 @@
 
 ## The reasons why you use `pytorch-optimizer`.
 
-* Wide range of supported optimizers. Currently, **99 optimizers (+ `bitsandbytes`, `qgalore`, `torchao`)**, **16 lr schedulers**, and **13 loss functions** are supported!
+* Wide range of supported optimizers. Currently, **100 optimizers (+ `bitsandbytes`, `qgalore`, `torchao`)**, **16 lr schedulers**, and **13 loss functions** are supported!
 * Including many variants such as `ADOPT`, `Cautious`, `AdamD`, `StableAdamW`, and `Gradient Centralization`
@@ -208,2 +208,4 @@
 | LookSAM |*Towards Efficient and Scalable Sharpness-Aware Minimization*|[github](https://github.com/rollovd/LookSAM)|<https://arxiv.org/abs/2203.02714>|[cite](https://ui.adsabs.harvard.edu/abs/2022arXiv220302714L/exportcitation)|
 | SCION |*Training Deep Learning Models with Norm-Constrained LMOs*||<https://arxiv.org/abs/2502.07529>|[cite](https://ui.adsabs.harvard.edu/abs/2025arXiv250207529P/exportcitation)|
+| COSMOS |*SOAP with Muon*|[github](https://github.com/lliu606/COSMOS)|||
+| StableSPAM |*How to Train in 4-Bit More Stably than 16-Bit Adam*|[github](https://github.com/TianjinYellow/StableSPAM)|<https://arxiv.org/abs/2502.17055>||
docs/index.md (+3 −1)
@@ -10,7 +10,7 @@
 
 ## The reasons why you use `pytorch-optimizer`.
 
-* Wide range of supported optimizers. Currently, **99 optimizers (+ `bitsandbytes`, `qgalore`, `torchao`)**, **16 lr schedulers**, and **13 loss functions** are supported!
+* Wide range of supported optimizers. Currently, **100 optimizers (+ `bitsandbytes`, `qgalore`, `torchao`)**, **16 lr schedulers**, and **13 loss functions** are supported!
 * Including many variants such as `ADOPT`, `Cautious`, `AdamD`, `StableAdamW`, and `Gradient Centralization`
@@ -208,2 +208,4 @@
 | LookSAM |*Towards Efficient and Scalable Sharpness-Aware Minimization*|[github](https://github.com/rollovd/LookSAM)|<https://arxiv.org/abs/2203.02714>|[cite](https://ui.adsabs.harvard.edu/abs/2022arXiv220302714L/exportcitation)|
 | SCION |*Training Deep Learning Models with Norm-Constrained LMOs*||<https://arxiv.org/abs/2502.07529>|[cite](https://ui.adsabs.harvard.edu/abs/2025arXiv250207529P/exportcitation)|
+| COSMOS |*SOAP with Muon*|[github](https://github.com/lliu606/COSMOS)|||
+| StableSPAM |*How to Train in 4-Bit More Stably than 16-Bit Adam*|[github](https://github.com/TianjinYellow/StableSPAM)|<https://arxiv.org/abs/2502.17055>||
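For context on how the optimizers counted in the bullet above are consumed, `pytorch-optimizer` exposes each registered optimizer both as a direct import and via a string lookup with `load_optimizer`. A minimal sketch, assuming the newly added StableSPAM entry is registered under the key `'stablespam'` (the registry key is an assumption; it is not confirmed by this diff):

```python
import torch

from pytorch_optimizer import load_optimizer

# A tiny model just to have parameters to optimize.
model = torch.nn.Linear(4, 2)

# load_optimizer() resolves a registered optimizer class by name;
# 'stablespam' is an assumed key for the StableSPAM entry added in this diff.
optimizer = load_optimizer(optimizer='stablespam')(model.parameters(), lr=1e-3)

# Standard PyTorch training step.
loss = model(torch.randn(8, 4)).pow(2).mean()
loss.backward()
optimizer.step()
optimizer.zero_grad()
```

Any of the other registered names (e.g. the existing entries shown in the table above) can be substituted for the lookup key in the same way.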