
Commit d658ec6

Merge pull request #16 from apzl/main
Update README
2 parents: 560c8f9 + 6a50e3e

File tree

1 file changed: +30 -10 lines


README.md

Lines changed: 30 additions & 10 deletions
@@ -19,6 +19,7 @@ If this repository has been useful to you in your research, please cite it using
 - [Legend](#legend)
 - [Survey Papers](#survey-papers)
 - [First-order Optimizers](#first-order-optimizers)
+- [Momentum based Optimizers](#momentum-based-optimizers)
 - [Adaptive Optimizers](#adaptive-optimizers)
 - [Adam Family of Optimizers](#adam-family-of-optimizers)
 - [Second-order Optimizers](#second-order-optimizers)
@@ -32,8 +33,8 @@ If this repository has been useful to you in your research, please cite it using

 | Symbol | Meaning | Count |
 |:--------------|:--------|:------|
-| None | Paper | 11 |
-| :outbox_tray: | Summary | 2 |
+| None | Paper | 17 |
+| :outbox_tray: | Summary | 3 |
 | :computer: | Code | 0 |


@@ -53,36 +54,55 @@ If this repository has been useful to you in your research, please cite it using
 4. [KOALA: A Kalman Optimization Algorithm with Loss Adaptivity](https://arxiv.org/abs/2107.03331) [:outbox_tray:]() [:computer:]()
 Aram Davtyan, Sepehr Sameni, Llukman Cerkezi, Givi Meishvilli, Adam Bielski, Paolo Favaro; 2021

+## Momentum based Optimizers
+
+5. [On the Momentum Term in Gradient Descent Learning Algorithms](https://reader.elsevier.com/reader/sd/pii/S0893608098001166?token=3147494EED9FE670AF728F3408B795675246C9934481200C4E86611D7FE34FAEDDFF1E9BD5C6AE9455320BF21F3FEA3B&originRegion=eu-west-1&originCreation=20230223114928) [:outbox_tray:]() [:computer:]()
+Ning Qian; 1999
+
+6. [Symbolic Discovery of Optimization Algorithms](https://arxiv.org/abs/2302.06675) [:outbox_tray:]() [:computer:]() Xiangning Chen, Chen Liang, Da Huang; 2023
+
+
 ## Adaptive Optimizers

-5. [RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf) [:outbox_tray:]() [:computer:]()
+7. [Adaptive Subgradient Methods for Online Learning and Stochastic Optimization](https://dl.acm.org/doi/10.5555/1953048.2021068) [:outbox_tray:]() [:computer:]() John Duchi, Elad Hazan, Yoram Singer; 2011
+
+8. [ADADELTA: An Adaptive Learning Rate Method](https://arxiv.org/abs/1212.5701) [:outbox_tray:]() [:computer:]()
+Matthew D. Zeiler; 2012
+
+9. [RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf) [:outbox_tray:]() [:computer:]()
 Geoffrey Hinton; 2013

 ## Adam Family of Optimizers

-6. [Adam: A Method for Stochastic Optimization](https://arxiv.org/abs/1412.6980) [:outbox_tray:]() [:computer:]()
+10. [Adam: A Method for Stochastic Optimization](https://arxiv.org/abs/1412.6980) [:outbox_tray:]() [:computer:]()
 Diederik P. Kingma, Jimmy Ba; 2014

+11. [AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights](https://arxiv.org/abs/2006.08217) [:outbox_tray:]() [:computer:]()
+Byeongho Heo, Sanghyuk Chun, Seong Joon Oh, Dongyoon Han; 2020
+
+12. [On the Variance of the Adaptive Learning Rate and Beyond](https://arxiv.org/abs/1908.03265) [:outbox_tray:]() [:computer:]()
+Liyuan Liu, Haoming Jiang, Pengcheng He; 2021
+
 # Second-order Optimizers

-7. [Shampoo: Preconditioned Stochastic Tensor Optimization](https://arxiv.org/abs/1802.09568) [:outbox_tray:]() [:computer:]()
-Vineet Gupta, Tomer Koren, Yoram Singer
+13. [Shampoo: Preconditioned Stochastic Tensor Optimization](https://arxiv.org/abs/1802.09568) [:outbox_tray:]() [:computer:]()
+Vineet Gupta, Tomer Koren, Yoram Singer; 2018


 # Other Optimisation-Related Research

 ## General Improvements
-8. [Gradient Centralization: A New Optimization Technique for Deep Neural Networks](https://arxiv.org/abs/2004.01461) [:outbox_tray:](survey/gradient-centralization.md) [:computer:]()
+14. [Gradient Centralization: A New Optimization Technique for Deep Neural Networks](https://arxiv.org/abs/2004.01461) [:outbox_tray:](survey/gradient-centralization.md) [:computer:]()
 Hongwei Yong, Jianqiang Huang, Xiansheng Hua, Lei Zhang; 2020


 ## Optimizer Analysis and Meta-research
-9. [On Empirical Comparisons of Optimizers for Deep Learning](https://arxiv.org/abs/1910.05446) [:outbox_tray:]()
+15. [On Empirical Comparisons of Optimizers for Deep Learning](https://arxiv.org/abs/1910.05446) [:outbox_tray:]()
 Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl; 2019

-10. [Adam Can Converge Without Any Modification on Update Rules](https://arxiv.org/abs/2208.09632) [:outbox_tray:](survey/adam-can-converge.md)
+16. [Adam Can Converge Without Any Modification on Update Rules](https://arxiv.org/abs/2208.09632) [:outbox_tray:](survey/adam-can-converge.md)
 Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhi-Quan Luo; 2022

 ## Hyperparameter Tuning
-11. [Gradient Descent: The Ultimate Optimizer](https://arxiv.org/abs/1909.13371) [:outbox_tray:]() [:computer:]()
+17. [Gradient Descent: The Ultimate Optimizer](https://arxiv.org/abs/1909.13371) [:outbox_tray:]() [:computer:]()
 Kartik Chandra, Audrey Xie, Jonathan Ragan-Kelley, Erik Meijer; 2019
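For readers skimming the diff, the new "Momentum based Optimizers" section points to Qian (1999), which analyses the classical heavy-ball momentum term. The following is a minimal, illustrative NumPy sketch of that update, not code from this repository; the hyperparameter names and defaults (`lr`, `mu`) are assumptions chosen for the example.

```python
import numpy as np

def momentum_step(theta, grad, velocity, lr=0.01, mu=0.9):
    """One classical (heavy-ball) momentum step: the velocity is a decaying
    accumulation of past gradients, which damps oscillations and speeds up
    progress along directions where gradients consistently agree."""
    velocity = mu * velocity - lr * grad
    return theta + velocity, velocity

# Toy usage on f(x) = x^2 (gradient 2x); the iterate should approach 0.
theta, velocity = np.array([5.0]), np.zeros(1)
for _ in range(200):
    theta, velocity = momentum_step(theta, 2.0 * theta, velocity)
```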

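The three additions under "Adaptive Optimizers" (Adaptive Subgradient Methods, ADADELTA, RMSProp) all rescale each parameter's step by a running statistic of its squared gradients. Below is an illustrative sketch of the RMSProp form from Hinton's lecture slides, again with assumed hyperparameter names and defaults rather than repository code.

```python
import numpy as np

def rmsprop_step(theta, grad, sq_avg, lr=1e-3, rho=0.9, eps=1e-8):
    """One RMSProp step: keep an exponentially decaying average of squared
    gradients and divide each parameter's step by its root, so parameters
    with persistently large gradients get smaller effective learning rates."""
    sq_avg = rho * sq_avg + (1.0 - rho) * grad ** 2
    theta = theta - lr * grad / (np.sqrt(sq_avg) + eps)
    return theta, sq_avg
```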

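The "Adam Family of Optimizers" entries build on Adam (Kingma & Ba, 2014), which combines a momentum-style first-moment average with an RMSProp-style second-moment average and bias-corrects both. A minimal sketch of that update, illustrative only, with defaults following the values commonly quoted from the paper:

```python
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step; t is the 1-based step count used for bias correction."""
    m = beta1 * m + (1.0 - beta1) * grad           # first moment (momentum-like)
    v = beta2 * v + (1.0 - beta2) * grad ** 2      # second moment (RMSProp-like)
    m_hat = m / (1.0 - beta1 ** t)                 # bias-corrected estimates
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```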