
Commit b475f41

Merge pull request #3 from Dawn-Of-Eve/bhavnicksm/changes
Bhavnicksm/changes

2 parents 5fd39ca + 67aa19f

File tree

5 files changed (+25, -10 lines)


CONTRIBUTING/README.md

Whitespace-only changes.

CONTRIBUTING/template.md

Whitespace-only changes.

README.md

Lines changed: 24 additions & 9 deletions
@@ -10,9 +10,12 @@ If this repository has been useful to you in your research, please cite it using
 - [Survey Papers]()
 - [First-order Optimizers](#first-order-optimizers)
 - [Adaptive Optimizers](#adaptive-optimizers)
+- [Adam family of Optimizers](#adam-family-of-optimizers)
 - [Second-order Optimizers](#second-order-optimizers)
-- [Optimizer Agnostic Improvements](#optimizer-agnostic-improvements)
-
+- [Other Optimization-related Research](#other-optimisation-related-research)
+- [General Improvements](#general-improvements)
+- [Optimizer Analysis](#optimizer-analysis-and-meta-research)
+- [Hyperparameter tuning](#hyperparameter-tuning)
 ### Legend

 | Symbol | Meaning |
@@ -31,6 +34,9 @@ If this repository has been useful to you in your research, please cite it using

 ## First-order Optimizers

+- [Nesterov Accelerated Gradient momentum](https://jlmelville.github.io/mize/nesterov.html) [:outbox_tray:]() [:computer:]()
+Yuri Nesterov; _Unknown_
+
 - [KOALA: A Kalman Optimization Algorithm with Loss Adaptivity](https://arxiv.org/abs/2107.03331) [:outbox_tray:]() [:computer:]()
 Aram Davtyan, Sepehr Sameni, Llukman Cerkezi, Givi Meishvilli, Adam Bielski, Paolo Favaro; 2021

@@ -39,22 +45,31 @@ If this repository has been useful to you in your research, please cite it using
 - [RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf) [:outbox_tray:]() [:computer:]()
 Geoffrey Hinton; 2013

+### Adam Family of Optimizers
+
 - [Adam: A Method for Stochastic Optimization](https://arxiv.org/abs/1412.6980) [:outbox_tray:]() [:computer:]()
 Diederik P. Kingma, Jimmy Ba; 2014

 ## Second-order Optimizers

-## Optimizer-agnostic improvements
+- [Shampoo: Preconditioned Stochastic Tensor Optimization](https://arxiv.org/abs/1802.09568) [:outbox_tray:]() [:computer:]()
+Vineet Gupta, Tomer Koren, Yoram Singer

+
+## Other Optimisation-Related Research
+
+### General Improvements
 - [Gradient Centralization: A New Optimization Technique for Deep Neural Networks](https://arxiv.org/abs/2004.01461) [:outbox_tray:]() [:computer:]()
 Hongwei Yong, Jianqiang Huang, Xiansheng Hua, Lei Zhang; 2020


-- [Gradient Descent: The Ultimate Optimizer](https://arxiv.org/abs/1909.13371) [:outbox_tray:]() [:computer:]()
-Kartik Chandra, Audrey Xie, Jonathan Ragan-Kelley, Erik Meijer; 2019
-
+### Optimizer Analysis and Meta-research
+- [On Empirical Comparisons of Optimizers for Deep Learning](https://arxiv.org/abs/1910.05446) [:outbox_tray:]()
+Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl; 2019

-## Optimizers' Analysis and Meta-research
+- [Adam Can Converge Without Any Modification on Update Rules](https://arxiv.org/abs/2208.09632) [:outbox_tray:](survey/adam-can-converge.md)
+Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhi-Quan Luo; 2022

-- [On Empirical Comparisons of Optimizers for Deep Learning](https://arxiv.org/abs/1910.05446) [:outbox_tray:]()
-Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl; 2019
+### Hyperparameter Tuning
+- [Gradient Descent: The Ultimate Optimizer](https://arxiv.org/abs/1909.13371) [:outbox_tray:]() [:computer:]()
+Kartik Chandra, Audrey Xie, Jonathan Ragan-Kelley, Erik Meijer; 2019

survey/adam-can-converge.md

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ This paper suggests that:
 1. For $β_2$ is large enough and $β_1 < √β_2,$ Adam converges to the neighborhood of critical points.
 2. For any fixed $n$, there exists a function such that, Adam diverges to infinity when $(β_1, β_2)$ is picked in the red region.

-![adam-can-converge](adam-can-converg.png)
+![adam-can-converge](assets/adam-can-converg.png)


 3. There is a phase transition from divergence to convergence when changing $β_2$.
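
For context on the $(β_1, β_2)$ conditions discussed in the survey page above, here is a minimal Python sketch of the standard Adam update from Kingma & Ba (2014). It is an illustration only, not code from this commit or repository; the function name, toy objective, and hyperparameter values are assumptions, though the defaults shown ($β_1 = 0.9$, $β_2 = 0.999$) do satisfy the paper's $β_1 < √β_2$ condition.

```python
# Editor's illustration (not part of this commit): a minimal NumPy sketch of the
# standard Adam update from Kingma & Ba (2014). Names and the toy objective are
# illustrative assumptions, not this repository's code.
import numpy as np

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step. The defaults satisfy the convergence condition discussed
    above: beta1 = 0.9 < sqrt(beta2) = sqrt(0.999) ≈ 0.9995."""
    m = beta1 * m + (1 - beta1) * grad       # first moment (momentum), decayed by beta1
    v = beta2 * v + (1 - beta2) * grad**2    # second moment, decayed by beta2
    m_hat = m / (1 - beta1**t)               # bias correction for zero-initialized m
    v_hat = v / (1 - beta2**t)               # bias correction for zero-initialized v
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy usage: minimize f(theta) = theta^2, whose gradient is 2 * theta.
theta, m, v = np.array([5.0]), np.zeros(1), np.zeros(1)
for t in range(1, 501):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t, lr=0.1)
print(theta)  # has moved from 5.0 toward the minimum at 0
```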
