@@ -10,9 +10,12 @@ If this repository has been useful to you in your research, please cite it using
 - [Survey Papers](#survey-papers)
 - [First-order Optimizers](#first-order-optimizers)
   - [Adaptive Optimizers](#adaptive-optimizers)
+  - [Adam Family of Optimizers](#adam-family-of-optimizers)
 - [Second-order Optimizers](#second-order-optimizers)
-- [Optimizer Agnostic Improvements](#optimizer-agnostic-improvements)
-
+- [Other Optimization-Related Research](#other-optimization-related-research)
+  - [General Improvements](#general-improvements)
+  - [Optimizer Analysis and Meta-research](#optimizer-analysis-and-meta-research)
+  - [Hyperparameter Tuning](#hyperparameter-tuning)
 ### Legend
 
 | Symbol | Meaning |
@@ -31,6 +34,9 @@ If this repository has been useful to you in your research, please cite it using
 
 ## First-order Optimizers
 
37+ - [ Nesterov Accelerated Gradient momentum] ( https://jlmelville.github.io/mize/nesterov.html ) [ :outbox_tray : ] ( ) [ :computer : ] ( )
38+ Yuri Nesterov; _ Unknown_
39+
 - [KOALA: A Kalman Optimization Algorithm with Loss Adaptivity](https://arxiv.org/abs/2107.03331) [:outbox_tray:]() [:computer:]()
   Aram Davtyan, Sepehr Sameni, Llukman Cerkezi, Givi Meishvili, Adam Bielski, Paolo Favaro; 2021
 
@@ -39,22 +45,31 @@ If this repository has been useful to you in your research, please cite it using
 - [RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf) [:outbox_tray:]() [:computer:]()
   Geoffrey Hinton; 2013
 
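+A minimal NumPy sketch of the Nesterov look-ahead step from the "Nesterov Accelerated Gradient momentum" entry above; the toy quadratic objective and the learning-rate/momentum constants are illustrative assumptions, not values from the original paper:
+
+```python
+import numpy as np
+
+def nag_step(theta, velocity, grad_fn, lr=0.01, mu=0.9):
+    """One NAG step: evaluate the gradient at the look-ahead point
+    theta + mu * velocity instead of at theta itself."""
+    velocity = mu * velocity - lr * grad_fn(theta + mu * velocity)
+    return theta + velocity, velocity
+
+# Toy usage: minimize f(x) = 0.5 * x.x, whose gradient is x.
+theta, v = np.ones(3), np.zeros(3)
+for _ in range(100):
+    theta, v = nag_step(theta, v, grad_fn=lambda x: x)
+print(theta)  # approaches the minimizer at 0
+```
+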
+### Adam Family of Optimizers
+
 - [Adam: A Method for Stochastic Optimization](https://arxiv.org/abs/1412.6980) [:outbox_tray:]() [:computer:]()
   Diederik P. Kingma, Jimmy Ba; 2014
 
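+A minimal sketch of the Adam update from the entry above, using the paper's default hyperparameters (beta1=0.9, beta2=0.999, eps=1e-8): first-moment momentum plus RMSProp-style second-moment scaling, with bias correction for the zero-initialized moments. The toy objective is an illustrative assumption:
+
+```python
+import numpy as np
+
+def adam_step(theta, grad, m, v, t, lr=1e-3,
+              beta1=0.9, beta2=0.999, eps=1e-8):
+    """One Adam step with bias-corrected first and second moments."""
+    m = beta1 * m + (1 - beta1) * grad          # momentum (1st moment)
+    v = beta2 * v + (1 - beta2) * grad ** 2     # RMSProp-style 2nd moment
+    m_hat = m / (1 - beta1 ** t)                # undo zero-init bias
+    v_hat = v / (1 - beta2 ** t)
+    return theta - lr * m_hat / (np.sqrt(v_hat) + eps), m, v
+
+# Toy usage: minimize f(x) = 0.5 * x.x, whose gradient is x.
+theta, m, v = np.ones(3), np.zeros(3), np.zeros(3)
+for t in range(1, 2001):
+    theta, m, v = adam_step(theta, grad=theta, m=m, v=v, t=t)
+print(theta)  # approaches 0
+```
+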
 ## Second-order Optimizers
 
-## Optimizer-agnostic improvements
+- [Shampoo: Preconditioned Stochastic Tensor Optimization](https://arxiv.org/abs/1802.09568) [:outbox_tray:]() [:computer:]()
+  Vineet Gupta, Tomer Koren, Yoram Singer; 2018
 
+
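+A compact sketch of the Shampoo update for a single matrix parameter, following the paper's matrix case: left and right second-moment statistics precondition the gradient on both sides through inverse fourth roots. The eigendecomposition-based root, identity initialization, and toy objective are simplifying assumptions for illustration:
+
+```python
+import numpy as np
+
+def inv_fourth_root(mat, eps=1e-6):
+    """mat^(-1/4) for a symmetric PSD matrix, via eigendecomposition."""
+    w, q = np.linalg.eigh(mat)
+    return q @ np.diag((w + eps) ** -0.25) @ q.T
+
+def shampoo_step(theta, grad, L, R, lr=0.1):
+    """One Shampoo step: accumulate row (L) and column (R) statistics,
+    then precondition the gradient on both sides."""
+    L = L + grad @ grad.T
+    R = R + grad.T @ grad
+    return theta - lr * inv_fourth_root(L) @ grad @ inv_fourth_root(R), L, R
+
+# Toy usage: minimize f(W) = 0.5 * ||W||_F^2, whose gradient is W.
+theta = np.ones((4, 3))
+L, R = np.eye(4), np.eye(3)  # the paper initializes with eps * I
+for _ in range(200):
+    theta, L, R = shampoo_step(theta, grad=theta, L=L, R=R)
+print(np.linalg.norm(theta))  # norm shrinks toward 0
+```
+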
+## Other Optimization-Related Research
+
+### General Improvements
 - [Gradient Centralization: A New Optimization Technique for Deep Neural Networks](https://arxiv.org/abs/2004.01461) [:outbox_tray:]() [:computer:]()
   Hongwei Yong, Jianqiang Huang, Xiansheng Hua, Lei Zhang; 2020
 
 
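+A minimal sketch of the gradient-centralization operation from the entry above: before the optimizer consumes a weight-tensor gradient, subtract its mean so each slice is zero-centered. Treating axis 0 as the fan-in axis of a dense-layer weight matrix, and the plain-SGD usage, are illustrative assumptions (which axes to average over depends on the layer type):
+
+```python
+import numpy as np
+
+def centralize_gradient(grad, axis=0):
+    """Remove the mean of the gradient along the given axis so each
+    slice of the centralized gradient sums to zero."""
+    return grad - grad.mean(axis=axis, keepdims=True)
+
+# Toy usage: one plain-SGD step on a dense-layer weight matrix.
+rng = np.random.default_rng(0)
+W = rng.normal(size=(16, 8))             # (fan_in, fan_out), an assumption
+grad = rng.normal(size=W.shape)
+W -= 0.01 * centralize_gradient(grad)
+print(centralize_gradient(grad).mean(axis=0))  # ~0 in every column
+```
+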
-- [Gradient Descent: The Ultimate Optimizer](https://arxiv.org/abs/1909.13371) [:outbox_tray:]() [:computer:]()
-  Kartik Chandra, Audrey Xie, Jonathan Ragan-Kelley, Erik Meijer; 2019
-
+### Optimizer Analysis and Meta-research
+- [On Empirical Comparisons of Optimizers for Deep Learning](https://arxiv.org/abs/1910.05446) [:outbox_tray:]()
+  Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl; 2019
 
-## Optimizers' Analysis and Meta-research
+- [Adam Can Converge Without Any Modification on Update Rules](https://arxiv.org/abs/2208.09632) [:outbox_tray:](survey/adam-can-converge.md)
+  Yushun Zhang, Congliang Chen, Naichen Shi, Ruoyu Sun, Zhi-Quan Luo; 2022
 
-- [On Empirical Comparisons of Optimizers for Deep Learning](https://arxiv.org/abs/1910.05446) [:outbox_tray:]()
-  Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl; 2019
+### Hyperparameter Tuning
+- [Gradient Descent: The Ultimate Optimizer](https://arxiv.org/abs/1909.13371) [:outbox_tray:]() [:computer:]()
+  Kartik Chandra, Audrey Xie, Jonathan Ragan-Kelley, Erik Meijer; 2019