11.[Adam: A Method for Stochastic Optimization](https://arxiv.org/abs/1412.6980)[:outbox_tray:]()[:computer:]() (see the update-rule sketch after this list)

Diederik P. Kingma, Jimmy Ba; 2014

12.[AdamP: Slowing Down the Slowdown for Momentum Optimizers on Scale-invariant Weights](https://arxiv.org/abs/2006.08217)[:outbox_tray:]()[:computer:]()

13.[On the Variance of the Adaptive Learning Rate and Beyond](https://arxiv.org/abs/1908.03265)[:outbox_tray:]()[:computer:]()

Liyuan Liu, Haoming Jiang, Pengcheng He; 2021

14.[AdaBelief Optimizer: Adapting Stepsizes by the Belief in Observed Gradients](https://arxiv.org/abs/2010.07468)

Juntang Zhuang, Tommy Tang, Yifan Ding, Sekhar Tatikonda, Nicha Dvornek, Xenophon Papademetris, James S. Duncan; 2020

15.[Momentum Centering and Asynchronous Update for Adaptive Gradient Methods](https://arxiv.org/abs/2110.05454)

Juntang Zhuang, Yifan Ding, Tommy Tang, Nicha Dvornek, Sekhar Tatikonda, James S. Duncan; 2021

17.[Gradient Centralization: A New Optimization Technique for Deep Neural Networks](https://arxiv.org/abs/2004.01461)[:outbox_tray:](survey/gradient-centralization.md)[:computer:]() (see the centralization sketch below)

Hongwei Yong, Jianqiang Huang, Xiansheng Hua, Lei Zhang; 2020
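
Several of the entries above (Adam, AdamP, RAdam, AdaBelief, and the momentum-centering follow-up) build on the same first/second-moment skeleton introduced by Adam. As a point of reference, here is a minimal NumPy sketch of the plain Adam update from Kingma & Ba; the function name, hyperparameter defaults, and the toy quadratic objective are illustrative choices for this sketch, not code from the papers or from this repository.

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update (Kingma & Ba, 2014): exponential moving averages of the
    gradient and its square, bias correction, then an element-wise scaled step."""
    m = beta1 * m + (1 - beta1) * grad          # first moment (mean of gradients)
    v = beta2 * v + (1 - beta2) * grad ** 2     # second moment (uncentered variance)
    m_hat = m / (1 - beta1 ** t)                # bias correction for the warm-up phase
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: minimize f(x) = ||x||^2, whose gradient is 2x.
x = np.array([1.0, -2.0, 3.0])
m = np.zeros_like(x)
v = np.zeros_like(x)
for t in range(1, 501):
    x, m, v = adam_step(x, 2 * x, m, v, t, lr=0.05)
print(x)  # close to the minimum at the origin
```

AdaBelief's key change is to the second-moment line: it accumulates `(grad - m) ** 2` instead of `grad ** 2`, so the step size adapts to how far the gradient deviates from its running mean.
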
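Gradient Centralization (entry 17) is not a standalone optimizer but a transform applied to each weight gradient before the optimizer step: subtract the gradient's mean taken over every axis except the output dimension. A minimal sketch, assuming a PyTorch-style weight layout where the output channel is the first axis; the helper name is illustrative rather than taken from the authors' released code:

```python
import numpy as np

def centralize_gradient(grad):
    """Gradient Centralization (Yong et al., 2020): remove the mean of the
    gradient over all axes except the first (output) axis. Applied to weights
    with rank >= 2; bias vectors are left unchanged."""
    if grad.ndim < 2:
        return grad
    axes = tuple(range(1, grad.ndim))
    return grad - grad.mean(axis=axes, keepdims=True)

# Example: a 4x3 weight gradient; after centralization each row sums to zero.
g = np.random.randn(4, 3)
gc = centralize_gradient(g)
print(np.allclose(gc.sum(axis=1), 0.0))  # True
```

Because it only modifies the gradient, it can be combined with any of the optimizers listed above.
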
## Optimizer Analysis and Meta-research
18.[On Empirical Comparisons of Optimizers for Deep Learning](https://arxiv.org/abs/1910.05446)[:outbox_tray:]()

Dami Choi, Christopher J. Shallue, Zachary Nado, Jaehoon Lee, Chris J. Maddison, George E. Dahl; 2019

19.[Adam Can Converge Without Any Modification on Update Rules](https://arxiv.org/abs/2208.09632)[:outbox_tray:](survey/adam-can-converge.md)