Commit 7bcc873

docs: PCGrad
1 parent 82fe441 commit 7bcc873

File tree

1 file changed (+27, -11 lines)

README.rst

Lines changed: 27 additions & 11 deletions
@@ -73,17 +73,17 @@ of the ideas are applied in ``Ranger21`` optimizer.
 
 Also, most of the captures are taken from ``Ranger21`` paper.
 
-+------------------------------------------+-------------------------------------+--------------------------------------------+
-| `Adaptive Gradient Clipping`_            | `Gradient Centralization`_          | `Softplus Transformation`_                 |
-+------------------------------------------+-------------------------------------+--------------------------------------------+
-| `Gradient Normalization`_                | `Norm Loss`_                        | `Positive-Negative Momentum`_              |
-+------------------------------------------+-------------------------------------+--------------------------------------------+
-| `Linear learning rate warmup`_           | `Stable weight decay`_              | `Explore-exploit learning rate schedule`_  |
-+------------------------------------------+-------------------------------------+--------------------------------------------+
-| `Lookahead`_                             | `Chebyshev learning rate schedule`_ | `(Adaptive) Sharpness-Aware Minimization`_ |
-+------------------------------------------+-------------------------------------+--------------------------------------------+
-| `On the Convergence of Adam and Beyond`_ |                                     |                                            |
-+------------------------------------------+-------------------------------------+--------------------------------------------+
++------------------------------------------+---------------------------------------------+--------------------------------------------+
+| `Adaptive Gradient Clipping`_            | `Gradient Centralization`_                  | `Softplus Transformation`_                 |
++------------------------------------------+---------------------------------------------+--------------------------------------------+
+| `Gradient Normalization`_                | `Norm Loss`_                                | `Positive-Negative Momentum`_              |
++------------------------------------------+---------------------------------------------+--------------------------------------------+
+| `Linear learning rate warmup`_           | `Stable weight decay`_                      | `Explore-exploit learning rate schedule`_  |
++------------------------------------------+---------------------------------------------+--------------------------------------------+
+| `Lookahead`_                             | `Chebyshev learning rate schedule`_         | `(Adaptive) Sharpness-Aware Minimization`_ |
++------------------------------------------+---------------------------------------------+--------------------------------------------+
+| `On the Convergence of Adam and Beyond`_ | `Gradient Surgery for Multi-Task Learning`_ |                                            |
++------------------------------------------+---------------------------------------------+--------------------------------------------+
 
 Adaptive Gradient Clipping
 --------------------------
@@ -195,6 +195,11 @@ On the Convergence of Adam and Beyond
 
 - paper : `paper <https://openreview.net/forum?id=ryQu7f-RZ>`__
 
+Gradient Surgery for Multi-Task Learning
+----------------------------------------
+
+- paper : `paper <https://arxiv.org/abs/2001.06782>`__
+
 Citations
 ---------
 
@@ -430,6 +435,17 @@ On the Convergence of Adam and Beyond
         year={2019}
     }
 
+Gradient Surgery for Multi-Task Learning
+
+::
+
+    @article{yu2020gradient,
+        title={Gradient surgery for multi-task learning},
+        author={Yu, Tianhe and Kumar, Saurabh and Gupta, Abhishek and Levine, Sergey and Hausman, Karol and Finn, Chelsea},
+        journal={arXiv preprint arXiv:2001.06782},
+        year={2020}
+    }
+
 Author
 ------
 
0 commit comments