Commit dc79a7c

docs: Useful Resources
1 parent 8420556 commit dc79a7c

File tree: 1 file changed

README.rst: 1 addition & 14 deletions
@@ -315,45 +315,32 @@ Lookahead
 | ``k`` steps forward, 1 step back. ``Lookahead`` consists of keeping an exponential moving average of the weights that is
 | updated and substituted for the current weights every ``k_{lookahead}`` steps (5 by default).
 
-- code : `github <https://github.com/alphadl/lookahead.pytorch>`__
-- paper : `arXiv <https://arxiv.org/abs/1907.08610v2>`__
-
 Chebyshev learning rate schedule
 --------------------------------
 
 Acceleration via Fractal Learning Rate Schedules.
 
-- paper : `arXiv <https://arxiv.org/abs/2103.01338v1>`__
-
 (Adaptive) Sharpness-Aware Minimization
 ---------------------------------------
 
 | Sharpness-Aware Minimization (SAM) simultaneously minimizes loss value and loss sharpness.
 | In particular, it seeks parameters that lie in neighborhoods having uniformly low loss.
 
-- SAM paper : `paper <https://arxiv.org/abs/2010.01412>`__
-- ASAM paper : `paper <https://arxiv.org/abs/2102.11600>`__
-- A/SAM code : `github <https://github.com/davda54/sam>`__
-
 On the Convergence of Adam and Beyond
 -------------------------------------
 
-- paper : `paper <https://openreview.net/forum?id=ryQu7f-RZ>`__
+| Convergence issues can be fixed by endowing such algorithms with 'long-term memory' of past gradients.
 
 Improved bias-correction in Adam
 --------------------------------
 
 | With the default bias-correction, Adam may actually make larger than requested gradient updates early in training.
 
-- paper : `arXiv <https://arxiv.org/abs/2110.10828>`_
-
 Adaptive Gradient Norm Correction
 ---------------------------------
 
 | Correcting the norm of gradient in each iteration based on the adaptive training history of gradient norm.
 
-- paper : `arXiv <https://arxiv.org/abs/2210.06364>`__
-
 Citation
 --------
 
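The Lookahead entry in the diff above ("k steps forward, 1 step back", with an exponential moving average of the weights substituted back every ``k`` steps) can be sketched in a few lines. This is an illustrative toy, not the linked lookahead.pytorch implementation: `lookahead_minimize`, the plain gradient step used as the inner "fast" optimizer, and all parameter values are invented for the example.

```python
def lookahead_minimize(grad_fn, w0, lr=0.1, k=5, alpha=0.5, steps=100):
    """Minimize a 1-D objective with SGD wrapped in a Lookahead loop.

    grad_fn: gradient of the objective at w
    w0:      initial parameter value
    k:       synchronization period (k steps forward, 1 step back)
    alpha:   slow-weights step size (the EMA coefficient)
    """
    slow = fast = w0
    for t in range(1, steps + 1):
        fast -= lr * grad_fn(fast)          # inner (fast) optimizer step
        if t % k == 0:                      # every k steps ...
            slow += alpha * (fast - slow)   # ... move the slow (EMA) weights
            fast = slow                     # ... and substitute them back
    return slow

# Minimizing f(w) = (w - 3)^2 converges toward the minimum at w = 3.
w = lookahead_minimize(lambda w: 2.0 * (w - 3.0), w0=0.0)
```

Each synchronization contracts the slow weights toward the fast ones, so the iterate approaches the minimizer even though the fast optimizer is reset every ``k`` steps.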

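The SAM entry above (seek parameters whose whole neighborhood has low loss) likewise admits a short sketch: ascend to the worst point in a small ball around the current weights, then apply the gradient taken there to the original weights. This is a hedged 1-D illustration of that two-step update, not the linked davda54/sam code; `sam_step`, `rho`, and the quadratic objective are made up for the example.

```python
def sam_step(grad_fn, w, lr=0.1, rho=0.05):
    """One Sharpness-Aware Minimization update on a scalar parameter.

    Step 1: climb to (approximately) the worst point in a rho-ball
            around w, following the gradient direction.
    Step 2: compute the gradient at that perturbed point, but apply
            the descent step to the original w.
    """
    g = grad_fn(w)
    eps = rho * g / (abs(g) + 1e-12)   # ascent step of norm ~rho
    g_sharp = grad_fn(w + eps)         # gradient at the perturbed point
    return w - lr * g_sharp

# Minimizing f(w) = (w - 3)^2: iterates settle near the minimum w = 3,
# up to an O(rho) neighborhood caused by the deliberate perturbation.
w = 5.0
for _ in range(200):
    w = sam_step(lambda v: 2.0 * (v - 3.0), w)
```

The adaptive variant (ASAM) rescales the perturbation per parameter; the idea of "gradient at the perturbed point, step at the original point" is the same.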