1 parent 370169c · commit f5027b1
pytorch_optimizer/madgrad.py
@@ -38,7 +38,7 @@ def __init__(
 ):
     """A Momentumized, Adaptive, Dual Averaged Gradient Method for Stochastic (slightly modified)
     :param params: PARAMETERS. iterable of parameters to optimize or dicts defining parameter groups
-    :param lr: float. learning rate.
+    :param lr: float. learning rate
     :param eps: float. term added to the denominator to improve numerical stability
     :param weight_decay: float. weight decay (L2 penalty)
         MADGRAD optimizer requires less weight decay than other methods, often as little as zero
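
For context, a minimal usage sketch of the parameters documented above (not part of the commit). The top-level import path, the toy model, and the hyperparameter values are illustrative assumptions; only params, lr, eps, and weight_decay come from the docstring being edited here.

    import torch
    from pytorch_optimizer import MADGRAD  # assumed import path; the class lives in pytorch_optimizer/madgrad.py

    model = torch.nn.Linear(10, 2)  # toy model for illustration only
    optimizer = MADGRAD(
        model.parameters(),  # params: iterable of parameters to optimize
        lr=1e-2,             # learning rate
        eps=1e-6,            # added to the denominator for numerical stability
        weight_decay=0.0,    # per the docstring, MADGRAD often needs little or none
    )

    # one illustrative optimization step
    loss = model(torch.randn(4, 10)).sum()
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()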