pytorch_optimizer/optimizer — 1 file changed, +2 −2 lines changed

@@ -14,8 +14,8 @@ class AdaFactor(Optimizer, BaseOptimizer):
 
     :param params: PARAMETERS. iterable of parameters to optimize or dicts defining parameter groups.
     :param lr: float. learning rate.
-    :param betas: Union[BETAS, None]. coefficients used for computing running averages of gradient and the squared
-        hessian trace. if betas is None, first momentum will be skipped.
+    :param betas: BETAS. coefficients used for computing running averages of gradient and the squared
+        hessian trace. if beta1 is None, first momentum will be skipped.
    :param decay_rate: float. coefficient used to compute running averages of square gradient.
    :param weight_decay: float. weight decay (L2 penalty).
    :param weight_decouple: bool. the optimizer uses decoupled weight decay as in AdamW.
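The docstring change clarifies that only the first element of `betas` controls whether first momentum is kept: passing `None` as `beta1` skips the first-momentum buffer. A minimal sketch of both usages, assuming `pytorch_optimizer` exposes `AdaFactor` at the package top level and that defaults match the docstring above:

```python
# Sketch only: exact defaults and keyword names may differ between versions.
import torch
from pytorch_optimizer import AdaFactor

model = torch.nn.Linear(4, 2)

# With a beta1 coefficient: an exponential moving average of the gradient
# (first momentum) is maintained.
opt_with_momentum = AdaFactor(model.parameters(), lr=1e-3, betas=(0.9, 0.999))

# With beta1 set to None, the first-momentum buffer is skipped entirely,
# which is the behavior the updated docstring describes.
opt_no_momentum = AdaFactor(model.parameters(), lr=1e-3, betas=(None, 0.999))
```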