1 parent 6675dfe commit 5d924c5
docs/changelogs/v3.0.1.md
@@ -4,6 +4,10 @@
 
 * Implement `FAdam` optimizer. (#241, #242)
     * [Adam is a natural gradient optimizer using diagonal empirical Fisher information](https://arxiv.org/abs/2405.12807)
+* Tweak `AdaFactor` optimizer. (#236, #243)
+    * support not-using-first-momentum when beta1 is not given
+    * default dtype for first momentum to `bfloat16`
+    * clip second momentum to 0.999
 
 ### Bug
 
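For reviewers, here is a minimal sketch of how two of the `AdaFactor` tweaks listed in this changelog entry might behave. It is not the library's implementation. The helper names `uses_first_momentum` and `capped_beta2`, the `decay_rate` default of `-0.8`, and the reading of "clip second momentum to 0.999" as an upper cap on a step-dependent decay coefficient are all assumptions made for illustration. The `bfloat16` default is left out so the sketch stays dependency-free.

```python
import math
from typing import Optional, Tuple


def uses_first_momentum(betas: Tuple[Optional[float], float]) -> bool:
    """Hypothetical helper: keep a first-momentum buffer only when beta1 is given."""
    return betas[0] is not None


def capped_beta2(step: int, decay_rate: float = -0.8, cap: float = 0.999) -> float:
    """Hypothetical helper: step-dependent second-moment decay, capped at `cap`.

    Assumes the uncapped schedule is 1 - step ** decay_rate, which is an
    assumption of this sketch and is not stated in the changelog.
    """
    if step < 1:
        raise ValueError("step must be >= 1")
    return min(cap, 1.0 - math.pow(step, decay_rate))


if __name__ == "__main__":
    # With beta1 omitted, no first-momentum state would be allocated.
    print(uses_first_momentum((None, 0.999)))
    # Late in training the uncapped schedule approaches 1.0; the cap holds it at 0.999.
    print(capped_beta2(1_000_000))
```

Under these assumptions, the cap only takes effect once the uncapped schedule exceeds 0.999, so early steps are unaffected.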