1 parent 671b902 commit 96c91e6
docs/changelogs/v3.4.3.md renamed to docs/changelogs/v3.5.0.md
@@ -5,6 +5,10 @@
* Support `StableSPAM` optimizer. (#358, #359)
  * [How to Train in 4-Bit More Stably than 16-Bit Adam](https://arxiv.org/abs/2502.17055?)
* Support `ScheduleFreeWrapper`. (#334, #360)
+* Implement `AdaGC` optimizer. (#364, #366)
+ * [Improving Training Stability for Large Language Model Pretraining](https://arxiv.org/abs/2502.11034)
+* Implement `Simplified-Ademamix` optimizer. (#364, #366)
+ * [Connections between Schedule-Free Optimizers, AdEMAMix, and Accelerated SGD Variants](https://arxiv.org/abs/2502.02431)
### Update