README.md
Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@ MaxFactor is best described as a thoughtful integration of existing optimization
The optimizer makes practical engineering tradeoffs that work well empirically for speech recognition models. Its particular combination of approaches addresses practical challenges in training large speech and multimodal LLMs.
-
Why another optimizer? The architecture will serve as the foundation for the experimental Frequency-Adaptive Momentum (FAM) approach, which aims to leverage the inherent frequency structure of speech data in the optimization process itself.
+
Why another optimizer? The architecture will serve as the foundation for the experimental Frequency-Adaptive Momentum (FAM) approach, which aims to leverage the inherent frequency structure of speech data in the optimization process itself. Each characteristic was selected based on empirical evidence suggesting it works well for ASR/NLP models.