Update README.md - Added background and usage example #32
README.md
Outdated
### Why They Matter

Shampoo optimizers have demonstrated significant practical impact in large-scale language model training. Most notably, they were used to train the **Kimi K2 model** ([arXiv:2507.20534](https://arxiv.org/abs/2507.20534)), showcasing their effectiveness at scale. These optimizers can:
Kimi used Muon, and it is debatable whether Muon counts as a Shampoo optimizer.
We should probably just change this to "emerging optimizers" so that we don't exclude any future ones.
Updated to mention emerging optimizers.
README.md
Outdated
### Optimizers Included

This project focuses on the following Shampoo class optimizers:
More will come.
@sbhavani could you address the CI failure? @snowmanwwg please also review.
Signed-off-by: Santosh Bhavani <sbhavani@nvidia.com>
…matrix-based preconditioning' Signed-off-by: Santosh Bhavani <sbhavani@nvidia.com>
/ok to test e5a78dd
/ok to test 0408134
* Update README with background, usage examples Signed-off-by: Santosh Bhavani <sbhavani@nvidia.com> Signed-off-by: Pablo Garay <pagaray@nvidia.com>