CoffeeVampir3/NorMuon

 
 


Muon: An optimizer for the hidden layers of neural networks

Tentative implementation of NorMuon from https://arxiv.org/abs/2510.05491

Currently, only a single-device variant is implemented:

SingleDeviceNorMuonWithAuxAdam(param_groups)
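
A minimal sketch of how `param_groups` might be assembled, assuming this fork keeps the parameter-group convention of the upstream Muon `...WithAuxAdam` optimizers: one group flagged `use_muon=True` for hidden-layer weight matrices, and one flagged `use_muon=False` for everything else (embeddings, heads, gains, biases). The helper `build_param_groups`, the `"body."` name prefix, and the hyperparameter values are hypothetical placeholders, not part of this repository. Check the optimizer source for the keys it actually expects.

```python
from types import SimpleNamespace


def build_param_groups(named_params, hidden_prefix="body."):
    """Split (name, param) pairs into a NorMuon group and an auxiliary Adam group.

    Hypothetical helper: assumes hidden-layer parameters share `hidden_prefix`
    and that only matrices (ndim >= 2) should be handled by NorMuon.
    """
    muon_params, adam_params = [], []
    for name, p in named_params:
        if name.startswith(hidden_prefix) and p.ndim >= 2:
            muon_params.append(p)   # hidden weight matrices
        else:
            adam_params.append(p)   # embeddings, heads, gains, biases
    return [
        # Assumed group keys, mirroring upstream Muon; values are illustrative.
        dict(params=muon_params, use_muon=True, lr=0.02, weight_decay=0.01),
        dict(params=adam_params, use_muon=False, lr=3e-4,
             betas=(0.9, 0.95), weight_decay=0.01),
    ]


# Stand-ins for tensors so the sketch runs without PyTorch.
# With a real model, pass model.named_parameters() instead.
fake_params = [
    ("embed.weight", SimpleNamespace(ndim=2)),
    ("body.0.weight", SimpleNamespace(ndim=2)),
    ("body.0.bias", SimpleNamespace(ndim=1)),
    ("head.weight", SimpleNamespace(ndim=2)),
]
param_groups = build_param_groups(fake_params)

# The optimizer would then be constructed as shown above (the import path
# depends on how the package is installed):
# optimizer = SingleDeviceNorMuonWithAuxAdam(param_groups)
```

Routing by name prefix as well as by `ndim` keeps embedding and output matrices out of the NorMuon group even though they are two-dimensional, which matches the "hidden layers only" scope stated in the title.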

Original Muon implementation by:

Citation

@misc{jordan2024muon,
  author       = {Keller Jordan and Yuchen Jin and Vlado Boza and You Jiacheng and
                  Franz Cesista and Laker Newhouse and Jeremy Bernstein},
  title        = {Muon: An optimizer for hidden layers in neural networks},
  year         = {2024},
  url          = {https://kellerjordan.github.io/posts/muon/}
}
