@mkhona-nvidia
Contributor

@mkhona-nvidia mkhona-nvidia commented Oct 1, 2025

Based on @leloykun's spectral clipping algorithm

Also updated the Polar Express Newton-Schulz coefficients to the "stable" version described in the paper, excluding the last entry.
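
For readers unfamiliar with the idea, here is a minimal sketch of what a "spectral clipped" weight decay could look like. It is an illustration under stated assumptions, not the code in this PR: it uses an explicit SVD for clarity (the actual implementation is not shown in this conversation and may differ, for example by using Newton-Schulz iterations), and the function names, the `sigma_max` threshold, and the update rule are all hypothetical.

```python
import numpy as np


def spectral_clip(w, sigma_min, sigma_max):
    """Clip the singular values of w to [sigma_min, sigma_max] (explicit SVD)."""
    u, s, vt = np.linalg.svd(w, full_matrices=False)
    s_clipped = np.clip(s, sigma_min, sigma_max)
    # Scale each left singular vector (column of u) by its clipped singular value.
    return (u * s_clipped) @ vt


def spectral_clipped_weight_decay(w, lr, weight_decay, sigma_max):
    """Hypothetical update: decay only the part of w above sigma_max.

    Assumption for illustration: w <- w - lr * weight_decay * (w - clip(w)).
    Singular values at or below sigma_max are left unchanged; larger ones
    are pulled toward sigma_max.
    """
    clipped = spectral_clip(w, 0.0, sigma_max)
    return w - lr * weight_decay * (w - clipped)


if __name__ == "__main__":
    w = np.diag([3.0, 0.5])
    w_new = spectral_clipped_weight_decay(w, lr=0.1, weight_decay=1.0, sigma_max=1.0)
    print(np.linalg.svd(w_new, compute_uv=False))
```

In this toy case the singular value 3.0 exceeds the threshold and shrinks to 2.8, while 0.5 is untouched, which is the contrast with ordinary weight decay that shrinks every entry uniformly.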

@copy-pr-bot

copy-pr-bot bot commented Oct 1, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@mkhona-nvidia mkhona-nvidia changed the title from "Adding "spectral clipped" weight decay to be used together with matrix-based preconditioning optimizers" to "[Draft] Adding "spectral clipped" weight decay to be used together with matrix-based preconditioning optimizers" Oct 1, 2025
@mkhona-nvidia mkhona-nvidia changed the title from "[Draft] Adding "spectral clipped" weight decay to be used together with matrix-based preconditioning optimizers" to "Adding "spectral clipped" weight decay to be used together with matrix-based preconditioning optimizers" Oct 3, 2025
@mkhona-nvidia mkhona-nvidia requested a review from skyw October 3, 2025 17:00
@mkhona-nvidia mkhona-nvidia force-pushed the mkhona/spectral_clipping_weight_decay branch from 8e5d24e to a00c443 Compare October 3, 2025 22:00
pablo-garay and others added 17 commits October 3, 2025 15:00
Signed-off-by: mikail <[email protected]>
* save unnecessary matmul

Signed-off-by: Hao Wu <[email protected]>

* simplify criteria logic

Signed-off-by: Hao Wu <[email protected]>

* remove max precondition dim

Signed-off-by: Hao Wu <[email protected]>
Signed-off-by: mikail <[email protected]>
Signed-off-by: mikail <[email protected]>
Signed-off-by: mikail <[email protected]>
Signed-off-by: mikail <[email protected]>
Signed-off-by: mikail <[email protected]>
Signed-off-by: mikail <[email protected]>
Signed-off-by: mikail <[email protected]>
Signed-off-by: mikail <[email protected]>
@mkhona-nvidia mkhona-nvidia force-pushed the mkhona/spectral_clipping_weight_decay branch from a00c443 to 4afbc39 Compare October 3, 2025 22:00
@mkhona-nvidia mkhona-nvidia requested a review from a team as a code owner October 3, 2025 22:00
@mkhona-nvidia
Contributor Author

/ok to test 4afbc39

skyw added 2 commits October 3, 2025 15:26
Use math equation in some of the docstrings.

Get rid of long lines. Modified some unnecessary content.

No function change.

Signed-off-by: Hao Wu <[email protected]>
* Update eigenvalue criteria testing to save compute

Signed-off-by: Hao Wu <[email protected]>

* remove deprecated function in __all__

Signed-off-by: Hao Wu <[email protected]>

---------

Signed-off-by: Hao Wu <[email protected]>
@mkhona-nvidia mkhona-nvidia deleted the mkhona/spectral_clipping_weight_decay branch October 15, 2025 02:59
3 participants