@mkhona-nvidia
Contributor

Added an optimizer for normalized weights, i.e. weights whose columns or rows have unit norm. The set of such matrices is known as the oblique manifold, and this optimizer performs Riemannian gradient descent on it (see An Introduction to Optimization on Smooth Manifolds by Nicolas Boumal, https://www.nicolasboumal.net/book/, for details).
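
For readers unfamiliar with the approach, one Riemannian gradient step on the oblique manifold can be sketched roughly as below. This is a minimal, dependency-free illustration and not the code added in this PR: the function name `oblique_sgd_step`, the list-of-columns layout, and the learning rate are assumptions made for the example. Each column is treated as a point on the unit sphere: the Euclidean gradient is projected onto the tangent space at that point, a descent step is taken, and the result is renormalized (a retraction).

```python
import math


def oblique_sgd_step(columns, grads, lr=0.1):
    """One Riemannian SGD step with unit-norm columns (hypothetical helper).

    columns: list of unit-norm columns, each a list of floats.
    grads:   Euclidean gradients, same shape as columns.
    """
    updated = []
    for x, g in zip(columns, grads):
        # Project the Euclidean gradient onto the tangent space of the
        # unit sphere at x by removing its radial component.
        radial = sum(xi * gi for xi, gi in zip(x, g))
        riem_grad = [gi - radial * xi for xi, gi in zip(x, g)]
        # Take a descent step in the tangent direction.
        stepped = [xi - lr * ri for xi, ri in zip(x, riem_grad)]
        # Retract back onto the sphere by renormalizing the column.
        norm = math.sqrt(sum(s * s for s in stepped))
        updated.append([s / norm for s in stepped])
    return updated


# Two unit-norm columns of a 2x2 matrix and made-up gradients.
cols = [[1.0, 0.0], [0.6, 0.8]]
grads = [[0.5, -1.0], [0.2, 0.1]]
new_cols = oblique_sgd_step(cols, grads)
col_norms = [math.sqrt(sum(v * v for v in c)) for c in new_cols]
```

Because the projected gradient is orthogonal to the column, the stepped vector always has norm at least 1, so the renormalization never divides by zero. A tensor-based version would apply the same three operations to all columns at once.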

@copy-pr-bot

copy-pr-bot bot commented Oct 2, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@mkhona-nvidia mkhona-nvidia changed the title Mkhona/normalized opt Normalized Riemannian optimizer Oct 3, 2025
@mkhona-nvidia mkhona-nvidia requested a review from skyw October 3, 2025 17:00
Contributor

@skyw skyw left a comment

Haven't finished reviewing all the tests. Will do another pass when the requested changes are done.

@mkhona-nvidia mkhona-nvidia force-pushed the mkhona/normalized_opt branch from 2f8543f to dd2194d Compare October 3, 2025 22:38
Contributor

@skyw skyw left a comment

A couple of leftovers still need fixing, the parameter for example.

I didn't check the convergence test very carefully. Also, L1 tests haven't been set up in CI, so post a local run result in the chat before final approval.

@mkhona-nvidia
Contributor Author

> A couple of leftovers still need fixing, the parameter for example.
>
> I didn't check the convergence test very carefully. Also, L1 tests haven't been set up in CI, so post a local run result in the chat before final approval.

Here are the results of the convergence test:

Running tests under Python 3.12.8: /Users/mkhona/miniconda3/envs/pytorch_env/bin/python
[ RUN ] NormalizedOptimizerConvergenceTest.test_oblique_adam_convergence
[ OK ] NormalizedOptimizerConvergenceTest.test_oblique_adam_convergence
[ RUN ] NormalizedOptimizerConvergenceTest.test_oblique_sgd_convergence
[ OK ] NormalizedOptimizerConvergenceTest.test_oblique_sgd_convergence
[ RUN ] NormalizedOptimizerConvergenceTest.test_optimizer_modes_convergence_adam_col
Final accuracy: 89.2
[ OK ] NormalizedOptimizerConvergenceTest.test_optimizer_modes_convergence_adam_col
[ RUN ] NormalizedOptimizerConvergenceTest.test_optimizer_modes_convergence_adam_row
Final accuracy: 80.4
[ OK ] NormalizedOptimizerConvergenceTest.test_optimizer_modes_convergence_adam_row
[ RUN ] NormalizedOptimizerConvergenceTest.test_optimizer_modes_convergence_sgd_col
Final accuracy: 74.8
[ OK ] NormalizedOptimizerConvergenceTest.test_optimizer_modes_convergence_sgd_col
[ RUN ] NormalizedOptimizerConvergenceTest.test_optimizer_modes_convergence_sgd_row
Final accuracy: 100.0
[ OK ] NormalizedOptimizerConvergenceTest.test_optimizer_modes_convergence_sgd_row

@mkhona-nvidia mkhona-nvidia force-pushed the mkhona/normalized_opt branch from b9a92ce to 25c7835 Compare October 4, 2025 00:04
@mkhona-nvidia
Contributor Author

/ok to test 25c7835

@mkhona-nvidia
Contributor Author

/ok to test f87162d

@mkhona-nvidia mkhona-nvidia force-pushed the mkhona/normalized_opt branch from c4db61a to 6eab498 Compare October 7, 2025 18:38
@mkhona-nvidia
Contributor Author

/ok to test 6eab498

Contributor

@skyw skyw left a comment

Rerun the tests after addressing the last few comments, plz.

# Set seed for CUDA if available
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(1234)
self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
Contributor

Don't auto-select the device, especially in tests. If the test is accidentally assigned to a machine without a GPU, it can end up never being tested on GPU. Same for the other test cases.
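
One way to act on this, sketched under assumptions: take the device as an explicit setting and fail loudly when it is unavailable, rather than falling back to CPU. The helper name `resolve_device` and its signature are hypothetical and not part of this PR. In a real test, `cuda_available` would come from `torch.cuda.is_available()` and the returned name would be passed to `torch.device`; both are kept outside the sketch so it runs without PyTorch or a GPU.

```python
def resolve_device(requested: str, cuda_available: bool) -> str:
    """Return the requested device name or raise (hypothetical helper).

    No silent CPU fallback: a GPU test scheduled on a machine without a
    GPU fails visibly instead of passing on the CPU.
    """
    if requested not in ("cpu", "cuda"):
        raise ValueError(f"unknown device: {requested!r}")
    if requested == "cuda" and not cuda_available:
        raise RuntimeError("CUDA was requested but is not available")
    return requested


# A GPU run on a GPU machine resolves normally.
ok = resolve_device("cuda", cuda_available=True)

# The same request on a CPU-only machine is reported as an error.
try:
    resolve_device("cuda", cuda_available=False)
    failed_loudly = False
except RuntimeError:
    failed_loudly = True
```

With this shape, a misrouted GPU job shows up as a failure in CI instead of a green run that never touched the GPU.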

@mkhona-nvidia
Contributor Author

/ok to test f97ef77

@skyw skyw merged commit fb1add8 into NVIDIA-NeMo:main Oct 7, 2025
12 checks passed
mkhona-nvidia added a commit to mkhona-nvidia/Emerging-Optimizers that referenced this pull request Oct 7, 2025
* added normalized optimizers and fixed docstrings and formatting

Signed-off-by: mikail <[email protected]>
mkhona-nvidia added a commit to mkhona-nvidia/Emerging-Optimizers that referenced this pull request Oct 7, 2025
* added normalized optimizers and fixed docstrings and formatting

Signed-off-by: mikail <[email protected]>
@mkhona-nvidia mkhona-nvidia deleted the mkhona/normalized_opt branch October 15, 2025 02:58