Skip to content

Conversation

@skyw
Copy link
Contributor

@skyw skyw commented Nov 5, 2025

  • Slightly updated test.
  • Force keyword arguments for optimizers that have very long argument list, otherwise error prone
  • Abstract weight decay to single mixin class
  • Some code style fix/improvement

@mkhona-nvidia @FDecaYed please take a look, there will be another doc only clean up before we making a release, after which things changing things would become maintenance burden.

@mkhona-nvidia coverage of scalar optimizers and soap with adaptative criteria is low, would be good to improve. It can also be done after release, so not mandatory.

@copy-pr-bot
Copy link

copy-pr-bot bot commented Nov 5, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

@skyw
Copy link
Contributor Author

skyw commented Nov 5, 2025

/ok to test fb57fe5

mkhona-nvidia
mkhona-nvidia previously approved these changes Nov 5, 2025
@skyw skyw force-pushed the skyw/prerelease_cleanup branch from 9190b79 to 5144a6e Compare November 6, 2025 20:00
WeightDecayT = Literal["decoupled", "independent", "l2"]


class WeightDecayMixin:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for this to be a class rather than a set of functions that are chosen based on arguments?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

weight decay is a function highly coupled with optimizer, and shared among a lot of optim subclass.

Thought about a optimizer based class with this function and make everyone inherit from it, but not all of our optimizers use the same base.

@skyw skyw marked this pull request as ready for review November 6, 2025 23:24
@skyw skyw requested a review from a team as a code owner November 6, 2025 23:24
@skyw
Copy link
Contributor Author

skyw commented Nov 6, 2025

/ok to test 832ae49

Signed-off-by: Hao Wu <[email protected]>
@mkhona-nvidia mkhona-nvidia self-requested a review November 7, 2025 16:40
@chtruong814
Copy link
Contributor

/ok to test da5be01

@skyw skyw merged commit daacec6 into main Nov 7, 2025
21 of 23 checks passed
@skyw skyw deleted the skyw/prerelease_cleanup branch November 7, 2025 23:07
Copy link

@FDecaYed FDecaYed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants