Pre first release cleanup #71
Conversation
Signed-off-by: Hao Wu <[email protected]>
/ok to test fb57fe5
Force-pushed fb57fe5 to 9190b79.
Force-pushed 9190b79 to 5144a6e.
emerging_optimizers/orthogonalized_optimizers/orthogonalized_optimizer.py
WeightDecayT = Literal["decoupled", "independent", "l2"]

class WeightDecayMixin:
What is the reason for this to be a class rather than a set of functions chosen based on arguments?
Weight decay is highly coupled with the optimizer and shared among many optimizer subclasses.
I thought about an optimizer base class with this function that everyone would inherit from, but not all of our optimizers use the same base.
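The mixin approach discussed here can be sketched as follows. This is a hypothetical illustration, not the repository's actual implementation: the method name `apply_weight_decay` and the scalar signature are assumptions (a real version would operate on parameter tensors), but it shows how the three `WeightDecayT` modes could live in one class that any optimizer can mix in regardless of its base class.

```python
from typing import Literal

WeightDecayT = Literal["decoupled", "independent", "l2"]


class WeightDecayMixin:
    """Hypothetical sketch: shared weight-decay logic that optimizer
    subclasses can mix in even when they do not share a base class.
    Scalars are used for illustration; a real implementation would
    operate on parameter tensors."""

    def apply_weight_decay(
        self,
        param: float,
        grad: float,
        lr: float,
        weight_decay: float,
        mode: WeightDecayT,
    ) -> tuple[float, float]:
        """Return the (possibly decayed) parameter and gradient."""
        if mode == "l2":
            # Classic L2 regularization: fold the decay term into the gradient.
            return param, grad + weight_decay * param
        if mode == "decoupled":
            # AdamW-style decoupled decay: shrink the parameter by lr * wd,
            # leaving the gradient untouched.
            return param * (1.0 - lr * weight_decay), grad
        if mode == "independent":
            # Decay at its own rate, independent of the learning rate.
            return param * (1.0 - weight_decay), grad
        raise ValueError(f"unknown weight decay mode: {mode!r}")
```

A mixin like this keeps the mode dispatch in one place while still letting each optimizer decide where in its step the decay is applied.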
/ok to test 832ae49
/ok to test da5be01
FDecaYed left a comment:
LGTM
@mkhona-nvidia @FDecaYed please take a look; there will be another doc-only cleanup before we make a release, after which changing things would become a maintenance burden.
@mkhona-nvidia coverage of the scalar optimizers and SOAP with adaptive criteria is low; it would be good to improve. It can also be done after the release, so it is not mandatory.