Remove the type ParamSpaceSGD
#205
Conversation
Hi @yebai, could you check to make sure that this is what you asked for? Personally, I feel the …
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
```julia
elseif alg isa KLMinScoreGradDescent
    return KLMinScoreGradDescentState(prob, q_init, 0, grad_buf, opt_st, obj_st, avg_st)
else
    nothing
```
Maybe throw a warning or error message here instead of letting it fail silently?
It should never hit the `else` condition, so let me use `InvalidStateException`.
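A minimal sketch of that fix, assuming a hypothetical `init_state` wrapper around the branch shown in the diff (the function name, error message, and state symbol are illustrative, not the PR's actual code):

```julia
# Hedged sketch: fail loudly in the fallback branch instead of silently
# returning `nothing`. `KLMinScoreGradDescent` and its state constructor are
# the names from the diff above; everything else is made up for illustration.
function init_state(alg, prob, q_init, grad_buf, opt_st, obj_st, avg_st)
    if alg isa KLMinScoreGradDescent
        return KLMinScoreGradDescentState(prob, q_init, 0, grad_buf, opt_st, obj_st, avg_st)
    else
        # This branch should be unreachable for supported algorithms.
        throw(InvalidStateException("unsupported algorithm type $(typeof(alg))", :unsupported))
    end
end
```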
```julia
    prob, re(params), iteration, grad_buf, opt_st, obj_st, avg_st
)
else
    nothing
```
Same as above.
```julia
    obj_st::ObjSt
    avg_st::AvgSt
end

const ParamSpaceSGD = Union{
```
Suggested change:

```julia
"""
This family of algorithms (`<:KLMinRepGradDescent`, `<:KLMinRepGradProxDescent`, `<:KLMinScoreGradDescent`) applies stochastic gradient descent (SGD) to the variational `objective` over the (Euclidean) space of variational parameters.
The trainable parameters in the variational approximation are expected to be extractable through `Optimisers.destructure`.
This requires the variational approximation to be marked as a functor through `Functors.@functor`.
"""
const ParamSpaceSGD = Union{
```
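To make the requirement in the suggested docstring concrete, here is a hedged, self-contained sketch; the `MeanFieldGaussian` type is made up for illustration and is not the package's actual variational family:

```julia
using Functors, Optimisers

# A made-up variational family. Marking it as a functor lets
# Optimisers.destructure extract and rebuild its trainable parameters.
struct MeanFieldGaussian{T}
    location::Vector{T}
    log_scale::Vector{T}
end
Functors.@functor MeanFieldGaussian

q = MeanFieldGaussian(zeros(2), zeros(2))
params, re = Optimisers.destructure(q)  # flat parameter vector and reconstructor
q_updated = re(params .+ 0.1)           # rebuild the approximation from new parameters
```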
Thanks @Red-Portal -- I left some comments above. In addition, let's simplify the folder structure a bit for clarity:
- move all files in `paramspacesgd` to `algorithms`, e.g., "algorithms/paramspacesgd/constructors.jl" to "algorithms/constructors.jl"
- keep each algorithm in its own file

Also, I'd suggest we consider renaming `paramspacesgd.jl` to `interface.jl` or something along those lines:
- "algorithms/paramspacesgd/paramspacesgd.jl" to "algorithms/interface.jl"
Hi Hong, I planned to do the restructuring in a separate PR to keep things simple in this one. Though: after the release of v0.5, we'll be adding algorithms that don't conform to the original `ParamSpaceSGD` …
It is okay to keep all algorithms under …

You can add more algorithms to …
I am saying that these new algorithms can't be grouped in …

"grouping" refers to grouping interface code together for similar VI algorithms in the proposed …
If I understand right, this PR flattens the existing `ParamSpaceSGD` type hierarchy. Before we drop it, may I understand what concrete benefits the flattening delivers? In particular, are we planning to add other algorithm families alongside the current `ParamSpaceSGD`? At the moment, I haven't quite convinced myself that the flattening of the type hierarchy is necessary.
I suggested the removal of the `ParamSpaceSGD` type …
@yebai @sunxd3 Thanks for chiming in. Actually, I have a new idea. So I believe the main complaint at the moment is that the term `ParamSpaceSGD` is non-standard. In a nutshell, the nice thing about the current design is that …

… or something along these lines? Would that resolve your concern?
I think I see your point. But I am not sure that helps. …

That probably includes every learning algorithm in ML.
Yes, that is indeed almost true! But the point is that there are a couple of important algorithms that don't quite conform to this formalism, as they require a custom update rule: they don't fall out of a gradient estimator, but modify the parameter update step too. So this is the reason I wish to allow for two different abstraction levels. But as you said, most algorithms only require defining a gradient estimator, so the lower-level interface helps unify the code for all those algorithms.
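To make the two abstraction levels concrete, here is a hedged, self-contained sketch; all names and the toy gradient/proximal operators are illustrative, not AdvancedVI's actual interface:

```julia
# Illustrative two-level interface: most algorithms only supply a gradient
# estimator and inherit a default update; a few override the update itself.
abstract type AbstractVIAlgorithm end

struct RepGradAlg  <: AbstractVIAlgorithm end   # only defines a gradient estimator
struct ProxGradAlg <: AbstractVIAlgorithm end   # also customizes the update rule

# Toy gradient estimator shared by both algorithms for this sketch.
estimate_gradient(::AbstractVIAlgorithm, params) = 2 .* params

# Level 1: the parameter update falls out of the gradient estimator alone.
update_params(alg::AbstractVIAlgorithm, params; lr=0.01) =
    params .- lr .* estimate_gradient(alg, params)

# Level 2: an algorithm with a custom rule overrides the update step itself,
# here with a toy soft-thresholding (proximal) operator after the gradient step.
prox(x) = sign.(x) .* max.(abs.(x) .- 0.005, 0)
update_params(alg::ProxGradAlg, params; lr=0.01) =
    prox(params .- lr .* estimate_gradient(alg, params))

update_params(RepGradAlg(), [1.0, -2.0])   # plain SGD step
update_params(ProxGradAlg(), [1.0, -2.0])  # SGD step followed by prox
```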
We are at risk of premature abstraction and introducing heuristic terminology here. It is better to work with concrete algorithms, and define a union type if sharing code is needed (e.g., …).

There might be some insights we can learn by taking a unifying view of parameter-space gradient descent VI, but that is a discussion we should have offline for a review paper.
My main beef with using Unions here is the following: …

With that said, do you find the solution below still unsatisfying? At least I hope that this resolves your concern that the terminology is non-standard.

If you think we should still go with an implicit interface, then I'll follow for the sake of moving forward.
This PR removes the use of the type `ParamSpaceSGD`, which provides a unifying implementation of VI algorithms that run SGD in parameter space. Instead, each parameter-space-SGD-based VI algorithm becomes its own `AbstractVariationalAlgorithm`, and the code implementing `step` is shared by dispatching over their `Union`.

This addresses #204
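As a hedged sketch of that pattern (the algorithm type names follow the ones mentioned in this thread, but the empty structs and the `step` body are placeholders, not the PR's actual code):

```julia
abstract type AbstractVariationalAlgorithm end

# Each algorithm is now its own subtype of AbstractVariationalAlgorithm.
struct KLMinRepGradDescent     <: AbstractVariationalAlgorithm end
struct KLMinRepGradProxDescent <: AbstractVariationalAlgorithm end
struct KLMinScoreGradDescent   <: AbstractVariationalAlgorithm end

# Shared code dispatches over their Union instead of an abstract supertype.
const ParamSpaceSGD = Union{
    KLMinRepGradDescent, KLMinRepGradProxDescent, KLMinScoreGradDescent
}

# One `step` method covers every member of the Union.
function step(alg::ParamSpaceSGD, state)
    # ... the shared SGD-in-parameter-space update would go here ...
    return state
end
```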