VNT Part 3: VarNamedTuple with concretized slices #1181
Merged
Changes from all commits (27 commits, all by mhauru):
- 35c3e20 ArrayLikeBlock WIP
- 4253e9b ArrayLikeBlock WIP2
- 5cb3916 Improve type stability of ArrayLikeBlock stuff
- a96bb44 Test more invariants
- a8014e5 Actually run VNT tests
- cfc6041 Implement show for ArrayLikeBlock
- e198fbb Change keys on VNT to return an array
- b77b0af Fix keys and some tests for PartialArray
- 633e920 Improve type stability
- 222334a Fix keys for PartialArray
- d22face More ArrayLikeBlock tests
- 4cb49e1 Add docstrings
- 420a6b2 Remove redundant code, improve documentation
- d2f58d7 Merge branch 'mhauru/vnt-for-fastldf' into mhauru/arraylikeblock
- ce9da19 Add Base.size(::RangeAndLinked)
- 4eb33e9 Fix issues with RangeAndLinked and VNT
- 51b399a Write more design doc for ArrayLikeBlocks
- 57fd11a Make VNT support concretized slices
- 7bdce5c Start the VNT HISTORY.md entry
- 9992051 Skip a type stability test on 1.10
- 753ca81 Fix test_skip
- d9e5405 Mark a test as broken on 1.10
- 267c554 Trivial bug fix
- 76ac5b6 Use skip rather than broken for an inference test
- 0c50bd7 Fix a docs typo
- 6b211b1 Use floatmin in test_utils
- 57fd84b Narrow a skip clause
```diff
@@ -565,6 +565,71 @@ function varnames(model::Model{typeof(demo_assume_matrix_observe_matrix_index)})
     return [@varname(s), @varname(m)]
 end
 
+@model function demo_nested_colons(
+    x=(; data=[(; subdata=transpose([1.5 2.0;]))]), ::Type{TV}=Array{Float64}
+) where {TV}
+    n = length(x.data[1].subdata)
+    d = n ÷ 2
+    s = (; params=[(; subparams=TV(undef, (d, 1, 2)))])
+    s.params[1].subparams[:, 1, :] ~ reshape(
+        product_distribution(fill(InverseGamma(2, 3), n)), d, 2
+    )
+    s_vec = vec(s.params[1].subparams)
+    # TODO(mhauru) The below element type concretisation is because of
+    # https://github.com/JuliaFolds2/BangBang.jl/issues/39
+    # which causes, when this is evaluated with an untyped VarInfo, s_vec to be an
+    # Array{Any}.
+    s_vec = [x for x in s_vec]
```
**Review comment on lines +578 to +582** (the `s_vec` element-type workaround):

> **Member:** Is untyped varinfo on the way out eventually?
>
> **Member (author):** Yes, hopefully already in #1183.
```diff
+
+    m ~ MvNormal(zeros(n), Diagonal(s_vec))
+
+    x.data[1].subdata[:, 1] ~ MvNormal(m, Diagonal(s_vec))
+
+    return (; s=s, m=m, x=x)
+end
+function logprior_true(model::Model{typeof(demo_nested_colons)}, s, m)
+    n = length(model.args.x.data[1].subdata)
+    # TODO(mhauru) We need to enforce a convention on whether this function gets called
+    # with the parameters as the model returns them, or with the parameters "unpacked".
+    # Currently different tests do different things.
+    s_vec = if s isa NamedTuple
+        vec(s.params[1].subparams)
+    else
+        vec(s)
+    end
+    return loglikelihood(InverseGamma(2, 3), s_vec) +
+           logpdf(MvNormal(zeros(n), Diagonal(s_vec)), m)
+end
+function loglikelihood_true(model::Model{typeof(demo_nested_colons)}, s, m)
+    # TODO(mhauru) We need to enforce a convention on whether this function gets called
+    # with the parameters as the model returns them, or with the parameters "unpacked".
+    # Currently different tests do different things.
+    s_vec = if s isa NamedTuple
+        vec(s.params[1].subparams)
+    else
+        vec(s)
+    end
+    return loglikelihood(MvNormal(m, Diagonal(s_vec)), model.args.x.data[1].subdata)
+end
+function logprior_true_with_logabsdet_jacobian(
+    model::Model{typeof(demo_nested_colons)}, s, m
+)
+    return _demo_logprior_true_with_logabsdet_jacobian(model, s.params[1].subparams, m)
+end
+function varnames(::Model{typeof(demo_nested_colons)})
+    return [
+        @varname(
+            s.params[1].subparams[
+                AbstractPPL.ConcretizedSlice(Base.Slice(Base.OneTo(1))),
+                1,
+                AbstractPPL.ConcretizedSlice(Base.Slice(Base.OneTo(2))),
+            ]
+        ),
+        # @varname(s.params[1].subparams[1,1,1]),
+        # @varname(s.params[1].subparams[1,1,2]),
+        @varname(m),
+    ]
+end
+
 const UnivariateAssumeDemoModels = Union{
     Model{typeof(demo_assume_dot_observe)},
     Model{typeof(demo_assume_dot_observe_literal)},
```
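For context on the expected varname above: `ConcretizedSlice` is what AbstractPPL turns a literal `:` into once the indexed array is known, so that the slice records which axis it covered. A minimal sketch of that step, assuming `AbstractPPL.concretize` and using a standalone array in place of the nested `s.params[1].subparams`:

```julia
using AbstractPPL

subparams = zeros(1, 1, 2)

# With plain colons, the varname carries no information about the axes it spans:
vn = @varname(subparams[:, 1, :])

# Concretizing against the actual array should turn each `:` into a
# `ConcretizedSlice(Base.Slice(Base.OneTo(n)))`, matching the expected
# varname spelled out in the test above.
vn_concrete = AbstractPPL.concretize(vn, subparams)
```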
```diff
@@ -615,8 +680,8 @@ function likelihood_optima(model::MultivariateAssumeDemoModels)
     vals = rand_prior_true(model)
 
     # NOTE: These are "as close to zero as we can get".
-    vals.s[1] = 1e-32
-    vals.s[2] = 1e-32
+    vals.s[1] = floatmin()
+    vals.s[2] = floatmin()
 
     vals.m[1] = 1.5
     vals.m[2] = 2.0
```
```diff
@@ -668,8 +733,8 @@ function likelihood_optima(model::MatrixvariateAssumeDemoModels)
     vals = rand_prior_true(model)
 
     # NOTE: These are "as close to zero as we can get".
-    vals.s[1, 1] = 1e-32
-    vals.s[1, 2] = 1e-32
+    vals.s[1, 1] = floatmin()
+    vals.s[1, 2] = floatmin()
 
     vals.m[1] = 1.5
     vals.m[2] = 2.0
```
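A side note on the `1e-32` → `floatmin()` change in the two hunks above: `floatmin()` returns the smallest positive normal `Float64`, which is far closer to the "as close to zero as we can get" intent than the old hard-coded constant:

```julia
floatmin()          # 2.2250738585072014e-308
floatmin() < 1e-32  # true
```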
```diff
@@ -701,6 +766,51 @@ function rand_prior_true(rng::Random.AbstractRNG, model::MatrixvariateAssumeDemoModels)
     return vals
 end
 
+function posterior_mean(model::Model{typeof(demo_nested_colons)})
+    # Get some containers to fill.
+    vals = rand_prior_true(model)
+
+    vals.s.params[1].subparams[1, 1, 1] = 19 / 8
+    vals.m[1] = 3 / 4
+
+    vals.s.params[1].subparams[1, 1, 2] = 8 / 3
+    vals.m[2] = 1
+
+    return vals
+end
+function likelihood_optima(model::Model{typeof(demo_nested_colons)})
+    # Get some containers to fill.
+    vals = rand_prior_true(model)
+
+    # NOTE: These are "as close to zero as we can get".
+    vals.s.params[1].subparams[1, 1, 1] = floatmin()
+    vals.s.params[1].subparams[1, 1, 2] = floatmin()
+
+    vals.m[1] = 1.5
+    vals.m[2] = 2.0
+
+    return vals
+end
+function posterior_optima(model::Model{typeof(demo_nested_colons)})
+    # Get some containers to fill.
+    vals = rand_prior_true(model)
+
+    # TODO: Figure out exact for `s[1]`.
+    vals.s.params[1].subparams[1, 1, 1] = 0.890625
+    vals.s.params[1].subparams[1, 1, 2] = 1
+    vals.m[1] = 3 / 4
+    vals.m[2] = 1
+
+    return vals
+end
+function rand_prior_true(rng::Random.AbstractRNG, ::Model{typeof(demo_nested_colons)})
+    svec = rand(rng, InverseGamma(2, 3), 2)
+    return (;
+        s=(; params=[(; subparams=reshape(svec, (1, 1, 2)))]),
+        m=rand(rng, MvNormal(zeros(2), Diagonal(svec))),
+    )
+end
+
 """
 A collection of models corresponding to the posterior distribution defined by
 the generative process
```
```diff
@@ -749,6 +859,7 @@ const DEMO_MODELS = (
     demo_dot_assume_observe_submodel(),
     demo_dot_assume_observe_matrix_index(),
     demo_assume_matrix_observe_matrix_index(),
+    demo_nested_colons(),
 )
 
 """
```
**Comment:** It strikes me that a lot of the problems we have here are actually incredibly similar to the problems of creating shadow memory in AD packages. The idea is that inside the model there is an `x`, but inside the VNT there is a shadow copy of `x` as well. cc @yebai
**Comment (reply):** I think one possible general solution to this is going to be something similar to what ForwardDiff does (at least for `AbstractArray`). Technically, it's even closer to a sparsity tracer.

When `x` is initialised to an `OffsetArray{T}`, we want the shadow of `x` (let's call it `dx`) to be initialised to the same `OffsetArray{Dual{T}}`. Each of these 'duals' needs to carry the value, along with the boolean mask indicating whether it was set or not. This is of course exactly the same as how `PartialArray` has one array of values and one array of masks.

Then when `x[idxs...]` is set to whatever `rand(Normal())` returns, we set `dx[idxs...] = Dual(x[idxs...], true)`. Of course, the block elements will need the same workaround as before (complete with all the `typejoin` shenanigans).

Then, reading from the shadow memory becomes way more consistent, because indexing into `dx` carries the same semantics as indexing into `x`.

For another example, let's say that `x` and `dx` are regular `Base.Matrix`. Then indexing into `dx[1, 1]` will be the same as indexing into `dx[1]`. You could even make `dx[1] = ...` error when you try to modify a value that was already set. For `BlockDual`s, you could further 'resolve' or 'normalise' the indices into a standard, 1-based scheme because you now know the type of the container (I think `Base.to_indices` does this, but I'm not sure; in any case this should be easy to figure out). That allows you to use different indexing schemes to read the data, as long as they refer to the same elements.

A scheme like this would also enable 'native support' for not only `OffsetArray`, but also things like `DimensionalData.DimArray`, which probably has way more legitimate uses than `OffsetArray`.

This looks like a lot of work for VNT to do, but we are already duplicating the entire array in the VNT*, so in principle I don't think performance should be any worse. Indexing into this would still be type stable (in the absence of blocks, but that's the same as currently) and you would still be able to do all the checks you need on whether the value was set or not.

In particular, I think that this should be fairly easy to implement for all `AbstractArray`s. (In fact, we know for a fact that it must be possible, because other packages do this successfully.) Extending this to dictionaries, or generally any type that supports `getindex`, would be annoying (for every new type you implement, you have to define the equivalent of a 'tangent type'†); it's doable, but it could be left for a later stage. I think the `AbstractArray` support would already solve 99% of the pain points here, the path forwards to supporting other types would be clear, and when some type isn't supported, you could error with a very clear message saying that "indexing into this type on the LHS of tilde is not supported", whereas now our battle lines are quite arbitrarily drawn.

\* OK, there is one case where this will do a lot more work than VNT currently does: if the user has a very large array, like a length-100 vector, but only ever writes `x[1] ~ ...` and never touches the other elements. I don't really care about optimising performance for such cases. It is impossible to detect them without static analysis (and note that AD has a similar characteristic: if you differentiate `f(x) = x[1]` and pass in a length-100 vector, the gradient calculation will be equally unoptimised).

† Notice that the way VNT normalises all property access into NamedTuples is exactly what Mooncake's `tangent_type` does for structs.
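To make the proposal concrete, here is a minimal sketch of what those duals could look like. Every name in it (`SetDual`, `make_shadow`, `record!`) is hypothetical, not existing DynamicPPL or AbstractPPL API:

```julia
# Hypothetical illustration of the shadow-memory proposal above.
struct SetDual{T}
    value::T
    isset::Bool  # the boolean mask: has this element been set by a tilde statement?
end

# The shadow has the same container type and axes as `x`, so indexing into it
# carries the same semantics as indexing into `x` (OffsetArray offsets included).
make_shadow(x::AbstractArray) = map(v -> SetDual(v, false), x)

# What the VNT would do after a tilde statement samples `val` for `x[idxs...]`.
function record!(dx::AbstractArray{<:SetDual}, val, idxs...)
    dx[idxs...] = SetDual(val, true)
    return dx
end

x = zeros(2, 3)
dx = make_shadow(x)
record!(dx, 1.5, 1, 2)
dx[1, 2].isset  # true
dx[3].isset     # also true: dx[3] is the same element as dx[1, 2] under
                # column-major linear indexing, mirroring the semantics of `x`
```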
**Comment (reply):** (Note that, since `tilde_assume!!` is currently our only 'hook' into modifying the VNT, this means that you have to pass `x` into `tilde_assume!!` as an argument. But I am not opposed to that at all. The alternative would be that you have to inspect the IR to determine when new arrays are being created.)
**Comment (reply):** I think this is a sound idea. VNT effectively carries a shadow copy for each LHS variable in tilde statements, and the dual-number mechanism used in autograd can be borrowed.
**Comment (reply):** I like this a lot. I had vague thoughts about something like this, some sort of minimal interface to add support for new container types, but I hadn't realised that we can just copy what AD packages do.