VNT Part 3: VarNamedTuple with concretized slices #1181
Merged
Changes from all commits (27 commits, all by mhauru):
- 35c3e20 ArrayLikeBlock WIP
- 4253e9b ArrayLikeBlock WIP2
- 5cb3916 Improve type stability of ArrayLikeBlock stuff
- a96bb44 Test more invariants
- a8014e5 Actually run VNT tests
- cfc6041 Implement show for ArrayLikeBlock
- e198fbb Change keys on VNT to return an array
- b77b0af Fix keys and some tests for PartialArray
- 633e920 Improve type stability
- 222334a Fix keys for PartialArray
- d22face More ArrayLikeBlock tests
- 4cb49e1 Add docstrings
- 420a6b2 Remove redundant code, improve documentation
- d2f58d7 Merge branch 'mhauru/vnt-for-fastldf' into mhauru/arraylikeblock
- ce9da19 Add Base.size(::RangeAndLinked)
- 4eb33e9 Fix issues with RangeAndLinked and VNT
- 51b399a Write more design doc for ArrayLikeBlocks
- 57fd11a Make VNT support concretized slices
- 7bdce5c Start the VNT HISTORY.md entry
- 9992051 Skip a type stability test on 1.10
- 753ca81 Fix test_skip
- d9e5405 Mark a test as broken on 1.10
- 267c554 Trivial bug fix
- 76ac5b6 Use skip rather than broken for an inference test
- 0c50bd7 Fix a docs typo
- 6b211b1 Use floatmin in test_utils
- 57fd84b Narrow a skip clause
```diff
@@ -565,6 +565,71 @@ function varnames(model::Model{typeof(demo_assume_matrix_observe_matrix_index)})
     return [@varname(s), @varname(m)]
 end
 
+@model function demo_nested_colons(
+    x=(; data=[(; subdata=transpose([1.5 2.0;]))]), ::Type{TV}=Array{Float64}
+) where {TV}
+    n = length(x.data[1].subdata)
+    d = n ÷ 2
+    s = (; params=[(; subparams=TV(undef, (d, 1, 2)))])
+    s.params[1].subparams[:, 1, :] ~ reshape(
+        product_distribution(fill(InverseGamma(2, 3), n)), d, 2
+    )
+    s_vec = vec(s.params[1].subparams)
+    # TODO(mhauru) The below element type concretisation is because of
+    # https://github.com/JuliaFolds2/BangBang.jl/issues/39
+    # which causes, when this is evaluated with an untyped VarInfo, s_vec to be an
+    # Array{Any}.
+    s_vec = [x for x in s_vec]
```
**Review comment on lines +578 to +582** (the `s_vec` element-type workaround):

> **Member:** Is untyped varinfo on the way out eventually?
>
> **Member (author):** Yes, hopefully already in #1183.
```diff
+
+    m ~ MvNormal(zeros(n), Diagonal(s_vec))
+
+    x.data[1].subdata[:, 1] ~ MvNormal(m, Diagonal(s_vec))
+
+    return (; s=s, m=m, x=x)
+end
+function logprior_true(model::Model{typeof(demo_nested_colons)}, s, m)
+    n = length(model.args.x.data[1].subdata)
+    # TODO(mhauru) We need to enforce a convention on whether this function gets called
+    # with the parameters as the model returns them, or with the parameters "unpacked".
+    # Currently different tests do different things.
+    s_vec = if s isa NamedTuple
+        vec(s.params[1].subparams)
+    else
+        vec(s)
+    end
+    return loglikelihood(InverseGamma(2, 3), s_vec) +
+           logpdf(MvNormal(zeros(n), Diagonal(s_vec)), m)
+end
+function loglikelihood_true(model::Model{typeof(demo_nested_colons)}, s, m)
+    # TODO(mhauru) We need to enforce a convention on whether this function gets called
+    # with the parameters as the model returns them, or with the parameters "unpacked".
+    # Currently different tests do different things.
+    s_vec = if s isa NamedTuple
+        vec(s.params[1].subparams)
+    else
+        vec(s)
+    end
+    return loglikelihood(MvNormal(m, Diagonal(s_vec)), model.args.x.data[1].subdata)
+end
+function logprior_true_with_logabsdet_jacobian(
+    model::Model{typeof(demo_nested_colons)}, s, m
+)
+    return _demo_logprior_true_with_logabsdet_jacobian(model, s.params[1].subparams, m)
+end
+function varnames(::Model{typeof(demo_nested_colons)})
+    return [
+        @varname(
+            s.params[1].subparams[
+                AbstractPPL.ConcretizedSlice(Base.Slice(Base.OneTo(1))),
+                1,
+                AbstractPPL.ConcretizedSlice(Base.Slice(Base.OneTo(2))),
+            ]
+        ),
+        # @varname(s.params[1].subparams[1,1,1]),
+        # @varname(s.params[1].subparams[1,1,2]),
+        @varname(m),
+    ]
+end
+
 const UnivariateAssumeDemoModels = Union{
     Model{typeof(demo_assume_dot_observe)},
     Model{typeof(demo_assume_dot_observe_literal)},
```
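For context on the expected varname above: `ConcretizedSlice` is what AbstractPPL turns a literal `:` into once the indexed array is known, so that the slice records which axis it covered. A minimal sketch of that step, assuming `AbstractPPL.concretize` and using a standalone array in place of the nested `s.params[1].subparams`:

```julia
using AbstractPPL

subparams = zeros(1, 1, 2)

# With plain colons, the varname carries no information about the axes it spans:
vn = @varname(subparams[:, 1, :])

# Concretizing against the actual array should turn each `:` into a
# `ConcretizedSlice(Base.Slice(Base.OneTo(n)))`, matching the expected
# varname spelled out in the test above.
vn_concrete = AbstractPPL.concretize(vn, subparams)
```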
```diff
@@ -615,8 +680,8 @@ function likelihood_optima(model::MultivariateAssumeDemoModels)
     vals = rand_prior_true(model)
 
     # NOTE: These are "as close to zero as we can get".
-    vals.s[1] = 1e-32
-    vals.s[2] = 1e-32
+    vals.s[1] = floatmin()
+    vals.s[2] = floatmin()
 
     vals.m[1] = 1.5
     vals.m[2] = 2.0
```
```diff
@@ -668,8 +733,8 @@ function likelihood_optima(model::MatrixvariateAssumeDemoModels)
     vals = rand_prior_true(model)
 
     # NOTE: These are "as close to zero as we can get".
-    vals.s[1, 1] = 1e-32
-    vals.s[1, 2] = 1e-32
+    vals.s[1, 1] = floatmin()
+    vals.s[1, 2] = floatmin()
 
     vals.m[1] = 1.5
     vals.m[2] = 2.0
```
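A side note on the `1e-32` → `floatmin()` change in the two hunks above: `floatmin()` returns the smallest positive normal `Float64`, which is far closer to the "as close to zero as we can get" intent than the old hard-coded constant:

```julia
floatmin()          # 2.2250738585072014e-308
floatmin() < 1e-32  # true
```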
```diff
@@ -701,6 +766,51 @@ function rand_prior_true(rng::Random.AbstractRNG, model::MatrixvariateAssumeDemoModels)
     return vals
 end
 
+function posterior_mean(model::Model{typeof(demo_nested_colons)})
+    # Get some containers to fill.
+    vals = rand_prior_true(model)
+
+    vals.s.params[1].subparams[1, 1, 1] = 19 / 8
+    vals.m[1] = 3 / 4
+
+    vals.s.params[1].subparams[1, 1, 2] = 8 / 3
+    vals.m[2] = 1
+
+    return vals
+end
+function likelihood_optima(model::Model{typeof(demo_nested_colons)})
+    # Get some containers to fill.
+    vals = rand_prior_true(model)
+
+    # NOTE: These are "as close to zero as we can get".
+    vals.s.params[1].subparams[1, 1, 1] = floatmin()
+    vals.s.params[1].subparams[1, 1, 2] = floatmin()
+
+    vals.m[1] = 1.5
+    vals.m[2] = 2.0
+
+    return vals
+end
+function posterior_optima(model::Model{typeof(demo_nested_colons)})
+    # Get some containers to fill.
+    vals = rand_prior_true(model)
+
+    # TODO: Figure out exact for `s[1]`.
+    vals.s.params[1].subparams[1, 1, 1] = 0.890625
+    vals.s.params[1].subparams[1, 1, 2] = 1
+    vals.m[1] = 3 / 4
+    vals.m[2] = 1
+
+    return vals
+end
+function rand_prior_true(rng::Random.AbstractRNG, ::Model{typeof(demo_nested_colons)})
+    svec = rand(rng, InverseGamma(2, 3), 2)
+    return (;
+        s=(; params=[(; subparams=reshape(svec, (1, 1, 2)))]),
+        m=rand(rng, MvNormal(zeros(2), Diagonal(svec))),
+    )
+end
+
 """
 A collection of models corresponding to the posterior distribution defined by
 the generative process
```
```diff
@@ -749,6 +859,7 @@ const DEMO_MODELS = (
     demo_dot_assume_observe_submodel(),
     demo_dot_assume_observe_matrix_index(),
     demo_assume_matrix_observe_matrix_index(),
+    demo_nested_colons(),
 )
 
 """
```
**Comment:** It strikes me that a lot of the problems we have here are actually incredibly similar to the problems of creating shadow memory in AD packages. The idea is that inside the model there is an `x`, but inside the VNT there is a shadow copy of `x` as well. cc @yebai
**Comment (reply):** I think one possible general solution to this is going to be something similar to what ForwardDiff does (at least for `AbstractArray`). Technically, it's even closer to a sparsity tracer.

When `x` is initialised to an `OffsetArray{T}`, we want the shadow of `x` (let's call it `dx`) to be initialised to the same `OffsetArray{Dual{T}}`. Each of these 'duals' needs to carry the value, along with the boolean mask indicating whether it was set or not. This is of course exactly the same as how `PartialArray` has one array of values and one array of masks.

Then when `x[idxs...]` is set to whatever `rand(Normal())` returns, we set `dx[idxs...] = Dual(x[idxs...], true)`. Of course, the block elements will need the same workaround as before (complete with all the `typejoin` shenanigans).

Then, reading from the shadow memory becomes way more consistent, because indexing into `dx` carries the same semantics as indexing into `x`.

For another example, let's say that `x` and `dx` are regular `Base.Matrix`. Then indexing into `dx[1, 1]` will be the same as indexing into `dx[1]`. You could even make `dx[1] = ...` error when you try to modify a value that was already set. For `BlockDual`s, you could further 'resolve' or 'normalise' the indices into a standard, 1-based scheme because you now know the type of the container (I think `Base.to_indices` does this, but I'm not sure; in any case this should be easy to figure out). That allows you to use different indexing schemes to read the data, as long as they refer to the same elements.

A scheme like this would also enable 'native support' for not only `OffsetArray`, but also things like `DimensionalData.DimArray`, which probably has way more legitimate uses than `OffsetArray`.

This looks like a lot of work for VNT to do, but we are already duplicating the entire array in the VNT*, so in principle I don't think performance should be any worse. Indexing into this would still be type stable (in the absence of blocks, but that's the same as currently) and you would still be able to do all the checks you need on whether the value was set or not.

In particular, I think that this should be fairly easy to implement for all `AbstractArray`s. (In fact, we know for a fact that it must be possible, because other packages do this successfully.) Extending this to dictionaries, or generally any type that supports `getindex`, would be annoying (for every new type you implement, you have to define the equivalent of a 'tangent type'†); it's doable, but it could be left for a later stage. I think the `AbstractArray` support would already solve 99% of the pain points here, the path forwards to supporting other types would be clear, and when some type isn't supported, you could error with a very clear message saying that "indexing into this type on the LHS of tilde is not supported", whereas now our battle lines are quite arbitrarily drawn.

\* OK, there is one case where this will do a lot more work than VNT currently does: if the user has a very large array, like a length-100 vector, but only ever writes `x[1] ~ ...` and never touches the other elements. I don't really care about optimising performance for such cases. It is impossible to detect them without static analysis (and note that AD has a similar characteristic: if you differentiate `f(x) = x[1]` and pass in a length-100 vector, the gradient calculation will be equally unoptimised).

† Notice that the way VNT normalises all property access into NamedTuples is exactly what Mooncake's `tangent_type` does for structs.
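To make the proposal concrete, here is a minimal sketch of what those duals could look like. Every name in it (`SetDual`, `make_shadow`, `record!`) is hypothetical, not existing DynamicPPL or AbstractPPL API:

```julia
# Hypothetical illustration of the shadow-memory proposal above.
struct SetDual{T}
    value::T
    isset::Bool  # the boolean mask: has this element been set by a tilde statement?
end

# The shadow has the same container type and axes as `x`, so indexing into it
# carries the same semantics as indexing into `x` (OffsetArray offsets included).
make_shadow(x::AbstractArray) = map(v -> SetDual(v, false), x)

# What the VNT would do after a tilde statement samples `val` for `x[idxs...]`.
function record!(dx::AbstractArray{<:SetDual}, val, idxs...)
    dx[idxs...] = SetDual(val, true)
    return dx
end

x = zeros(2, 3)
dx = make_shadow(x)
record!(dx, 1.5, 1, 2)
dx[1, 2].isset  # true
dx[3].isset     # also true: dx[3] is the same element as dx[1, 2] under
                # column-major linear indexing, mirroring the semantics of `x`
```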
**Comment (reply):** (Note that, since `tilde_assume!!` is currently our only 'hook' into modifying the VNT, this means that you have to pass `x` into `tilde_assume!!` as an argument. But I am not opposed to that at all. The alternative would be that you have to inspect the IR to determine when new arrays are being created.)
**Comment (reply):** I think this is a sound idea. VNT effectively carries a shadow copy for each LHS variable in tilde statements, and the dual-number mechanism used in autograd can be borrowed.
**Comment (reply):** I like this a lot. I had vague thoughts about something like this, some sort of minimal interface to add support for new container types, but I hadn't realised that we can just copy what AD packages do.