diff --git a/README.md b/README.md
index f92ecab..ba9d295 100644
--- a/README.md
+++ b/README.md
@@ -5,8 +5,11 @@
 [![Coverage Status](https://coveralls.io/repos/github/TuringLang/AbstractPPL.jl/badge.svg?branch=main)](https://coveralls.io/github/TuringLang/AbstractPPL.jl?branch=main)
 [![Codecov](https://codecov.io/gh/TuringLang/AbstractPPL.jl/branch/main/graph/badge.svg)](https://codecov.io/gh/TuringLang/AbstractPPL.jl)
-A light-weight package to factor out interfaces and associated APIs for modelling languages for
-probabilistic programming. High-level goals are:
+A lightweight package containing interfaces and associated APIs for modelling languages for probabilistic programming.
+
+## Goals
+
+High-level goals are:
 
 - Definition of an interface of a few abstract types and a small set of functions that should be
   supported by all [probabilistic programs](./src/abstractprobprog.jl) and [trace
@@ -20,560 +23,8 @@ implementations, from very rich, dynamic languages like Turing.jl to highly cons
 simplified models such as GPs, GLMs, or plain log-density problems.
 
 A more short-term goal is to start a process of cleanly refactoring and justifying parts of
-DynamicPPL.jl’s design, and hopefully to get on closer terms with Soss.jl.
-
-## `AbstractProbabilisticProgram` interface (still somewhat drafty)
-
-There are at least two somewhat incompatible conventions used for the term “model”. None of this is
-particularly exact, but:
-
-  - In Turing.jl, if you write down a `@model` function and call it on arguments, you get a model
-    object paired with (a possibly empty set of) observations. This can be treated as an instantiated
-    “conditioned” object with fixed values for parameters and observations.
-  - In Soss.jl, “model” is used for a symbolic “generative” object from which concrete functions, such as
-    densities and sampling functions, can be derived, _and_ which you can later condition on (and in
-    turn get a conditional density etc.).
-
-Relevant discussions:
-[1](https://julialang.zulipchat.com/#narrow/stream/234072-probprog/topic/Naming.20the.20.22likelihood.22.20thingy),
-[2](https://github.com/TuringLang/AbstractPPL.jl/discussions/10).
-
-### TL/DR:
-
-There are three interrelating aspects that this interface intends to standardize:
-
-  - Density calculation
-  - Sampling
-  - “Conversions” between different conditionings of models
-
-Therefore, the interface consists of an `AbstractProbabilisticProgram` supertype, together with
-functions
-
-  - `condition(::Model, ::Trace) -> ConditionedModel`
-  - `decondition(::ConditionedModel) -> GenerativeModel`
-  - `sample(::Model, ::Sampler = Exact(), [Int])` (from `AbstractMCMC.sample`)
-  - `logdensityof(::Model, ::Trace)` and `densityof(::Model, ::Trace)` (from
-    [DensityInterface.jl](https://github.com/JuliaMath/DensityInterface.jl))
-
-### Traces & probability expressions
-
-First, an infrastructural requirement which we will need below to write things out.
-
-The kinds of models we consider are, at least in a theoretical sense, distributions over *traces* –
-types which carry collections of values together with their names. Existing realizations of these
-are `VarInfo` in Turing.jl, choice maps in Gen.jl, and the usage of named tuples in Soss.jl.
-
-Traces solve the problem of having to name random variables in function calls, and in samples from
-models. In essence, every concrete trace type will just be a fancy kind of dictionary from variable
-names (ideally, `VarName`s) to values.
-
-Since we have to use this kind of mapping a lot in the specification of the interface, let’s for now
-just choose some arbitrary macro-like syntax like the following:
-
-```julia
-@T(Y[1] = …, Z = …)
-```
-
-Some more ideas for this kind of object can be found at the end.
-
-### “Conversions”
-
-The purpose of this part is to provide common names for how we want a model instance to be
-understood. As we have seen, in some modelling languages, model instances are primarily generative,
-with some parameters fixed, while other instance types pair model instances conditioned on
-observations. What I call “conversions” here is just an interface to transform between these two
-views and unify the involved objects under one language.
-
-Let’s start from a generative model with parameter `μ`:
-
-```julia
-# (hypothetical) generative spec a la Soss
-@generative_model function foo_gen(μ)
-    X ~ Normal(0, μ)
-    Y[1] ~ Normal(X)
-    return Y[2] ~ Normal(X + 1)
-end
-```
-
-Applying the “constructor” `foo_gen` now means to fix the parameter, and should return a concrete
-object of the generative type:
-
-```julia
-g = foo_gen(; μ=…)::SomeGenerativeModel
-```
-
-With this kind of object, we should be able to sample from it and calculate joint log-densities, i.e.,
-over the combined trace space of `X`, `Y[1]`, and `Y[2]` – either directly, or by deriving the
-respective functions (e.g., by converting from a symbolic representation).
-
-For model types that contain enough structural information, it should then be possible to condition
-on observed values and obtain a conditioned model:
-
-```julia
-condition(g, @T(Y = …))::SomeConditionedModel
-```
-
-For this operation, there will probably exist syntactic sugar in the form of
-
-```julia
-g | @T(Y = …)
-```
-
-Now, if we start from a Turing.jl-like model instead, with the “observation part” already specified,
-we have a situation like this, with the observations `Y` fixed in the instantiation:
-
-```julia
-# conditioned spec a la DPPL
-@model function foo(Y, μ)
-    X ~ Normal(0, μ)
-    Y[1] ~ Normal(X)
-    return Y[2] ~ Normal(X + 1)
-end
-
-m = foo(; Y=…, μ=…)::SomeConditionedModel
-```
-
-From this we can, if supported, go back to the generative form via `decondition`, and back via
-`condition`:
-
-```julia
-decondition(m) == g::SomeGenerativeModel
-m == condition(g, @T(Y = …))
-```
-
-(with equality in distribution).
-
-In the case of Turing.jl, the object `m` would at the same time contain the information about the
-generative and posterior distribution; `condition` and `decondition` can simply return different
-kinds of “tagged” model types which put the model specification into a certain context.
-
-Soss.jl pretty much already works like the examples above, with one model object being either a
-`JointModel` or a `ConditionedModel`, and the `|` syntax just being sugar for the latter.
-
-A hypothetical `DensityModel`, or something like the types from LogDensityProblems.jl, would be a
-case for a model type that does not support the structural operations `condition` and
-`decondition`.
-
-The invariances between these operations should follow normal rules of probability theory. Not all
-methods or directions need to be supported for every modelling language; in this case, a
-`MethodError` or some other runtime error should be raised.
-
-There is no strict requirement for generative models and conditioned models to have different types
-or be tagged with variable names etc. This is a choice to be made by the concrete implementation.
-
-Decomposing models into prior and observation distributions is not yet specified; the former is
-rather easy, since it is only a marginal of the generative distribution, while the latter requires
-more structural information. Perhaps both can be generalized under the `query` function I discuss
-at the end.
-
-### Sampling
-
-Sampling in this case refers to producing values from the distribution specified in a model
-instance, either following the distribution exactly, or approximating it through a Monte Carlo
-algorithm.
-
-All sampleable model instances are assumed to implement the `AbstractMCMC` interface – i.e., at
-least [`step`](https://github.com/TuringLang/AbstractMCMC.jl#sampling-step), and accordingly
-`sample`, `steps`, `Samples`. The most important aspect is `sample`, though, which plays the role
-of `rand` for distributions.
-
-The results of `sample` generalize `rand` – while `rand(d, N)` is assumed to give you iid samples,
-`sample(m, sampler, N)` returns a sample from a sequence (known as a chain in the case of MCMC) of
-length `N` approximating `m`’s distribution by a specific sampling algorithm (which of course
-subsumes the case that `m` can be sampled from exactly, in which case the “chain” actually is iid).
-
-Depending on which kind of sampling is supported, several methods may be provided. In the case of
-a (posterior) conditioned model with no known sampling procedure, we just have what is given through
-`AbstractMCMC`:
-
-```julia
-sample([rng], m, N, sampler; [args]) # chain of length N using `sampler`
-```
-
-In the case of a generative model, or a posterior model with an exact solution, we can have some
-more methods without the need to specify a sampler:
-
-```julia
-sample([rng], m; [args])    # one random sample
-sample([rng], m, N; [args]) # N iid samples; equivalent to `rand` in certain cases
-```
-
-It should be possible to implement this by a special sampler, say, `Exact` (name still to be
-discussed), that can then also be reused for generative sampling:
-
-```
-step(g, spl = Exact(), state = nothing) # IID sample from exact distribution with trivial state
-sample(g, Exact(), [N])
-```
-
-with dispatch failing for model types for which exact sampling is not possible (or not
-implemented).
-
-This could even be useful for Monte Carlo methods not based on Markov chains, e.g.,
-particle-based sampling using a return type with weights, or rejection sampling.
-
-Not all variants need to be supported – for example, a posterior model might not support
-`sample(m)` when exact sampling is not possible, only `sample(m, N, alg)` for Markov chains.
-
-`rand` is then just a special case when “trivial” exact sampling works for a model, e.g. a joint
-model.
-
-### Density Evaluation
-
-Since the different “versions” of how a model is to be understood as generative or conditioned are
-to be expressed in the type or dispatch they support, there should be no need for separate functions
-`logjoint`, `loglikelihood`, etc., which force these semantic distinctions on the implementor; we
-therefore adapt the interface of
-[DensityInterface.jl](https://github.com/JuliaMath/DensityInterface.jl). Its main function
-`logdensityof` should suffice for all variants, with the distinction being made by the capabilities
-of the concrete model instance.
-
-DensityInterface.jl also requires the trait function `DensityKind`, which is set to `HasDensity()`
-for the `AbstractProbabilisticProgram` type.
-Additional functions
-
-```
-DensityInterface.densityof(d, x) = exp(logdensityof(d, x))
-DensityInterface.logdensityof(d) = Base.Fix1(logdensityof, d)
-DensityInterface.densityof(d) = Base.Fix1(densityof, d)
-```
-
-are provided automatically (repeated here for clarity).
-
-Note that `logdensityof` strictly generalizes `logpdf`, since the posterior density will of course
-in general be unnormalized and hence not a probability density.
-
-The evaluation will usually work with the internal, concrete trace type, like `VarInfo` in
-Turing.jl:
-
-```julia
-logdensityof(m, vi)
-```
-
-But the user will more likely work on the interface using probability expressions:
-
-```julia
-logdensityof(m, @T(X = …))
-```
-
-(Note that this would replace the current `prob` string macro in Turing.jl.)
-
-Densities need not (and usually will not) be normalized.
-
-#### Implementation notes
-
-It should be possible to make this fall back on the internal method, given the right definition and
-implementation of `maketrace`:
-
-```julia
-logdensityof(m, t::ProbabilityExpression) = logdensityof(m, maketrace(m, t))
-```
-
-There is one open question – should normalized and unnormalized densities be able to be
-distinguished? This could be done by dispatch as well, e.g., if the caller wants to ensure
-normalization:
-
-```
-logdensityof(g, @T(X = …, Y = …, Z = …); normalized=Val{true})
-```
-
-Although there is probably a better way through traits; maybe like for arrays, with
-`NormalizationStyle(g, t) = IsNormalized()`?
-
-## More on probability expressions
-
-Note that this needs to be a macro, if written this way, since the keys may themselves be more
-complex than just symbols (e.g., indexed variables). (Don’t get hung up on the `@T` name, though;
-this is just a working draft.)
-
-The idea here is to standardize the construction (and manipulation) of *abstract probability
-expressions*, plus the interface for turning them into concrete traces for a specific model – like
-[`@formula`](https://juliastats.org/StatsModels.jl/stable/formula/#Modeling-tabular-data) and
-[`apply_schema`](https://juliastats.org/StatsModels.jl/stable/internals/#Semantics-time-(apply_schema))
-from StatsModels.jl are doing.
-
-Maybe the following would suffice to do that:
-
-```julia
-maketrace(m, t)::tracetype(m, t)
-```
-
-where `maketrace` produces a concrete trace corresponding to `t` for the model `m`, and `tracetype`
-is the corresponding `eltype`-like function giving you the concrete trace type for a certain model
-and probability expression combination.
-
-Possible extensions of this idea:
-
-  - Pearl-style do-notation: `@T(Y = y | do(X = x))`
-  - Allowing free variables, to specify model transformations: `query(m, @T(X | Y))`
-  - “Graph queries”: `@T(X | Parents(X))`, `@T(Y | Not(X))` (a nice way to express Gibbs conditionals!)
-  - Predicate style for “measure queries”: `@T(X < Y + Z)`
-
-The latter applications are the reason I originally liked the idea of the macro being called `@P`
-(or even `@𝓅` or `@ℙ`), since then it would look like a “Bayesian probability expression”:
-`@P(X < Y + Z)`. But this would not be so meaningful in the case of representing a trace instance.
-
-Perhaps both `@T` and `@P` can coexist, and both produce different kinds of `ProbabilityExpression`
-objects?
-
-NB: the exact details of this kind of “schema application”, and what results from it, will need to
-be specified in the interface of `AbstractModelTrace`, aka “the new `VarInfo`”.
-
-# `AbstractModelTrace`/`VarInfo` interface draft
-
-**This part is even draftier than the above – we’ll try things out in DynamicPPL.jl first**
-
-## Background
-
-### Why do we do this?
-
-As I have said before:
-
-> There are many aspects that make VarInfo a very complex data structure.
-
-Currently, there is an insane amount of complexity and implementation detail in DynamicPPL.jl’s
-`varinfo.jl`, which has been rewritten multiple times with different concerns in mind – most times
-to address concrete needs of Turing.jl, such as type stability, or the requirements of specific
-samplers.
-
-This unfortunately makes `VarInfo` extremely opaque: it is hard to refactor without breaking
-anything (nobody really dares to touch it), and a lot of knowledge about Turing.jl/DynamicPPL.jl
-internals is needed in order to judge the effects of changes.
-
-### Design choices
-
-Recently, @torfjelde [has shown](https://github.com/TuringLang/DynamicPPL.jl/pull/267/files) that a
-much simpler implementation is feasible – basically, just a wrapped `NamedTuple` with a minimal
-interface.
-
-The purpose of this proposal is twofold: first, to think about what a sufficient interface for
-`AbstractModelTrace`, the abstract supertype of `VarInfo`, should be, to allow multiple specialized
-variants and refactor the existing ones (typed/untyped and simple). Second, to view the problem as
-the design of an abstract data type: the specification of construction and modification mechanisms
-for a dictionary-like structure.
-
-Related previous discussions:
-
-  - [Discussion about `VarName`](https://github.com/TuringLang/AbstractPPL.jl/discussions/7)
-  - [`AbstractVarInfo` representation](https://github.com/TuringLang/AbstractPPL.jl/discussions/5)
-
-Additionally (but closely related), the second part tries to formalize the “subsumption” mechanism
-of `VarName`s, and its interaction with using `VarName`s as keys/indices.
-
-Our discussions take place in what is a bit of a fuzzy zone between the part that is really
-“abstract”, and meant for the wider purpose of AbstractPPL.jl – the implementation of probabilistic
-programming systems in general – and our concrete needs within DPPL. I hope to always stay abstract
-and reusable; and there are already a couple of candidates for APPL clients other than DPPL, which
-will hopefully keep us focused: simulation-based calibration, SimplePPL (a BUGS-like frontend), and
-ParetoSmoothing.jl.
-
-### What is going to change?
-
-  - For the end user of Turing.jl: nothing. You usually don’t use `VarInfo`, or the raw evaluator
-    interface, anyway. (Although if the newer data structures are more user-friendly, they might occur
-    in more places in the future?)
-  - For people having a look into code using `VarInfo`, or starting to hack on Turing.jl/DPPL.jl: a
-    huge reduction in cognitive complexity. `VarInfo` implementations should be readable on their own,
-    and the implemented functions laid out somewhere. Its usage should look like that of any other
-    nice, normal data structure.
-  - For core DPPL.jl implementors: same as the previous, plus: a standard against which to improve and
-    test `VarInfo`, and a clearly defined design space for new data structures.
-  - For AbstractPPL.jl clients/PPL implementors: an interface to program against (as with the rest of
-    APPL), and an existing set of well-specified, flexible trace data types with different
-    characteristics.
-
-And in terms of implementation work in DPPL.jl: once the interface is fixed (or even during fixing
-it), varinfo.jl will undergo a heavy refactoring – which should make it _simpler_! (No three
-different getter functions with slightly different semantics, etc…)
-
-## Property interface
-
-The basic idea is for all `VarInfo`s to behave like ordered dictionaries with `VarName` keys – all
-common operations should just work. There are two things that make them more special, though:
-
- 1. “Fancy indexing”: since `VarName`s are structured themselves, the `VarInfo` should behave a bit
-    like a trie, in the sense that all prefixes of stored keys should be retrievable. Also,
-    subsumption of `VarName`s should be respected (see end of this document):
-
-    ```julia
-    vi[@varname(x.a)] = [1, 2, 3]
-    vi[@varname(x.b)] = [4, 5, 6]
-    vi[@varname(x.a[2])] == 2
-    vi[@varname(x)] == (; a=[1, 2, 3], b=[4, 5, 6])
-    ```
-
-    Generalizations that go beyond simple cases (those that you can imagine by storing individual
-    `setfield!`s in a tree) need not be implemented in the beginning; e.g.,
-
-    ```julia
-    vi[@varname(x[1])] = 1
-    vi[@varname(x[2])] = 2
-    keys(vi) == [x[1], x[2]]
-
-    vi[@varname(x)] = [1, 2]
-    keys(vi) == [x]
-    ```
-
- 2. (_This has to be discussed further._) Information other than the sampled values, such as flags,
-    metadata, pointwise likelihoods, etc., can in principle be stored in multiple of these “`VarInfo`
-    dicts” with parallel structure. For efficiency, it is conceivable to devise a design such that
-    multiple fields can be stored under the same indexing structure.
-
-    ```julia
-    vi[@varname(x[1])] == 1
-    vi[@varname(x[1])].meta["bla"] == false
-    ```
-
-    or something in that direction.
-
-    (This is logically equivalent to a dictionary with named tuple values. Maybe we can do what
-    [`DictTable`](https://github.com/JuliaData/TypedTables.jl/blob/main/src/DictTable.jl) does?)
-
-    The old `order` field, indicating at which position in the evaluator function a variable has
-    been added (essentially a counter of insertions), can actually be left out completely, since the
-    dictionary is specified to be ordered by insertion.
-
-    The important question here is: should the “joint data structure” behave like a dictionary of
-    `NamedTuple`s (`eltype(vi) == @NamedTuple{value::T, ℓ::Float64, meta}`), or like a struct of
-    dicts with shared keys (`eltype(vi.value) <: T`, `eltype(vi.ℓ) <: Float64`, …)?
-
-The required dictionary functions are roughly the following:
-
-  - Pure functions:
-
-    + `iterate`, yielding pairs of `VarName` and the stored value
-    + `IteratorEltype == HasEltype()`, `IteratorSize = HasLength()`
-    + `keys`, `values`, `pairs`, `length` consistent with `iterate`
-    + `eltype`, `keytype`, `valuetype`
-    + `get`, `getindex`, `haskey` for indexing by `VarName`
-    + `merge` to join two `VarInfo`s
-
-  - Mutating functions:
-
-    + `insert!!`, `set!!`
-    + `merge!!` to add and join elements (TODO: think about `merge`)
-    + `setindex!!`
-    + `empty!!`, `delete!!`, `unset!!` (_Are these really used anywhere? Not having them makes persistent
-      implementations much easier!_)
-
-I believe that adopting the interface of
-[Dictionaries.jl](https://github.com/andyferris/Dictionaries.jl), not `Base.AbstractDict`, would be
-ideal, since their approach makes key sharing and certain operations naturally easy (particularly
-“broadcast-style”, i.e., transformations on the values, but not the keys).
-
-Other `Base` functions, like `enumerate`, should follow from the above.
-
-`length` might appear weird – but it should definitely be consistent with the iterator.
-
-It would be really cool if `merge` supported the combination of distinct types of implementations,
-e.g., a dynamic and a tuple-based part.
-
-To support both mutable and immutable/persistent implementations, let’s require consistent
-BangBang.jl-style mutators throughout.
-
-## Transformations/Bijectors
-
-Transformations should ideally be handled explicitly and from outside: automatically by the
-compiler macro, or at the places required by samplers.
-
-Implementation-wise, they can probably be expressed as folds?
-
-```julia
-map(v -> link(v.dist, v.value), vi)
-```
-
-## Linearization
-
-There are multiple possible approaches to handle this:
-
- 1. As a special case of conversion: `Vector(vi)`
- 2. `copy!(vals_array, vi)`.
- 3. As a fold: `mapreduce(v -> vec(v.value), append!, vi, init=Float64[])`
-
-Also here, I think that the best implementation would be through a fold. Variants (1) or (2) might
-additionally be provided as syntactic sugar.
-
-* * *
-
-# `VarName`-based axioms
-
-What follows is mostly an attempt to formalize subsumption.
-
-First, remember that in Turing.jl we can always work with _concretized_ `VarName`s: `begin`/`end`,
-`:`, and boolean indexing are all turned into some form of concrete cartesian or array indexing
-(assuming [this suggestion](https://github.com/TuringLang/AbstractPPL.jl/issues/35) is
-implemented). This makes all index comparisons static.
-
-Now, `VarName`s have a compositional structure: they can be built by composing a root variable with
-more and more lenses (`VarName{v}()` starts off with an `IdentityLens`):
-
-```julia
-julia> vn = VarName{:x}() ∘ Setfield.IndexLens((1:10, 1)) ∘ Setfield.IndexLens((2,))
-x[1:10,1][2]
-```
-
-(_Note that the composition function, `∘`, is really in the wrong order; but this is a legacy of
-Setfield.jl._)
-
-By “subsumption”, we mean the notion of a `VarName` expressing a more nested path than another one:
-
-```julia
-subsumes(@varname(x.a), @varname(x.a[1]))
-@varname(x.a) ⊒ @varname(x.a[1]) # \sqsupseteq
-@varname(x.a) ⋢ @varname(x.a[1]) # \nsqsubseteq
-```
-
-Thus, we have the following axioms for `VarName`s (“variables” are `VarName{n}()`):
-
- 1. `x ⊑ x` for all variables `x`
- 2. `x ≍ y` for `x ≠ y` (i.e., distinct variables are incomparable; `x ⋢ y` and `y ⋢ x`) (`≍` is `\asymp`)
- 3. `x ∘ ℓ ⊑ x` for all variables `x` and lenses `ℓ`
- 4. `x ∘ ℓ₁ ⊑ x ∘ ℓ₂ ⇔ ℓ₁ ⊑ ℓ₂`
-
-For the last axiom to work, we also have to define subsumption of individual, non-composed lenses:
-
- 1. `PropertyLens(a) == PropertyLens(b) ⇔ a == b`, for all symbols `a`, `b`
- 2. `FunctionLens(f) == FunctionLens(g) ⇔ f == g` (under extensional equality; I’m only mentioning
-    this in case we ever generalize to Bijector-ed variables like `@varname(log(x))`)
- 3. `IndexLens(ι₁) ⊑ IndexLens(ι₂)` if the index tuple `ι₂` covers all indices in `ι₁`; for example,
-    `_[1, 2:10] ⊑ _[1:10, 1:20]`. (_This is a bit fuzzy and not all corner cases have been
-    considered yet!_)
- 4. `IdentityLens() == IdentityLens()`
- 5. `ℓ₁ ≍ ℓ₂`, otherwise
-
-Together, this should make `VarName`s under subsumption a reflexive poset.
-The fundamental requirement for `VarInfo`s is then:
-
-```
-vi[x ∘ ℓ] == get(vi[x], ℓ)
-```
-
-So we always want the following to work, automatically:
-
-```julia
-vi = insert!!(vi, vn, x)
-vi[vn] == x
-```
-
-(the trivial case), and
-
-```julia
-x = set!!(x, ℓ₁, a)
-x = set!!(x, ℓ₂, b)
-vi = insert!!(vi, vn, x)
-vi[vn ∘ ℓ₁] == a
-vi[vn ∘ ℓ₂] == b
-```
-
-since `vn` subsumes both `vn ∘ ℓ₁` and `vn ∘ ℓ₂`.
-
-Whether the opposite case is supported may depend on the implementation. The most complicated part
-is “unification”:
-
-```julia
-vi = insert!!(vi, vn ∘ ℓ₁, a)
-vi = insert!!(vi, vn ∘ ℓ₂, b)
-get(vi[vn], ℓ₁) == a
-get(vi[vn], ℓ₂) == b
-```
-
-where `vn ∘ ℓ₁` and `vn ∘ ℓ₂` need to be recognized as “children” of a common parent `vn`.
+DynamicPPL.jl's design, and hopefully to get on closer terms with Soss.jl.
+
+## Documentation
+
+For detailed design documentation and interface specifications, please see the [documentation](https://turinglang.org/AbstractPPL.jl/).
\ No newline at end of file
diff --git a/docs/make.jl b/docs/make.jl
index d3dbe83..68e934d 100644
--- a/docs/make.jl
+++ b/docs/make.jl
@@ -9,7 +9,15 @@ DocMeta.setdocmeta!(AbstractPPL, :DocTestSetup, :(using AbstractPPL); recursive=
 makedocs(;
     sitename="AbstractPPL",
     modules=[AbstractPPL, Base.get_extension(AbstractPPL, :AbstractPPLDistributionsExt)],
-    pages=["Home" => "index.md", "API" => "api.md"],
+    pages=[
+        "Home" => "index.md",
+        "Design Documentation" => [
+            "AbstractProbabilisticProgram Interface" => "abstractprobabilisticprogram.md",
+            "AbstractModelTrace Interface" => "abstractmodeltrace.md",
+            "VarName Axioms" => "varname_axioms.md",
+        ],
+        "API" => "api.md"
+    ],
     checkdocs=:exports,
     doctest=false,
 )
diff --git a/docs/src/abstractmodeltrace.md b/docs/src/abstractmodeltrace.md
new file mode 100644
index 0000000..322b61f
--- /dev/null
+++ b/docs/src/abstractmodeltrace.md
@@ -0,0 +1,173 @@
+# `AbstractModelTrace`/`VarInfo` interface draft
+
+**This part is even draftier than the `AbstractProbabilisticProgram` interface – we'll try things out in DynamicPPL.jl first**
+
+## Background
+
+### Why do we do this?
+
+As I have said before:
+
+> There are many aspects that make VarInfo a very complex data structure.
+
+Currently, there is an insane amount of complexity and implementation detail in DynamicPPL.jl's
+`varinfo.jl`, which has been rewritten multiple times with different concerns in mind – most times
+to address concrete needs of Turing.jl, such as type stability, or the requirements of specific
+samplers.
+
+This unfortunately makes `VarInfo` extremely opaque: it is hard to refactor without breaking
+anything (nobody really dares to touch it), and a lot of knowledge about Turing.jl/DynamicPPL.jl
+internals is needed in order to judge the effects of changes.
+
+### Design choices
+
+Recently, @torfjelde [has shown](https://github.com/TuringLang/DynamicPPL.jl/pull/267/files) that a
+much simpler implementation is feasible – basically, just a wrapped `NamedTuple` with a minimal
+interface.
+
+The purpose of this proposal is twofold: first, to think about what a sufficient interface for
+`AbstractModelTrace`, the abstract supertype of `VarInfo`, should be, to allow multiple specialized
+variants and refactor the existing ones (typed/untyped and simple). Second, to view the problem as
+the design of an abstract data type: the specification of construction and modification mechanisms
+for a dictionary-like structure.
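+
+As a concrete illustration of the "wrapped `NamedTuple`" idea mentioned above, a minimal trace type
+could hypothetically look as follows – the name `MiniVarInfo` is made up for this sketch, and
+lenses/indexing are deliberately ignored here:
+
+```julia
+using AbstractPPL
+
+# A bare-bones trace: a NamedTuple mapping variable symbols to values.
+struct MiniVarInfo{T<:NamedTuple}
+    values::T
+end
+
+# Look up a variable by the root symbol of its VarName.
+Base.getindex(vi::MiniVarInfo, ::VarName{sym}) where {sym} = getproperty(vi.values, sym)
+Base.haskey(vi::MiniVarInfo, ::VarName{sym}) where {sym} = haskey(vi.values, sym)
+
+vi = MiniVarInfo((x=1.0, y=[2.0, 3.0]))
+vi[@varname(y)]          # [2.0, 3.0]
+haskey(vi, @varname(z))  # false
+```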
+
+Related previous discussions:
+
+  - [Discussion about `VarName`](https://github.com/TuringLang/AbstractPPL.jl/discussions/7)
+  - [`AbstractVarInfo` representation](https://github.com/TuringLang/AbstractPPL.jl/discussions/5)
+
+Additionally (but closely related), the second part tries to formalize the "subsumption" mechanism
+of `VarName`s, and its interaction with using `VarName`s as keys/indices.
+
+Our discussions take place in what is a bit of a fuzzy zone between the part that is really
+"abstract", and meant for the wider purpose of AbstractPPL.jl – the implementation of probabilistic
+programming systems in general – and our concrete needs within DPPL. I hope to always stay abstract
+and reusable; and there are already a couple of candidates for APPL clients other than DPPL, which
+will hopefully keep us focused: simulation-based calibration, SimplePPL (a BUGS-like frontend), and
+ParetoSmoothing.jl.
+
+### What is going to change?
+
+  - For the end user of Turing.jl: nothing. You usually don't use `VarInfo`, or the raw evaluator
+    interface, anyway. (Although if the newer data structures are more user-friendly, they might occur
+    in more places in the future?)
+  - For people having a look into code using `VarInfo`, or starting to hack on Turing.jl/DPPL.jl: a
+    huge reduction in cognitive complexity. `VarInfo` implementations should be readable on their own,
+    and the implemented functions laid out somewhere. Its usage should look like that of any other
+    nice, normal data structure.
+  - For core DPPL.jl implementors: same as the previous, plus: a standard against which to improve and
+    test `VarInfo`, and a clearly defined design space for new data structures.
+  - For AbstractPPL.jl clients/PPL implementors: an interface to program against (as with the rest of
+    APPL), and an existing set of well-specified, flexible trace data types with different
+    characteristics.
+
+And in terms of implementation work in DPPL.jl: once the interface is fixed (or even during fixing
+it), varinfo.jl will undergo a heavy refactoring – which should make it _simpler_! (No three
+different getter functions with slightly different semantics, etc…)
+
+## Property interface
+
+The basic idea is for all `VarInfo`s to behave like ordered dictionaries with `VarName` keys – all
+common operations should just work. There are two things that make them more special, though:
+
+ 1. "Fancy indexing": since `VarName`s are structured themselves, the `VarInfo` should behave a bit
+    like a trie, in the sense that all prefixes of stored keys should be retrievable. Also,
+    subsumption of `VarName`s should be respected (see end of this document):
+
+    ```julia
+    vi[@varname(x.a)] = [1, 2, 3]
+    vi[@varname(x.b)] = [4, 5, 6]
+    vi[@varname(x.a[2])] == 2
+    vi[@varname(x)] == (; a=[1, 2, 3], b=[4, 5, 6])
+    ```
+
+    Generalizations that go beyond simple cases (those that you can imagine by storing individual
+    `setfield!`s in a tree) need not be implemented in the beginning; e.g.,
+
+    ```julia
+    vi[@varname(x[1])] = 1
+    vi[@varname(x[2])] = 2
+    keys(vi) == [x[1], x[2]]
+
+    vi[@varname(x)] = [1, 2]
+    keys(vi) == [x]
+    ```
+
+ 2. (_This has to be discussed further._) Information other than the sampled values, such as flags,
+    metadata, pointwise likelihoods, etc., can in principle be stored in multiple of these "`VarInfo`
+    dicts" with parallel structure. For efficiency, it is conceivable to devise a design such that
+    multiple fields can be stored under the same indexing structure.
+
+    ```julia
+    vi[@varname(x[1])] == 1
+    vi[@varname(x[1])].meta["bla"] == false
+    ```
+
+    or something in that direction.
+
+    (This is logically equivalent to a dictionary with named tuple values. Maybe we can do what
+    [`DictTable`](https://github.com/JuliaData/TypedTables.jl/blob/main/src/DictTable.jl) does?)
+
+    The old `order` field, indicating at which position in the evaluator function a variable has
+    been added (essentially a counter of insertions), can actually be left out completely, since the
+    dictionary is specified to be ordered by insertion.
+
+    The important question here is: should the "joint data structure" behave like a dictionary of
+    `NamedTuple`s (`eltype(vi) == @NamedTuple{value::T, ℓ::Float64, meta}`), or like a struct of
+    dicts with shared keys (`eltype(vi.value) <: T`, `eltype(vi.ℓ) <: Float64`, …)?
+
+The required dictionary functions are roughly the following:
+
+  - Pure functions:
+
+    + `iterate`, yielding pairs of `VarName` and the stored value
+    + `IteratorEltype == HasEltype()`, `IteratorSize = HasLength()`
+    + `keys`, `values`, `pairs`, `length` consistent with `iterate`
+    + `eltype`, `keytype`, `valuetype`
+    + `get`, `getindex`, `haskey` for indexing by `VarName`
+    + `merge` to join two `VarInfo`s
+
+  - Mutating functions:
+
+    + `insert!!`, `set!!`
+    + `merge!!` to add and join elements (TODO: think about `merge`)
+    + `setindex!!`
+    + `empty!!`, `delete!!`, `unset!!` (_Are these really used anywhere? Not having them makes persistent
+      implementations much easier!_)
+
+I believe that adopting the interface of
+[Dictionaries.jl](https://github.com/andyferris/Dictionaries.jl), not `Base.AbstractDict`, would be
+ideal, since their approach makes key sharing and certain operations naturally easy (particularly
+"broadcast-style", i.e., transformations on the values, but not the keys).
+
+Other `Base` functions, like `enumerate`, should follow from the above.
+
+`length` might appear weird – but it should definitely be consistent with the iterator.
+
+It would be really cool if `merge` supported the combination of distinct types of implementations,
+e.g., a dynamic and a tuple-based part.
+
+To support both mutable and immutable/persistent implementations, let's require consistent
+BangBang.jl-style mutators throughout.
+
+## Transformations/Bijectors
+
+Transformations should ideally be handled explicitly and from outside: automatically by the
+compiler macro, or at the places required by samplers.
+
+Implementation-wise, they can probably be expressed as folds?
+
+```julia
+map(v -> link(v.dist, v.value), vi)
+```
+
+## Linearization
+
+There are multiple possible approaches to handle this:
+
+ 1. As a special case of conversion: `Vector(vi)`
+ 2. `copy!(vals_array, vi)`.
+ 3. As a fold: `mapreduce(v -> vec(v.value), append!, vi, init=Float64[])`
+
+Also here, I think that the best implementation would be through a fold. Variants (1) or (2) might
+additionally be provided as syntactic sugar.
\ No newline at end of file
diff --git a/docs/src/abstractprobabilisticprogram.md b/docs/src/abstractprobabilisticprogram.md
new file mode 100644
index 0000000..55fc7a9
--- /dev/null
+++ b/docs/src/abstractprobabilisticprogram.md
@@ -0,0 +1,294 @@
+# `AbstractProbabilisticProgram` interface
+
+There are at least two somewhat incompatible conventions used for the term "model". None of this is
+particularly exact, but:
+
+  - In Turing.jl, if you write down a `@model` function and call it on arguments, you get a model
+    object paired with (a possibly empty set of) observations.
+    This can be treated as an instantiated
+    "conditioned" object with fixed values for parameters and observations.
+  - In Soss.jl, "model" is used for a symbolic "generative" object from which concrete functions, such as
+    densities and sampling functions, can be derived, _and_ which you can later condition on (and in
+    turn get a conditional density etc.).
+
+Relevant discussions:
+[1](https://julialang.zulipchat.com/#narrow/stream/234072-probprog/topic/Naming.20the.20.22likelihood.22.20thingy),
+[2](https://github.com/TuringLang/AbstractPPL.jl/discussions/10).
+
+## TL/DR:
+
+There are three interrelating aspects that this interface intends to standardize:
+
+  - Density calculation
+  - Sampling
+  - "Conversions" between different conditionings of models
+
+Therefore, the interface consists of an `AbstractProbabilisticProgram` supertype, together with
+functions
+
+  - `condition(::Model, ::Trace) -> ConditionedModel`
+  - `decondition(::ConditionedModel) -> GenerativeModel`
+  - `sample(::Model, ::Sampler = Exact(), [Int])` (from `AbstractMCMC.sample`)
+  - `logdensityof(::Model, ::Trace)` and `densityof(::Model, ::Trace)` (from
+    [DensityInterface.jl](https://github.com/JuliaMath/DensityInterface.jl))
+
+## Traces & probability expressions
+
+First, an infrastructural requirement which we will need below to write things out.
+
+The kinds of models we consider are, at least in a theoretical sense, distributions over *traces* –
+types which carry collections of values together with their names. Existing realizations of these
+are `VarInfo` in Turing.jl, choice maps in Gen.jl, and the usage of named tuples in Soss.jl.
+
+Traces solve the problem of having to name random variables in function calls, and in samples from
+models. In essence, every concrete trace type will just be a fancy kind of dictionary from variable
+names (ideally, `VarName`s) to values.
+
+Since we have to use this kind of mapping a lot in the specification of the interface, let's for now
+just choose some arbitrary macro-like syntax like the following:
+
+```julia
+@T(Y[1] = …, Z = …)
+```
+
+Some more ideas for this kind of object can be found at the end.
+
+## "Conversions"
+
+The purpose of this part is to provide common names for how we want a model instance to be
+understood. As we have seen, in some modelling languages, model instances are primarily generative,
+with some parameters fixed, while other instance types pair model instances conditioned on
+observations. What I call "conversions" here is just an interface to transform between these two
+views and unify the involved objects under one language.
+
+Let's start from a generative model with parameter `μ`:
+
+```julia
+# (hypothetical) generative spec a la Soss
+@generative_model function foo_gen(μ)
+    X ~ Normal(0, μ)
+    Y[1] ~ Normal(X)
+    return Y[2] ~ Normal(X + 1)
+end
+```
+
+Applying the "constructor" `foo_gen` now means to fix the parameter, and should return a concrete
+object of the generative type:
+
+```julia
+g = foo_gen(; μ=…)::SomeGenerativeModel
+```
+
+With this kind of object, we should be able to sample from it and calculate joint log-densities, i.e.,
+over the combined trace space of `X`, `Y[1]`, and `Y[2]` – either directly, or by deriving the
+respective functions (e.g., by converting from a symbolic representation).
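+
+To make the intended workflow tangible, a hypothetical usage sketch (with `foo_gen` as above, and a
+concrete value for `μ` only for illustration – none of this is fixed API):
+
+```julia
+g = foo_gen(; μ=1.0)  # generative model with the parameter fixed
+t = sample(g)         # one exact sample, i.e., a trace over X, Y[1], and Y[2]
+logdensityof(g, t)    # joint log-density evaluated at that trace
+```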
+
+For model types that contain enough structural information, it should then be possible to condition
+on observed values and obtain a conditioned model:
+
+```julia
+condition(g, @T(Y = …))::SomeConditionedModel
+```
+
+For this operation, there will probably exist syntactic sugar in the form of
+
+```julia
+g | @T(Y = …)
+```
+
+Now, if we start from a Turing.jl-like model instead, with the "observation part" already specified,
+we have a situation like this, with the observations `Y` fixed in the instantiation:
+
+```julia
+# conditioned spec a la DPPL
+@model function foo(Y, μ)
+    X ~ Normal(0, μ)
+    Y[1] ~ Normal(X)
+    return Y[2] ~ Normal(X + 1)
+end
+
+m = foo(; Y=…, μ=…)::SomeConditionedModel
+```
+
+From this we can, if supported, go back to the generative form via `decondition`, and back via
+`condition`:
+
+```julia
+decondition(m) == g::SomeGenerativeModel
+m == condition(g, @T(Y = …))
+```
+
+(with equality in distribution).
+
+In the case of Turing.jl, the object `m` would at the same time contain the information about the
+generative and posterior distribution; `condition` and `decondition` can simply return different
+kinds of "tagged" model types which put the model specification into a certain context.
+
+Soss.jl pretty much already works like the examples above, with one model object being either a
+`JointModel` or a `ConditionedModel`, and the `|` syntax just being sugar for the latter.
+
+A hypothetical `DensityModel`, or something like the types from LogDensityProblems.jl, would be a
+case for a model type that does not support the structural operations `condition` and
+`decondition`.
+
+The invariances between these operations should follow normal rules of probability theory. Not all
+methods or directions need to be supported for every modelling language; in this case, a
+`MethodError` or some other runtime error should be raised.
+
+There is no strict requirement for generative models and conditioned models to have different types
+or be tagged with variable names etc. This is a choice to be made by the concrete implementation.
+
+Decomposing models into prior and observation distributions is not yet specified; the former is
+rather easy, since it is only a marginal of the generative distribution, while the latter requires
+more structural information. Perhaps both can be generalized under the `query` function I discuss
+at the end.
+
+## Sampling
+
+Sampling in this case refers to producing values from the distribution specified in a model
+instance, either following the distribution exactly, or approximating it through a Monte Carlo
+algorithm.
+
+All sampleable model instances are assumed to implement the `AbstractMCMC` interface – i.e., at
+least [`step`](https://github.com/TuringLang/AbstractMCMC.jl#sampling-step), and accordingly
+`sample`, `steps`, `Samples`. The most important aspect is `sample`, though, which plays the role
+of `rand` for distributions.
+
+The results of `sample` generalize `rand` – while `rand(d, N)` is assumed to give you iid samples,
+`sample(m, sampler, N)` returns a sample from a sequence (known as a chain in the case of MCMC) of
+length `N` approximating `m`'s distribution by a specific sampling algorithm (which of course
+subsumes the case that `m` can be sampled from exactly, in which case the "chain" actually is iid).
+
+Depending on which kind of sampling is supported, several methods may be provided.
+In the case of a (posterior) conditioned model with no known sampling procedure, we just have what
+is given through `AbstractMCMC`:
+
+```julia
+sample([rng], m, N, sampler; [args]) # chain of length N using `sampler`
+```
+
+In the case of a generative model, or a posterior model with an exact solution, we can have some
+more methods without the need to specify a sampler:
+
+```julia
+sample([rng], m; [args])    # one random sample
+sample([rng], m, N; [args]) # N iid samples; equivalent to `rand` in certain cases
+```
+
+It should be possible to implement this by a special sampler, say, `Exact` (name still to be
+discussed), that can then also be reused for generative sampling:
+
+```
+step(g, spl = Exact(), state = nothing) # IID sample from exact distribution with trivial state
+sample(g, Exact(), [N])
+```
+
+with dispatch failing for model types for which exact sampling is not possible (or not
+implemented).
+
+This could even be useful for Monte Carlo methods not based on Markov chains, e.g.,
+particle-based sampling using a return type with weights, or rejection sampling.
+
+Not all variants need to be supported – for example, a posterior model might not support
+`sample(m)` when exact sampling is not possible, only `sample(m, N, alg)` for Markov chains.
+
+`rand` is then just a special case when "trivial" exact sampling works for a model, e.g. a joint
+model.
+
+## Density Evaluation
+
+Since the different "versions" of how a model is to be understood as generative or conditioned are
+to be expressed in the type or dispatch they support, there should be no need for separate functions
+`logjoint`, `loglikelihood`, etc., which force these semantic distinctions on the implementor; we
+therefore adapt the interface of
+[DensityInterface.jl](https://github.com/JuliaMath/DensityInterface.jl). Its main function
+`logdensityof` should suffice for all variants, with the distinction being made by the capabilities
+of the concrete model instance.
+
+DensityInterface.jl also requires the trait function `DensityKind`, which is set to `HasDensity()`
+for the `AbstractProbabilisticProgram` type. Additional functions
+
+```
+DensityInterface.densityof(d, x) = exp(logdensityof(d, x))
+DensityInterface.logdensityof(d) = Base.Fix1(logdensityof, d)
+DensityInterface.densityof(d) = Base.Fix1(densityof, d)
+```
+
+are provided automatically (repeated here for clarity).
+
+Note that `logdensityof` strictly generalizes `logpdf`, since the posterior density will of course
+in general be unnormalized and hence not a probability density.
+
+The evaluation will usually work with the internal, concrete trace type, like `VarInfo` in
+Turing.jl:
+
+```julia
+logdensityof(m, vi)
+```
+
+But the user will more likely work on the interface using probability expressions:
+
+```julia
+logdensityof(m, @T(X = …))
+```
+
+(Note that this would replace the current `prob` string macro in Turing.jl.)
+
+Densities need not (and usually will not) be normalized.
+
+### Implementation notes
+
+It should be possible to make this fall back on the internal method, given the right definition and
+implementation of `maketrace`:
+
+```julia
+logdensityof(m, t::ProbabilityExpression) = logdensityof(m, maketrace(m, t))
+```
+
+There is one open question – should normalized and unnormalized densities be able to be
+distinguished?
+This could be done by dispatch as well, e.g., if the caller wants to ensure
+normalization:
+
+```
+logdensityof(g, @T(X = …, Y = …, Z = …); normalized=Val{true})
+```
+
+Although there is probably a better way through traits; maybe like for arrays, with
+`NormalizationStyle(g, t) = IsNormalized()`?
+
+## More on probability expressions
+
+Note that this needs to be a macro, if written this way, since the keys may themselves be more
+complex than just symbols (e.g., indexed variables). (Don't get hung up on the `@T` name, though;
+this is just a working draft.)
+
+The idea here is to standardize the construction (and manipulation) of *abstract probability
+expressions*, plus the interface for turning them into concrete traces for a specific model – like
+[`@formula`](https://juliastats.org/StatsModels.jl/stable/formula/#Modeling-tabular-data) and
+[`apply_schema`](https://juliastats.org/StatsModels.jl/stable/internals/#Semantics-time-(apply_schema))
+from StatsModels.jl are doing.
+
+Maybe the following would suffice to do that:
+
+```julia
+maketrace(m, t)::tracetype(m, t)
+```
+
+where `maketrace` produces a concrete trace corresponding to `t` for the model `m`, and `tracetype`
+is the corresponding `eltype`-like function giving you the concrete trace type for a certain model
+and probability expression combination.
+
+Possible extensions of this idea:
+
+  - Pearl-style do-notation: `@T(Y = y | do(X = x))`
+  - Allowing free variables, to specify model transformations: `query(m, @T(X | Y))`
+  - "Graph queries": `@T(X | Parents(X))`, `@T(Y | Not(X))` (a nice way to express Gibbs conditionals!)
+  - Predicate style for "measure queries": `@T(X < Y + Z)`
+
+The latter applications are the reason I originally liked the idea of the macro being called `@P`
+(or even `@𝓅` or `@ℙ`), since then it would look like a "Bayesian probability expression":
+`@P(X < Y + Z)`. But this would not be so meaningful in the case of representing a trace instance.
+
+Perhaps both `@T` and `@P` can coexist, and both produce different kinds of `ProbabilityExpression`
+objects?
+
+NB: the exact details of this kind of "schema application", and what results from it, will need to
+be specified in the interface of `AbstractModelTrace`, aka "the new `VarInfo`".
\ No newline at end of file
diff --git a/docs/src/varname_axioms.md b/docs/src/varname_axioms.md
new file mode 100644
index 0000000..9581361
--- /dev/null
+++ b/docs/src/varname_axioms.md
@@ -0,0 +1,84 @@
+# `VarName`-based axioms
+
+What follows is mostly an attempt to formalize subsumption.
+
+First, remember that in Turing.jl we can always work with _concretized_ `VarName`s: `begin`/`end`,
+`:`, and boolean indexing are all turned into some form of concrete cartesian or array indexing
+(assuming [this suggestion](https://github.com/TuringLang/AbstractPPL.jl/issues/35) is
+implemented). This makes all index comparisons static.
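+
+For example, once indices are concrete, checks with the existing `subsumes` function reduce to
+static comparisons – a sketch of the expected behaviour (see the axioms below for the underlying
+rules):
+
+```julia
+using AbstractPPL
+
+subsumes(@varname(x), @varname(x[1]))        # true: `x` covers all of its elements
+subsumes(@varname(x[1:10]), @varname(x[2]))  # true: the index range covers index 2
+subsumes(@varname(x.a), @varname(x.b))       # false: distinct fields are incomparable
+```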
+
+Now, `VarName`s have a compositional structure: they can be built by composing a root variable with
+more and more lenses (`VarName{v}()` starts off with an `IdentityLens`):
+
+```julia
+julia> vn = VarName{:x}() ∘ Setfield.IndexLens((1:10, 1)) ∘ Setfield.IndexLens((2,))
+x[1:10,1][2]
+```
+
+(_Note that the composition function, `∘`, is really in the wrong order; but this is a legacy of
+Setfield.jl._)
+
+By "subsumption", we mean the notion of a `VarName` expressing a more nested path than another one:
+
+```julia
+subsumes(@varname(x.a), @varname(x.a[1]))
+@varname(x.a) ⊒ @varname(x.a[1]) # \sqsupseteq
+@varname(x.a) ⋢ @varname(x.a[1]) # \nsqsubseteq
+```
+
+Thus, we have the following axioms for `VarName`s ("variables" are `VarName{n}()`):
+
+ 1. `x ⊑ x` for all variables `x`
+ 2. `x ≍ y` for `x ≠ y` (i.e., distinct variables are incomparable; `x ⋢ y` and `y ⋢ x`) (`≍` is `\asymp`)
+ 3. `x ∘ ℓ ⊑ x` for all variables `x` and lenses `ℓ`
+ 4. `x ∘ ℓ₁ ⊑ x ∘ ℓ₂ ⇔ ℓ₁ ⊑ ℓ₂`
+
+For the last axiom to work, we also have to define subsumption of individual, non-composed lenses:
+
+ 1. `PropertyLens(a) == PropertyLens(b) ⇔ a == b`, for all symbols `a`, `b`
+ 2. `FunctionLens(f) == FunctionLens(g) ⇔ f == g` (under extensional equality; I'm only mentioning
+    this in case we ever generalize to Bijector-ed variables like `@varname(log(x))`)
+ 3. `IndexLens(ι₁) ⊑ IndexLens(ι₂)` if the index tuple `ι₂` covers all indices in `ι₁`; for example,
+    `_[1, 2:10] ⊑ _[1:10, 1:20]`. (_This is a bit fuzzy and not all corner cases have been
+    considered yet!_)
+ 4. `IdentityLens() == IdentityLens()`
+ 5. `ℓ₁ ≍ ℓ₂`, otherwise
+
+Together, this should make `VarName`s under subsumption a reflexive poset.
+
+The fundamental requirement for `VarInfo`s is then:
+
+```
+vi[x ∘ ℓ] == get(vi[x], ℓ)
+```
+
+So we always want the following to work, automatically:
+
+```julia
+vi = insert!!(vi, vn, x)
+vi[vn] == x
+```
+
+(the trivial case), and
+
+```julia
+x = set!!(x, ℓ₁, a)
+x = set!!(x, ℓ₂, b)
+vi = insert!!(vi, vn, x)
+vi[vn ∘ ℓ₁] == a
+vi[vn ∘ ℓ₂] == b
+```
+
+since `vn` subsumes both `vn ∘ ℓ₁` and `vn ∘ ℓ₂`.
+
+Whether the opposite case is supported may depend on the implementation. The most complicated part
+is "unification":
+
+```julia
+vi = insert!!(vi, vn ∘ ℓ₁, a)
+vi = insert!!(vi, vn ∘ ℓ₂, b)
+get(vi[vn], ℓ₁) == a
+get(vi[vn], ℓ₂) == b
+```
+
+where `vn ∘ ℓ₁` and `vn ∘ ℓ₂` need to be recognized as "children" of a common parent `vn`.
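+
+As a tiny self-contained check of the fundamental requirement `vi[x ∘ ℓ] == get(vi[x], ℓ)`, one can
+play the right-hand side through with plain values and Setfield.jl lenses standing in for a stored
+trace entry (the names here are illustrative only):
+
+```julia
+using Setfield
+
+x = (a=[1, 2, 3], b=0.5)  # pretend this is the stored value `vi[vn]`
+ℓ₁ = @lens _.a
+ℓ₂ = @lens _.b
+
+get(x, ℓ₁) == [1, 2, 3]  # what `vi[vn ∘ ℓ₁]` should then return
+get(x, ℓ₂) == 0.5        # what `vi[vn ∘ ℓ₂]` should then return
+```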