**Reply:**

This all sounds good. We definitely need to clarify which objects we're dealing with and how we need to manipulate and model them, and sampler and kernel types sound like a suitable set of abstractions. As always, we need to consider how we could represent such things within the graph and rewrite frameworks, when possible/relevant. When we can, we're generally able to do more.
## Current interface for `construct_sampler`

To understand the kind of changes that having several possible samplers (including parametrized samplers) will require, let's take a non-trivial example: building sampling functions for the Horseshoe prior, taken from AeMCMC's test suite:
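A minimal sketch of such a model; the negative-binomial likelihood and the exact hyperparameters below are illustrative assumptions rather than the verbatim test-suite code:

```python
import aesara.tensor as at
from aesara.tensor.random.utils import RandomStream

srng = RandomStream(0)

X = at.matrix("X")  # design matrix

# Horseshoe prior on the regression coefficients
tau_rv = srng.halfcauchy(0, 1, name="tau")
lmbda_rv = srng.halfcauchy(0, 1, size=X.shape[1], name="lambda")
beta_rv = srng.normal(0, tau_rv * lmbda_rv, size=X.shape[1], name="beta")

# Negative-binomial regression likelihood (assumed here for illustration)
h_rv = srng.gamma(1.0, 1.0, name="h")
eta = X @ beta_rv
p = at.sigmoid(-eta)
Y_rv = srng.nbinom(h_rv, p, name="Y")
```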
We observe `Y_rv`, and we want to sample from the posterior distribution of `tau_rv`, `lmbda_rv`, `beta_rv`, and `h_rv`. AeMCMC currently provides a `construct_sampler` function:
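A sketch of the call, following the interface described in this section:

```python
import aemcmc

# A value variable standing in for the observed data
y_vv = Y_rv.clone()
y_vv.name = "y"

sample_steps, updates, initial_values = aemcmc.construct_sampler({Y_rv: y_vv}, srng)
```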
The `sample_steps` dictionary maps each random variable to the sampling step that was assigned to it. We can print the graph of the sampling step assigned to `lmbda_rv`:
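For instance, with Aesara's debug printer:

```python
import aesara

# Print the Aesara graph of the step that samples `lmbda_rv`
aesara.dprint(sample_steps[lmbda_rv])
```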
Samplers update the RNG state, and the caller will need to pass these updates to the compiler later, so we return them as well: `updates` is a dictionary containing the updates to the state of the random number generator that we passed via `srng`. And finally, we pass the initial value variables of the random variables we wish to sample from as inputs to the compiled function.
We can now easily build the graph for the sampler:
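A sketch, reusing the variables defined above:

```python
import aesara

rvs = (tau_rv, lmbda_rv, beta_rv, h_rv)

sampler = aesara.function(
    [X, y_vv] + [initial_values[rv] for rv in rvs],
    [sample_steps[rv] for rv in rvs],
    updates=updates,
)
```

The `updates` argument ensures the RNG state advances between calls.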
And we can run the `sampler` function in a Python loop.

## Issues with streams of samplers and parametrized samplers
Although the current interface works perfectly well for Gibbs samplers, the downstream caller has no high-level information about which transformations were applied to the graph or which samplers were assigned to the variables; they would have to reverse-engineer this information from the graph they receive. This becomes problematic once we return a stream of samplers: how are humans (or machines) to reason about what AeMCMC returns?
Other issues related to missing information arise with NUTS and parametrized kernels in general.
It is useful to look at this from two perspectives: first from a caller that does not care about the details of the sampler and just wants it to "work", and then from the perspective of a statistician who would like to inspect AeMCMC's returned sampler.
## If you just want to sample

We can simply create sampler types. Imagine we pass a complex model to AeMCMC but have no idea what the output sampling steps may be. All we can see is:
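Something like the following (the printed representation is hypothetical):

```python
print(sample_steps)
# {tau: tau_posterior, lambda: lambda_posterior, beta: beta_posterior, h: h_posterior}
# (hypothetical output: opaque graph variables, with no hint of which algorithm
#  produced them or whether they take extra parameters)
```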
If at least one of the RVs is assigned a parametrized sampler, we will run into an issue with the previous workflow:
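For example (a sketch; the scenario is illustrative):

```python
# If, say, `beta_rv` was assigned NUTS, its sampling step depends on parameter
# variables (step size, inverse mass matrix) that we never received as inputs
sampler = aesara.function(
    [X, y_vv] + [initial_values[rv] for rv in rvs],
    [sample_steps[rv] for rv in rvs],
    updates=updates,
)
# raises MissingInputError: the graph references inputs we never provided
```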
Indeed, compilation will fail with an unhelpful error message, since the variables representing the parameters are missing. We thus need to make it explicit that sampling steps might be parametrized. The simplest way to do that is to change the API slightly and make `construct_sampler` always return a `parameters` variable:
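A sketch of the amended call:

```python
sample_steps, updates, initial_values, parameters = aemcmc.construct_sampler(
    {Y_rv: y_vv}, srng
)
```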
But that is not enough: at the very least, one needs to know how to provide a value for these parameters. To set the value manually we need to know the type of the parameter and its shape. This information can be passed by setting the type and shape of the `TensorVariable`s when we initialize them. This is simple for models where random variables are built with concrete shape values, but it immediately becomes problematic when shapes are symbolic:
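For example:

```python
# Concrete shape: we can build a matching parameter value directly
beta_rv = srng.normal(0, 1, size=10, name="beta")  # inverse mass matrix: shape (10,)

# Symbolic shape: the parameter's shape is only known once `X` has a value
X = at.matrix("X")
beta_rv = srng.normal(0, 1, size=X.shape[1], name="beta")
```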
We thus need to provide shape information in a *user-friendly* way. We can even provide a function that returns the shape based on the model parameters, or one that provides an array of ones with this shape given the model parameters:
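For instance (hypothetical helpers, not existing AeMCMC functions):

```python
import numpy as np

def imm_shape(X_val):
    """Shape of the inverse mass matrix, given a concrete design matrix."""
    return (X_val.shape[1],)

def imm_ones(X_val):
    """A diagonal inverse mass matrix of ones with the appropriate shape."""
    return np.ones(imm_shape(X_val))
```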
A parameter type is needed to convey this information; for instance, for the inverse mass matrix parameter of the NUTS sampler:
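A sketch of what such a type could look like (names and fields are hypothetical):

```python
from dataclasses import dataclass
from typing import Callable, Tuple

from aesara.tensor.var import TensorVariable

@dataclass(frozen=True)
class InverseMassMatrix:
    """Parameter of the NUTS sampler (hypothetical sketch)."""

    # The symbolic variable that must be provided at compile time
    variable: TensorVariable
    # Returns the concrete shape given the model's concrete inputs
    shape_fn: Callable[..., Tuple[int, ...]]
```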
What if we don't want to provide values for the parameters and just want it to work? We need to bring in parameter adaptation.
## Parameter adaptation
We could provide a `build_adaptation_step` function that is dispatched on the parameter type, but not only would this require information about the previous sampler step, in many adaptation schemes it is not possible to decouple the updates of the parameters. The solution thus seems to be to provide a new high-level function:
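A sketch of the proposed signature (the names are hypothetical):

```python
adaptation_steps, adaptation_updates = construct_adaptation(sampler, parameters)
```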
where `sampler` is akin to `sample_steps` above, but with extra information about the kernels that produced the sampling steps. With the current notation, you would build an adaptation step in the following way:
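Roughly (a hypothetical sketch):

```python
sampler, updates, initial_values, parameters = construct_sampler({Y_rv: y_vv}, srng)
adaptation_steps, adaptation_updates = construct_adaptation(sampler, parameters)
```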
`construct_adaptation` uses the sampling steps found by `construct_sampler`. Now it works!

## If you want to understand

But what if you not only want it to work, but also want to understand AeMCMC's output? As a statistician, I would like to get some textual information about the sampling steps, for instance for a Gibbs sampling kernel:
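For instance, something along these lines (hypothetical output):

```
Gibbs kernel
  updates:        lambda
  conditioned on: tau, beta, h, Y
```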
But AeMCMC can also return parametrized sampling steps. If NUTS were assigned, I would like (indeed, need) to know, at the very least, which parameters the kernel takes, their types and shapes, and how their values can be set or adapted. So as not to burden the API too much (and not bother those who are not interested in the details), I suggest still returning the same number of values from `construct_sampler`:
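i.e. a sketch that keeps the same arity:

```python
sampler, updates, initial_values, parameters = construct_sampler({Y_rv: y_vv}, srng)
```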
and where `sampler[rv]` still returns the sampling step for the variable `rv`. The difference is that `sampler` is not a dictionary but a class:
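A sketch of what the class could look like (hypothetical, for illustration):

```python
from dataclasses import dataclass
from typing import Dict, List

from aesara.graph.fg import FunctionGraph
from aesara.tensor.var import TensorVariable

@dataclass
class Kernel:
    """A sampling unit; may update several RVs at once (e.g. NUTS)."""

    sample_steps: Dict[TensorVariable, TensorVariable]  # rv -> sampling-step output

@dataclass
class Sampler:
    kernels: List[Kernel]                         # kernels combined in this sampler
    rvs_to_kernels: Dict[TensorVariable, Kernel]  # which kernel updates each RV
    model_graph: FunctionGraph                    # graph used to build the sampler

    def __getitem__(self, rv: TensorVariable) -> TensorVariable:
        # `sampler[rv]` still returns the sampling step for `rv`
        return self.rvs_to_kernels[rv].sample_steps[rv]
```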
where `kernels` is a list of the kernels that are combined in the sampler. We need the notion of a kernel since some algorithms (e.g. NUTS) update the values of several variables at once (we could call it a `sampling_unit` as well, which, unlike "kernel", is not overused). `rvs_to_kernels` maps the RVs to the kernel that updates their values. `model_graph` is the `FunctionGraph` that was used to build the sampler (which the user can inspect using the tools provided by Aesara/AePPL).

By the way, the need to access the graph representation used by the samplers means that the transforms used by NUTS will need to be applied to `RandomVariable`s in AePPL.

## Representation within the graph / rewrite framework
TODO
This was originally a comment in #68 (comment)