Further improving the 'meta' simulator

The existing `meta` simulator feature is useful: it lets us tell the simulator about meta variables that affect the shape of other simulated variables and therefore cannot vary within a batch.

I believe we can improve this feature further, as I illustrate below using our linear regression example. We use the simulator:

```Python
import numpy as np
import bayesflow as bf

# TODO: do we have to require a "batch_shape" argument in the function passed to meta_fn?
def meta(batch_shape):
    # N: number of observations in a dataset
    N = np.random.randint(5, 15)
    return dict(N=N)

def prior():
    # beta: regression coefficients (intercept, slope)
    beta = np.random.normal([2, 0], [3, 1])
    # sigma: residual standard deviation
    sigma = np.random.gamma(1, 1)
    return dict(beta=beta, sigma=sigma)

def likelihood(beta, sigma, N):
    # x: predictor variable
    x = np.random.normal(0, 1, size=N)
    # y: response variable
    y = np.random.normal(beta[0] + beta[1] * x, sigma, size=N)
    return dict(y=y, x=x)

simulator = bf.simulators.make_simulator([prior, likelihood], meta_fn=meta)
```

I see the following three problems:

1. By passing `meta_fn = meta`, we are internally computing

```Python
meta = make_simulator(meta, is_batched=True)
```

but this is kind of a lie. `meta` is not really batched. It just *should not be auto-batched* because its variables need to remain constant within each batch. 

2. Since we treat `meta` as "already batched", `meta` must take `batch_shape` as its first argument, or at least have `_` there to indicate the presence of an argument even if it is unused. This is quite a technical requirement that is hard to communicate to users. They may just do it because we told them to, but I believe it will remain a bit weird.

3. Later on, in the `adapter`, we have to figure out which of the `meta` variables are actually already batched and which should just be constant within each batch. This is what `adapter.broadcast` does, and it's good functionality. But perhaps we can relieve it of some of its burden by better informing the adapter about which variables are `meta`, i.e. which variables *are definitely not coming with a batch_size dimension and need one such that their values are constant within each batch*.
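To make "constant within each batch" concrete, here is a minimal NumPy-only sketch. It does not call `adapter.broadcast`; the helper name `broadcast_meta` and the choice of a trailing singleton dimension are assumptions made purely for illustration.

```python
import numpy as np

def broadcast_meta(value, batch_size):
    # Hypothetical helper, not part of BayesFlow: give an unbatched meta
    # value a leading batch dimension by repeating it batch_size times.
    value = np.atleast_1d(np.asarray(value))
    return np.broadcast_to(value, (batch_size, *value.shape)).copy()

N = 7  # a meta variable drawn once for the whole batch
N_batched = broadcast_meta(N, batch_size=4)
print(N_batched.shape)  # (4, 1)
```

Every row of `N_batched` holds the same value, which is exactly the property the adapter currently has to infer for itself.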

I propose the following solution. Add a new `is_meta` flag to `make_simulator`, which is also automatically set to `True` for simulators passed to `meta_fn`. Any variables produced by such a `meta` simulator will carry the information that *they are definitely not coming with a batch_size dimension and need one such that their values are constant within each batch*. I think this would solve all three problems:

1. We are completely transparent to the user about what a `meta` simulator does.
2. We no longer need an unused `batch_size` (or `batch_shape`) argument. In fact, we will prohibit it in meta simulators.
3. The adapter can safely (and automatically via the default adapters) broadcast all meta variables to include the `batch_size` as first dimension.
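The proposed behaviour could look roughly like the sketch below. It is a plain NumPy stand-in, not BayesFlow code: `sample_with_meta` is a hypothetical function, and the way arguments are matched by name is an assumption about how the pieces would be wired together. Note that `meta` takes no `batch_shape` argument here.

```python
import inspect
import numpy as np

def sample_with_meta(meta_fn, fns, batch_size):
    # Hypothetical stand-in for the proposed is_meta behaviour.
    # 1. Call the meta function exactly once, without batch_shape.
    meta_vars = meta_fn()
    # 2. Auto-batch the remaining functions by calling them batch_size times.
    draws = []
    for _ in range(batch_size):
        state = dict(meta_vars)
        for fn in fns:
            # Pass each function only the variables it asks for by name.
            names = inspect.signature(fn).parameters
            state.update(fn(**{k: state[k] for k in names}))
        draws.append(state)
    batch = {
        k: np.stack([d[k] for d in draws])
        for k in draws[0] if k not in meta_vars
    }
    # 3. Broadcast meta variables so batch_size is their first dimension.
    for k, v in meta_vars.items():
        v = np.atleast_1d(np.asarray(v))
        batch[k] = np.broadcast_to(v, (batch_size, *v.shape)).copy()
    return batch

def meta():
    return dict(N=np.random.randint(5, 15))

def prior():
    beta = np.random.normal([2, 0], [3, 1])
    sigma = np.random.gamma(1, 1)
    return dict(beta=beta, sigma=sigma)

def likelihood(beta, sigma, N):
    x = np.random.normal(0, 1, size=N)
    y = np.random.normal(beta[0] + beta[1] * x, sigma, size=N)
    return dict(y=y, x=x)

batch = sample_with_meta(meta, [prior, likelihood], batch_size=3)
print({k: v.shape for k, v in batch.items()})
```

Because `N` is drawn once per batch, `x` and `y` have the same length in every draw and can be stacked, while `N` itself only gains its batch dimension in the final broadcasting step.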

Any thoughts on this proposal would be much appreciated!
