45 changes: 12 additions & 33 deletions README.md
@@ -1,11 +1,9 @@
# Simulation Based Calibration
# Simuk

A [PyMC](http://docs.pymc.io) and [Bambi](https://bambinos.github.io/bambi/) implementation of the algorithms from:

Sean Talts, Michael Betancourt, Daniel Simpson, Aki Vehtari, Andrew Gelman: “Validating Bayesian Inference Algorithms with Simulation-Based Calibration”, 2018; [arXiv:1804.06788](http://arxiv.org/abs/1804.06788)

Many thanks to the authors for providing open, reproducible code and implementations in `rstan` and `PyStan` ([link](https://github.com/seantalts/simulation-based-calibration)).
Simuk is a Python library for simulation-based calibration (SBC) and the generation of synthetic data.
Simulation-based calibration is a method for validating Bayesian inference by checking whether posterior distributions are consistent with the theoretical expectations derived from the prior.

Simuk works with [PyMC](http://docs.pymc.io), [Bambi](https://bambinos.github.io/bambi/) and [NumPyro](https://num.pyro.ai/en/latest/index.html) models.
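
The core idea is easy to sketch in plain Python. The following is a minimal, self-contained toy example (a conjugate normal-normal model with a known posterior; an illustration of the method, not Simuk's API): draw a parameter from the prior, simulate data, sample the posterior, and record where the prior draw ranks among the posterior draws. Over many repetitions the ranks should be approximately uniform.

```python
# Toy SBC illustration with an analytic posterior (not Simuk's API).
import numpy as np

rng = np.random.default_rng(42)
num_sims, num_draws, n_obs, sigma = 500, 200, 10, 1.0

ranks = []
for _ in range(num_sims):
    mu_true = rng.normal(0.0, 1.0)              # draw mu from the prior N(0, 1)
    y = rng.normal(mu_true, sigma, size=n_obs)  # simulate data given that draw
    # Exact conjugate posterior for mu (prior N(0, 1), known sigma)
    post_var = 1.0 / (1.0 + n_obs / sigma**2)
    post_mean = post_var * y.sum() / sigma**2
    posterior = rng.normal(post_mean, np.sqrt(post_var), size=num_draws)
    ranks.append(int((posterior < mu_true).sum()))  # rank of the prior draw

# For calibrated inference the ranks are roughly uniform on {0, ..., num_draws}
print(np.histogram(ranks, bins=10, range=(0, num_draws))[0])
```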

## Installation

@@ -22,6 +20,7 @@ pip install simuk
```python
import numpy as np
import pymc as pm
from arviz_plots import plot_ecdf_pit

data = np.array([28.0, 8.0, -3.0, 7.0, -1.0, 1.0, 18.0, 12.0])
sigma = np.array([15.0, 10.0, 16.0, 11.0, 9.0, 11.0, 10.0, 18.0])
@@ -48,35 +47,15 @@ pip install simuk
should be close to uniform and within the oval envelope.

```python
sbc.plot_results()
plot_ecdf_pit(sbc.simulations,
              visuals={"xlabel": False},
);
```

![Simulation based calibration plots, ecdf](ecdf.png)

## References

## What is going on here?

The [paper on the arXiv](http://arxiv.org/abs/1804.06788) is very well written, and explains the algorithm quite well.

Morally, the example below is exactly what this library does, but it generalizes to more complicated models:

```python
with pm.Model() as model:
    x = pm.Normal('x')
    pm.Normal('y', mu=x, observed=y)
```

Then what this library does is compute

```python
with my_model():
    prior_samples = pm.sample_prior_predictive(num_trials)

simulations = {'x': []}
for idx in range(num_trials):
    y_tilde = prior_samples['y'][idx]
    x_tilde = prior_samples['x'][idx]
    with model(y=y_tilde):
        idata = pm.sample()
    simulations['x'].append((idata.posterior['x'] < x_tilde).sum())
```
- Talts, S., Betancourt, M., Simpson, D., Vehtari, A., and Gelman, A. (2018). “Validating Bayesian Inference Algorithms with Simulation-Based Calibration.” [arXiv:1804.06788](https://doi.org/10.48550/arXiv.1804.06788).
- Modrák, M., Moon, A. H., Kim, S., Bürkner, P., Huurre, N., Faltejsková, K., Gelman, A., and Vehtari, A. (2023). “Simulation-based calibration checking for Bayesian computation: The choice of test quantities shapes sensitivity.” Bayesian Analysis, advance publication. DOI: [10.1214/23-BA1404](https://projecteuclid.org/journals/bayesian-analysis/volume--1/issue--1/Simulation-Based-Calibration-Checking-for-Bayesian-Computation--The-Choice/10.1214/23-BA1404.full).
- Säilynoja, T., Schmitt, M., Bürkner, P., and Vehtari, A. (2025). “Posterior SBC: Simulation-Based Calibration Checking Conditional on Data.” [arXiv:2502.03279](https://doi.org/10.48550/arXiv.2502.03279).
14 changes: 6 additions & 8 deletions docs/examples/gallery/sbc.md
@@ -62,9 +62,8 @@ We expect a uniform distribution; the gray envelope corresponds to the 94% credible interval.
```{jupyter-execute}

plot_ecdf_pit(sbc.simulations,
              pc_kwargs={'col_wrap': 4},
              plot_kwargs={"xlabel": False},
)
              visuals={"xlabel": False},
);
```

:::::
@@ -131,7 +130,7 @@ def eight_schools_cauchy_prior(J, sigma, y=None):
nuts_kernel = NUTS(eight_schools_cauchy_prior)
```

Pass the model to the `SBC` class, set the number of simulations to 100, and run the simulations. For NumPyro models,
we also pass the ``data_dir`` parameter.
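
A rough sketch of what that call might look like, assuming the eight-schools data used by the model above; the parameter names ``num_simulations`` and ``data_dir`` come from the ``SBC`` signature shown later in this diff, while the import path and whether the NUTS kernel (rather than the model function) is passed are assumptions:

```python
# Hypothetical sketch only; the actual cell is not shown in this diff.
from simuk import SBC  # assumed import path

sbc = SBC(
    nuts_kernel,                                 # assumed: the NUTS kernel built above
    num_simulations=100,
    data_dir={"J": J, "sigma": sigma, "y": y},   # assumed mapping of the observed data
)
sbc.run_simulations()
```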

```{jupyter-execute}
@@ -147,8 +146,7 @@ To compare the prior and posterior distributions, we will plot the results.
We expect a uniform distribution; the gray envelope corresponds to the 94% credible interval.

```{jupyter-execute}
plot_ecdf_pit(sbc.simulations,
              pc_kwargs={'col_wrap': 4},
              plot_kwargs={"xlabel": False}
)
plot_ecdf_pit(sbc.simulations,
              visuals={"xlabel": False},
);
```
10 changes: 7 additions & 3 deletions docs/index.rst
@@ -18,6 +18,7 @@ In our case, we will take a PyMC model and pass it into our ``SBC`` class.

.. code-block:: python

    from arviz_plots import plot_ecdf_pit
    import numpy as np
    import pymc as pm

@@ -44,7 +45,9 @@ Plot the empirical CDF to compare the differences between the prior and posterior

.. code-block:: python

    sbc.plot_results()
    plot_ecdf_pit(sbc.simulations,
                  visuals={"xlabel": False},
    );

The lines should be nearly uniform and fall within the oval envelope. This suggests that the prior and posterior distributions
are properly aligned and that there are no significant biases or issues with the model.
@@ -82,5 +85,6 @@ are properly aligned and that there are no significant biases or issues with the model.
References
----------

- Talts, Sean, Michael Betancourt, Daniel Simpson, Aki Vehtari, and Andrew Gelman. 2018. “Validating Bayesian Inference Algorithms with Simulation-Based Calibration.” `arXiv:1804.06788 <https://doi.org/10.48550/arXiv.1804.06788>`_.
- Modrák, M., Moon, A. H., Kim, S., Bürkner, P., Huurre, N., Faltejsková, K., … & Vehtari, A. (2023). Simulation-based calibration checking for Bayesian computation: The choice of test quantities shapes sensitivity. Bayesian Analysis, advance publication, DOI: `10.1214/23-BA1404 <https://projecteuclid.org/journals/bayesian-analysis/volume--1/issue--1/Simulation-Based-Calibration-Checking-for-Bayesian-Computation--The-Choice/10.1214/23-BA1404.full>`_
- Talts, S., Betancourt, M., Simpson, D., Vehtari, A., and Gelman, A. (2018). “Validating Bayesian Inference Algorithms with Simulation-Based Calibration.” `arXiv:1804.06788 <https://doi.org/10.48550/arXiv.1804.06788>`_.
- Modrák, M., Moon, A. H., Kim, S., Bürkner, P., Huurre, N., Faltejsková, K., Gelman, A., and Vehtari, A. (2023). “Simulation-based calibration checking for Bayesian computation: The choice of test quantities shapes sensitivity.” Bayesian Analysis, advance publication. DOI: `10.1214/23-BA1404 <https://projecteuclid.org/journals/bayesian-analysis/volume--1/issue--1/Simulation-Based-Calibration-Checking-for-Bayesian-Computation--The-Choice/10.1214/23-BA1404.full>`_.
- Säilynoja, T., Schmitt, M., Bürkner, P., and Vehtari, A. (2025). “Posterior SBC: Simulation-Based Calibration Checking Conditional on Data.” `arXiv:2502.03279 <https://doi.org/10.48550/arXiv.2502.03279>`_.
Binary file modified ecdf.png
Binary file removed hist.png
Binary file not shown.
2 changes: 1 addition & 1 deletion requirements-docs.txt
@@ -2,7 +2,7 @@ pydata-sphinx-theme>=0.6.3
myst-nb
pymc>=5.20.1
bambi>=0.15.0
arviz_plots @ git+https://github.com/arviz-devs/arviz-plots@main
arviz_plots>=0.6.0
sphinx>=4
sphinx-copybutton
sphinx_tabs
2 changes: 0 additions & 2 deletions simuk/sbc.py
@@ -71,8 +71,6 @@ class SBC:

sbc = SBC(model)
sbc.run_simulations()
sbc.plot_results()

"""

def __init__(self, model, num_simulations=1000, sample_kwargs=None, seed=None, data_dir=None):