Merged
27 commits
9c1f811
format for joss
aloctavodia Dec 20, 2025
f324c68
add figures folder
aloctavodia Dec 22, 2025
35907c5
add mention to all contributors, trim trailing white space
aloctavodia Jan 5, 2026
47f580c
d
aloctavodia Jan 5, 2026
a254820
fix trailing whitespace
aloctavodia Jan 5, 2026
e8bccaa
trim trailing whitespace
aloctavodia Jan 5, 2026
2c05a1b
Minor edits
avehtari Jan 8, 2026
e75d34a
Update Paananen reference to 2021 version
avehtari Jan 8, 2026
3e2743d
Update Paanen reference year
avehtari Jan 8, 2026
0a40a9f
update acknowledgements
aloctavodia Jan 8, 2026
d568622
Apply suggestions from code review
aloctavodia Jan 8, 2026
497b0ee
Apply suggestions from code review
aloctavodia Jan 9, 2026
a08a693
add comment about dimension order
aloctavodia Jan 9, 2026
752f7c4
add comments
aloctavodia Jan 13, 2026
d795f26
add gha for compiling paper
aloctavodia Jan 13, 2026
abe207e
add missing sections, fix missing DOIs
aloctavodia Jan 17, 2026
c3761ee
update AI usage disclosure and move to the bottom
aloctavodia Jan 19, 2026
f3b14ba
remove prefix from DOIs
aloctavodia Jan 23, 2026
8c7ce78
Apply suggestions from code review
aloctavodia Jan 24, 2026
4e27959
update stan reference
aloctavodia Jan 24, 2026
88449c8
fix urls
aloctavodia Jan 24, 2026
1be8f4f
address reviewer comments
aloctavodia Jan 28, 2026
9975e80
tweaks related to review comments
OriolAbril Feb 3, 2026
b818746
update example to use qds, add missing references, small tweaks
aloctavodia Feb 3, 2026
4962735
move text from caption to body
aloctavodia Feb 3, 2026
bda05d1
make statement of need more concrete
aloctavodia Feb 3, 2026
378f772
Edits to JOSS paper (#2537)
matt-graham Feb 24, 2026
28 changes: 28 additions & 0 deletions .github/workflows/draft-pdf.yml
@@ -0,0 +1,28 @@
name: Draft PDF
on:
  push:
    paths:
      - paper/**
      - .github/workflows/draft-pdf.yml

jobs:
  paper:
    runs-on: ubuntu-latest
    name: Paper Draft
    steps:
      - name: Checkout
        uses: actions/checkout@v4
      - name: Build draft PDF
        uses: openjournals/openjournals-draft-action@master
        with:
          journal: joss
          # This should be the path to the paper within your repo.
          paper-path: paper/paper.md
      - name: Upload
        uses: actions/upload-artifact@v4
        with:
          name: paper
          # This is the output path where Pandoc will write the compiled
          # PDF. Note, this should be the same directory as the input
          # paper.md
          path: paper/paper.pdf
Binary file added paper/figures/figure_0.png
142 changes: 142 additions & 0 deletions paper/paper.md
@@ -0,0 +1,142 @@
---
title: 'ArviZ: a modular and flexible library for exploratory analysis of Bayesian models'
tags:
  - Python
  - Bayesian statistics
  - Bayesian workflow
authors:
  - name: Osvaldo A Martin
    orcid: 0000-0001-7419-8978
    equal-contrib: true
    corresponding: true
    affiliation: 1
  - name: Oriol Abril-Pla
    orcid: 0000-0002-1847-9481
    equal-contrib: true
    corresponding: true
    affiliation: 2
  - name: Jordan Deklerk
    affiliation: 3
  - name: Seth D. Axen
    orcid: 0000-0003-3933-8247
    affiliation: 4
  - name: Colin Carroll
    orcid: 0000-0001-6977-0861
    affiliation: 2
  - name: Ari Hartikainen
    orcid: 0000-0002-4569-569X
    affiliation: 2
  - name: Aki Vehtari
    orcid: 0000-0003-2164-9469
    affiliation: "1, 5"
affiliations:
  - name: Aalto University, Espoo, Finland
    index: 1
  - name: arviz-devs
    index: 2
  - name: DICK's Sporting Goods, Coraopolis, Pennsylvania
    index: 3
  - name: University of Tübingen
    index: 4
  - name: ELLIS Institute Finland
    index: 5
date: 17 January 2026
bibliography: references.bib
---

# Summary

When working with Bayesian models, a range of related tasks must be addressed beyond inference itself. These include diagnosing the quality of Markov chain Monte Carlo (MCMC) samples, model criticism, model comparison, etc. We collectively refer to these activities as exploratory analysis of Bayesian models.

In this work, we present a redesigned version of ArviZ, a Python package for exploratory analysis of Bayesian models (EABM). The redesign emphasizes user control and modularity, and delivers a more flexible and efficient toolkit. With its renewed focus on modularity and usability, ArviZ is well-positioned to remain an essential tool for Bayesian modelers in both research and applied settings.


# Statement of need

Probabilistic programming has emerged as a powerful paradigm for statistical modeling, accompanied by a growing ecosystem of tools for model specification and inference. Effective modeling also requires robust support for uncertainty visualization, sampling diagnostics, model comparison, and model checking [@Gelman_2020; @Martin_2024; @Guo_2024]. ArviZ addresses this need by providing a unified, backend-agnostic library to perform these tasks. The original ArviZ paper [@Kumar_2019] described the landscape of probabilistic programming tools at the time and the need for such a library, a need that has only grown as the ecosystem has expanded.

The methods implemented in ArviZ are grounded in well-established statistical principles and provide robust, interpretable diagnostics and visualizations [@Vehtari_2017; @Gelman_2019; @Dimitriadis_2021; @Paananen_2021; @Padilla_2021; @Vehtari_2021; @Sailynoja_2022; @Kallioinen_2023; @Sailynoja_2025]. Modern Bayesian practice is a rapidly advancing field in which new methodological developments continually extend the range and complexity of models that can be fit in practice. For instance, the methods used to compute key ArviZ features such as `ess`, `rhat`, `loo`, or `compare` have been improved since 2019, and adopting the new implementations required significant development effort because it was not possible to change one part of ArviZ without also adapting everything that interacted with it. The redesign addresses these challenges by modularizing the codebase, allowing individual components to be updated or replaced without affecting the entire system. This modularity not only facilitates maintenance and updates but also encourages community contributions, as developers can focus on specific components without needing to understand the entire codebase.


# State of the field

In the Python Bayesian ecosystem, ArviZ occupies a niche comparable to tools in the R/Stan community such as posterior [@gelman_2013;@Vehtari_2021], loo [@Vehtari_2017;@loo], bayesplot [@bayesplot0;@bayesplot1], priorsense [@Kallioinen_2023], and ggdist [@kay_2024]. These tools share similar goals while reflecting different language ecosystems and workflows.

# Research impact statement

ArviZ [@Kumar_2019] is a Python package for exploratory analysis of Bayesian models that has been widely used in academia and industry since its introduction in 2019, with over 700 citations and 75 million downloads. Its goal is to integrate seamlessly with established probabilistic programming languages and statistical interfaces, such as PyMC [@Abril-pla_2023], Stan (via the cmdstanpy interface) [@stan], Pyro [@Bingham_2019], NumPyro [@Phan_2019], emcee [@emcee], and Bambi [@Capretto_2022], among others.

The maturity of ArviZ has also led to other initiatives, including ArviZ.jl [@arvizjl_2025] for Julia, PreliZ [@icazatti_2023] for prior elicitation, and the development of educational resources [@eabm_2025].

# Software design

The previous ArviZ design divided the package into three submodules, which are now available as three independently installable packages: `arviz-base`, `arviz-stats`, and `arviz-plots`. The new architecture lets users customize the installation and use only the components they need. Key design changes include:

General functionality, data processing, and data input/output (I/O) have been streamlined and enhanced for greater versatility. Previously, ArviZ used the custom `InferenceData` class to organize and store the high-dimensional outputs of Bayesian inference in a structured, labeled format, enabling efficient analysis, metadata persistence, and serialization. This class has been replaced with the `DataTree` class from xarray [@Hoyer_2017], which, like the original `InferenceData`, supports grouping but is more flexible, enabling richer nesting and automatic support for all xarray I/O formats. Additionally, converters allow more flexibility in the dimensionality, naming, and indexing of their generated outputs.

Statistical functions are now accessible through two distinct interfaces:

* A low-level array interface with only `numpy` [@harris_2020] and `scipy` [@virtanen_2020] as dependencies, intended for advanced users and developers of third-party libraries.
* A higher-level xarray interface designed for end users, which simplifies usage by automating common tasks and handling metadata.
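
The relationship between the two layers can be sketched with a toy example: an array-level function that needs explicit axis positions, and a labeled wrapper that looks those positions up from dimension names. The names `Labeled`, `sample_mean_array`, and `sample_mean` are hypothetical and exist only for this illustration; they are not part of ArviZ, and the real high-level interface relies on xarray rather than a hand-written class.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Labeled:
    """Minimal stand-in for a labeled array (hypothetical, not an ArviZ class)."""

    values: np.ndarray
    dims: tuple


def sample_mean_array(ary, chain_axis=0, draw_axis=1):
    """Array layer: the caller must say which axes hold chains and draws."""
    return np.mean(ary, axis=(chain_axis, draw_axis))


def sample_mean(labeled, sample_dims=("chain", "draw")):
    """Labeled layer: resolve axis positions from names, then delegate."""
    chain_axis, draw_axis = (labeled.dims.index(d) for d in sample_dims)
    out = sample_mean_array(labeled.values, chain_axis, draw_axis)
    kept = tuple(d for d in labeled.dims if d not in sample_dims)
    return Labeled(out, kept)


x = Labeled(np.arange(24.0).reshape(2, 3, 4), ("chain", "draw", "school"))
# Same data stored with a different axis order.
y = Labeled(x.values.transpose(2, 0, 1), ("school", "chain", "draw"))

res_x = sample_mean(x)
res_y = sample_mean(y)
print(res_x.dims, res_x.values)
```

In this sketch the labeled layer gives the same answer regardless of how the axes are ordered in memory, which is the kind of bookkeeping the higher-level interface takes off the user's hands.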

Plotting functions have also been redesigned to support modularity at multiple levels:

* At a high level, ArviZ offers a collection of “batteries-included” plots. These are built-in plotting functions providing sensible defaults for common tasks like MCMC sampling diagnostics, predictive checks, and model comparison.
* At an intermediate level, the application programming interface enables easier customization of batteries-included plots and simplifies the creation of new plots. This is achieved through the `PlotCollection` class, which enables developers and advanced users to focus solely on the plotting logic, delegating any faceting or aesthetic mappings to `PlotCollection`.
* At a lower level, we have improved the separation between computational and plotting logic, reducing code duplication and enhancing modular design. These changes also facilitate support for multiple plotting backends, improving extensibility and maintainability. Currently, ArviZ supports three plotting backends: matplotlib [@Hunter_2007], Bokeh [@Bokeh_2018], and plotly [@plotly_2015].

Thanks to this new design, the cost of adding "batteries-included" plots has been reduced by more than half, even though ArviZ now supports one extra backend. Consequently, the redesigned ArviZ already has 37 "batteries-included" plots, 10 more than the 0.x versions.

## Examples

For the first example, we use the low-level array interface to compute the effective sample sizes for some fake data. We construct an array resembling data from MCMC sampling with 4 chains and 1000 draws for two posterior variables. When using the array interface we need to specify which axes represent the chains and which the draws.

import numpy as np
from arviz_stats.base import array_stats

rng = np.random.default_rng()
samples = rng.normal(size=(4, 1000, 2)) # (chain, draw, variable)
array_stats.ess(samples, chain_axis=0, draw_axis=1)
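
To make the role of `chain_axis` and `draw_axis` more concrete, the following sketch computes a basic split-$\hat{R}$ by hand with NumPy. It is an assumption-laden illustration, not ArviZ's implementation: it uses the classic between/within variance formula without rank normalization, and the function name `split_rhat` is hypothetical.

```python
import numpy as np


def split_rhat(samples, chain_axis=0, draw_axis=1):
    """Classic split R-hat (no rank normalization); illustrative only."""
    # Bring the chain and draw axes to the front: (chain, draw, ...).
    x = np.moveaxis(np.asarray(samples, dtype=float), (chain_axis, draw_axis), (0, 1))
    n_draws = x.shape[1]
    half = n_draws // 2
    # Split every chain into two halves (the middle draw is dropped if odd).
    x = np.concatenate([x[:, :half], x[:, n_draws - half:]], axis=0)
    # Mean of the within-chain variances.
    within = x.var(axis=1, ddof=1).mean(axis=0)
    # Between-chain variance, scaled by the number of draws per split chain.
    between = half * x.mean(axis=1).var(axis=0, ddof=1)
    var_plus = (half - 1) / half * within + between / half
    return np.sqrt(var_plus / within)


rng = np.random.default_rng(0)
samples = rng.normal(size=(4, 1000, 2))  # (chain, draw, variable)
rhat = split_rhat(samples, chain_axis=0, draw_axis=1)

# The same draws stored as (draw, chain, variable) give the same result
# once the axes are declared accordingly.
rhat_swapped = split_rhat(np.moveaxis(samples, 0, 1), chain_axis=1, draw_axis=0)

# Shifting one chain away from the others should inflate the diagnostic.
shifted = samples.copy()
shifted[0] += 3.0
rhat_shifted = split_rhat(shifted)
print(rhat, rhat_shifted)
```

Declaring the axes explicitly is what lets an array-level function accept data in either layout without guessing.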

The array interface is lightweight and intended for advanced users and library developers. For most users, we instead recommend the xarray interface, as it is more user-friendly and automates many tasks. When converting the NumPy array to a `DataTree`, ArviZ assigns `chain` and `draw` as named dimensions based on the assumed dimension order, so this information is already encoded in the resulting object and does not need to be specified explicitly when calling other functions.

import arviz as az
dt_samples = az.convert_to_datatree(samples)
az.ess(dt_samples)

The only required argument for batteries-included plots, like `plot_dist`, is the input data, typically a `DataTree` (`dt`). In this example we also apply optional customizations.

az.style.use('arviz-variat')
dt = az.load_arviz_data("centered_eight")
pc = az.plot_dist(
    dt,
    kind="dot",
    visuals={
        "dist": {"marker": "C6"},
        "point_estimate_text": False,
    },
    aes={"color": ["school"]},
)
pc.add_legend("school", loc="outside right upper")

![plot_dist with color mapped to school dimension.\label{fig:plot_dist}](figures/figure_0.png){width=4.5in}

To create \autoref{fig:plot_dist}, we change the `kind` argument of `plot_dist` from its default, "kde", to "dot" to produce quantile dot plots [@kay_2016], and map the `school` dimension to color so that each school is shown in a different hue. Variables that do not have a `school` dimension (such as `mu` and `tau`) are automatically assigned a neutral color. We also disable the point-estimate text, set a custom marker style for the dots, and finally add a legend for the schools.
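
The idea behind mapping a dimension to an aesthetic can be sketched in a few lines of plain Python. This is only a conceptual illustration under stated assumptions: `assign_colors` is a hypothetical helper, the school labels and palette are placeholders, and `theta` is assumed here to be the variable that carries the `school` dimension. It does not reproduce how `PlotCollection` is implemented.

```python
from itertools import cycle


def assign_colors(variables, coords, dim, palette, neutral="gray"):
    """Map each coordinate value of `dim` to a color, cycling through `palette`.

    Variables that have `dim` get the per-coordinate mapping; variables
    without it fall back to a single neutral color.
    """
    dim_colors = dict(zip(coords[dim], cycle(palette)))
    return {
        name: dict(dim_colors) if dim in dims else neutral
        for name, dims in variables.items()
    }


# Hypothetical model structure: variable name -> dimensions besides chain/draw.
variables = {"mu": (), "tau": (), "theta": ("school",)}
coords = {"school": ["A", "B", "C"]}  # placeholder labels

colors = assign_colors(variables, coords, "school", palette=("C0", "C1"))
print(colors)
```

With a palette shorter than the number of coordinate values, the colors simply repeat, which is one common way to handle the mismatch.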

For more examples and a more comprehensive overview, see the [ArviZ documentation](https://python.arviz.org/en/latest/) and the [EABM guide](https://arviz-devs.github.io/EABM/) [@eabm_2025]. These resources include a wide range of examples designed for all types of users, from casual users to advanced analysts and developers looking to use ArviZ in their projects or libraries.

# AI usage disclosure

Generative AI tools were used during software development and documentation in a limited capacity, primarily to assist with rewording and minor code suggestions. All AI-assisted contributions were reviewed and edited by the authors. Core design decisions, feature development, and scientific or technical judgment were carried out by the authors, and all code and claims were tested and manually verified to ensure correctness.

# Acknowledgements

We thank our fiscal sponsor, NumFOCUS, a nonprofit 501(c)(3) public charity, for their operational and financial support. We also thank all the contributors to the `arviz`, `arviz-base`, `arviz-stats`, and `arviz-plots` repositories, including code contributors, documentation writers, issue reporters, and users who have provided feedback and suggestions.

This research was supported by:

* The Research Council of Finland Flagship Program "Finnish Center for Artificial Intelligence" (FCAI)
* Research Council of Finland grant 340721
* Essential Open Source Software Round 4 grant by the Chan Zuckerberg Initiative (CZI)
* Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) under Germany’s Excellence Strategy – EXC number 2064/1 – Project number 390727645

# References