
Commit 3900c2c
Updated code to v5; added annotations
1 parent 92f15dc

File tree: 2 files changed (+1356 −1271 lines)

examples/mixture_models/dependent_density_regression.ipynb

Lines changed: 1303 additions & 1245 deletions
Large diffs are not rendered by default.

examples/mixture_models/dependent_density_regression.myst.md

Lines changed: 53 additions & 26 deletions
@@ -10,24 +10,33 @@ kernelspec:
   name: python3
 ---
 
+(dependent_density_regression)=
 # Dependent density regression
+:::{post} 2017
+:tags: mixture model, nonparametric
+:category: intermediate
+:author: Austin Rochford
+:::
+
 In another [example](dp_mix.ipynb), we showed how to use Dirichlet processes to perform Bayesian nonparametric density estimation. This example expands on the previous one, illustrating dependent density regression.
 
 Just as Dirichlet process mixtures can be thought of as infinite mixture models that select the number of active components as part of inference, dependent density regression can be thought of as infinite [mixtures of experts](https://en.wikipedia.org/wiki/Committee_machine) that select the active experts as part of inference. Their flexibility and modularity make them powerful tools for performing nonparametric Bayesian data analysis.
 
 ```{code-cell} ipython3
+from io import StringIO
+
 import arviz as az
 import numpy as np
 import pandas as pd
-import pymc3 as pm
+import pymc as pm
+import pytensor.tensor as pt
+import requests
 import seaborn as sns
 
-from IPython.display import HTML
 from matplotlib import animation as ani
 from matplotlib import pyplot as plt
-from theano import tensor as tt
 
-print(f"Running on PyMC3 v{pm.__version__}")
+print(f"Running on PyMC v{pm.__version__}")
 ```
 
 ```{code-cell} ipython3
@@ -49,7 +58,8 @@ def standardize(x):
     return (x - x.mean()) / x.std()
 
 
-df = pd.read_csv(DATA_URI, sep=r"\s{1,3}", engine="python").assign(
+response = requests.get(DATA_URI, verify=False)
+df = pd.read_csv(StringIO(response.text), sep=r"\s{1,3}", engine="python").assign(
     std_range=lambda df: standardize(df.range), std_logratio=lambda df: standardize(df.logratio)
 )
 ```
@@ -120,6 +130,8 @@ plt.close()
 ```
 
 ```{code-cell} ipython3
+from IPython.display import HTML
+
 HTML(animation.to_html5_video())
 ```
 
@@ -147,12 +159,12 @@ For the LIDAR data set, we use independent normal priors $\alpha_i \sim N(0, 5^2
 
 ```{code-cell} ipython3
 def norm_cdf(z):
-    return 0.5 * (1 + tt.erf(z / np.sqrt(2)))
+    return 0.5 * (1 + pt.erf(z / np.sqrt(2)))
 
 
 def stick_breaking(v):
-    return v * tt.concatenate(
-        [tt.ones_like(v[:, :1]), tt.extra_ops.cumprod(1 - v, axis=1)[:, :-1]], axis=1
+    return v * pt.concatenate(
+        [pt.ones_like(v[:, :1]), pt.extra_ops.cumprod(1 - v, axis=1)[:, :-1]], axis=1
     )
 ```
 
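As a quick sanity check on the truncated stick-breaking construction above (an illustrative sketch, not part of this commit): the weights $w_k = v_k \prod_{j<k}(1 - v_j)$ telescope, so each row sums to $1 - \prod_k(1 - v_k)$, which approaches one as the truncation level $K$ grows. The same logic in plain NumPy:

```python
import numpy as np


def stick_breaking_np(v):
    # NumPy analogue of stick_breaking above: w_k = v_k * prod_{j<k} (1 - v_j),
    # computed independently for each row of v.
    return v * np.concatenate(
        [np.ones_like(v[:, :1]), np.cumprod(1 - v, axis=1)[:, :-1]], axis=1
    )


rng = np.random.default_rng(0)
v = rng.uniform(size=(3, 20))  # 3 data points, K = 20 truncated components
w = stick_breaking_np(v)

# The telescoping sum leaves exactly 1 - prod_k(1 - v_k) of the stick allocated.
np.testing.assert_allclose(w.sum(axis=1), 1 - np.prod(1 - v, axis=1))
```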
@@ -167,7 +179,7 @@ with pm.Model(coords={"N": np.arange(N), "K": np.arange(K) + 1, "one": [1]}) as
     alpha = pm.Normal("alpha", 0.0, 5.0, dims="K")
     beta = pm.Normal("beta", 0.0, 5.0, dims=("one", "K"))
     x = pm.Data("x", std_range)
-    v = norm_cdf(alpha + pm.math.dot(x, beta))
+    v = norm_cdf(alpha + x @ beta)
     w = pm.Deterministic("w", stick_breaking(v), dims=["N", "K"])
 ```
 
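A note on the shapes in the change above (an illustration, not part of the commit): `x` is a column of shape `(N, 1)` and `beta` has dims `("one", "K")`, i.e. shape `(1, K)`, so `x @ beta` broadcasts to the `(N, K)` array of per-point, per-component stick-breaking fractions, just as `pm.math.dot` did. In plain NumPy:

```python
import numpy as np

N, K = 5, 20
x = np.linspace(-1.0, 1.0, N)[:, np.newaxis]  # shape (N, 1), like the std_range column
beta = np.zeros((1, K))                       # shape (1, K), matching dims ("one", "K")

print((x @ beta).shape)  # (N, K): one linear predictor per data point and component
```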
@@ -194,7 +206,7 @@ for the conditional component means.
 with model:
     gamma = pm.Normal("gamma", 0.0, 10.0, dims="K")
     delta = pm.Normal("delta", 0.0, 10.0, dims=("one", "K"))
-    mu = pm.Deterministic("mu", gamma + pm.math.dot(x, delta))
+    mu = pm.Deterministic("mu", gamma + x @ delta)
 ```
 
 Finally, we place the prior $\tau_i \sim \textrm{Gamma}(1, 1)$ on the component precisions.
@@ -211,12 +223,8 @@ pm.model_to_graphviz(model)
 We now sample from the dependent density regression model.
 
 ```{code-cell} ipython3
-SAMPLES = 20000
-BURN = 10000
-
 with model:
-    step = pm.Metropolis()
-    trace = pm.sample(SAMPLES, tune=BURN, step=step, random_seed=SEED, return_inferencedata=True)
+    trace = pm.sample(random_seed=SEED)
 ```
 
 To verify that truncation did not unduly influence our results, we plot the largest posterior expected mixture weight for each component. (In this model, each point has a mixture weight for each component, so we plot the maximum mixture weight for each component across all data points in order to judge if the component exerts any influence on the posterior.)
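A sketch of that check (assuming `w` is stored with dims `("chain", "draw", "N", "K")`, following the `dims=["N", "K"]` declaration earlier in the diff, and the `trace` from the sampling cell above):

```python
# Posterior expected weight for every (point, component) pair, then the
# maximum over data points for each component; a component whose largest
# expected weight is near zero never meaningfully influences the fit.
w_post_mean = trace.posterior["w"].mean(("chain", "draw"))  # dims ("N", "K")
largest_w = w_post_mean.max("N")                            # one value per component
```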
@@ -239,28 +247,39 @@ Since only three mixture components have appreciable posterior expected weight f
 Visually, it is reasonable that the LIDAR data has three linear components, so these posterior expected weights seem to have identified the structure of the data well. We now sample from the posterior predictive distribution to better understand the model's performance.
 
 ```{code-cell} ipython3
-PP_SAMPLES = 5000
-
 lidar_pp_x = np.linspace(std_range.min() - 0.05, std_range.max() + 0.05, 100)
 
 with model:
-    pm.set_data({"x": lidar_pp_x[:, np.newaxis]})
-    pp_trace = pm.sample_posterior_predictive(trace, PP_SAMPLES, random_seed=SEED)
+    pm.set_data({"x": lidar_pp_x[:, np.newaxis], "y": np.zeros_like(lidar_pp_x)})
+
+    pm.sample_posterior_predictive(
+        trace, predictions=True, extend_inferencedata=True, random_seed=SEED
+    )
 ```
 
 Below we plot the posterior expected value and the 95% posterior credible interval.
 
+```{code-cell} ipython3
+trace.predictions["obs"].mean(("chain", "draw")).shape
+```
+
 ```{code-cell} ipython3
 fig, ax = plt.subplots(figsize=(8, 6))
 
 ax.scatter(df.std_range, df.std_logratio, color=blue, zorder=10, label=None)
 
-low, high = np.percentile(pp_trace["obs"], [2.5, 97.5], axis=0)
+low, high = np.percentile(az.extract(trace.predictions)["obs"].T, [2.5, 97.5], axis=0)
 ax.fill_between(
     lidar_pp_x, low, high, color="k", alpha=0.35, zorder=5, label="95% posterior credible interval"
 )
 
-ax.plot(lidar_pp_x, pp_trace["obs"].mean(axis=0), c="k", zorder=6, label="Posterior expected value")
+ax.plot(
+    lidar_pp_x,
+    trace.predictions["obs"].mean(("chain", "draw")).values,
+    c="k",
+    zorder=6,
+    label="Posterior expected value",
+)
 
 ax.set_xticklabels([])
 ax.set_xlabel("Standardized range")
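One note on the `az.extract` call in the hunk above (an explanatory sketch; the dim names are ArviZ defaults, and `trace` is the extended InferenceData from the cells above): `extract` stacks the `("chain", "draw")` dims into a single trailing `"sample"` dim, so `"obs"` comes back shaped `(n_points, n_samples)`; the `.T` moves samples onto axis 0, which is the axis `np.percentile(..., axis=0)` reduces over.

```python
pp = az.extract(trace.predictions)["obs"]  # dims: (obs dim, "sample")
low, high = np.percentile(pp.T, [2.5, 97.5], axis=0)  # one interval endpoint per x value
```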
@@ -278,13 +297,21 @@ To learn more about dependent density regression and related models, consult [_B
 
 This example first appeared [here](http://austinrochford.com/posts/2017-01-18-ddp-pymc3.html).
 
-Author: [Austin Rochford](https://github.com/AustinRochford/)
++++
+
+## Authors
+* Authored by Austin Rochford in 2017
+* Updated to PyMC v5 by Christopher Fonnesbeck in September 2024
+
++++
+
+## References
+
+:::{bibliography}
+:filter: docname in docnames
+:::
 
 ```{code-cell} ipython3
 %load_ext watermark
 %watermark -n -u -v -iv -w
 ```
-
-```{code-cell} ipython3
-
-```
