Skip to content

Commit 9a7729c

Browse files
committed
Add Wilkinson notation for each model
1 parent 1da30fb commit 9a7729c

File tree

2 files changed

+112
-0
lines changed

2 files changed

+112
-0
lines changed

examples/generalized_linear_models/GLM-simpsons-paradox.ipynb

Lines changed: 63 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -368,6 +368,18 @@
368368
"$$"
369369
]
370370
},
371+
{
372+
"cell_type": "markdown",
373+
"metadata": {},
374+
"source": [
375+
":::{info}\n",
376+
"We can also express Model 1 in Wilkinson notation as `y ~ 1 + x` which is equivalent to `y ~ x` as the intercept is included by default.\n",
377+
"\n",
378+
"* The `1` term corresponds to the intercept term $\\beta_0$.\n",
379+
"* The `x` term corresponds to the slope term $\\beta_1$.\n",
380+
":::"
381+
]
382+
},
371383
{
372384
"cell_type": "markdown",
373385
"metadata": {},
@@ -918,6 +930,18 @@
918930
"Where $g_i$ is the group index for observation $i$. So the parameters $\\beta_0$ and $\\beta_1$ are now length $g$ vectors, not scalars. And the $[g_i]$ acts as an index to look up the group for the $i^\\text{th}$ observation."
919931
]
920932
},
933+
{
934+
"cell_type": "markdown",
935+
"metadata": {},
936+
"source": [
937+
":::{info}\n",
938+
"We can also express this Model 2 in Wilkinson notation as `y ~ g + x:g`.\n",
939+
"\n",
940+
"* The `g` term captures the group specific intercept $\\beta_0[g_i]$ parameters.\n",
941+
"* The `x:g` term captures group specific slope $\\beta_1[g_i]$ parameters.\n",
942+
":::"
943+
]
944+
},
921945
{
922946
"cell_type": "markdown",
923947
"metadata": {},
@@ -1423,6 +1447,21 @@
14231447
"This model could also be called a partial pooling model. "
14241448
]
14251449
},
1450+
{
1451+
"cell_type": "markdown",
1452+
"metadata": {},
1453+
"source": [
1454+
":::{info}\n",
1455+
"We can also express this Model 3 in Wilkinson notation as `1 + x + (1 + x | g)`.\n",
1456+
"\n",
1457+
"* The `1` captures the global intercept, $\\mathrm{Normal}(p_{0\\mu}, p_{0\\sigma})$.\n",
1458+
"* The `x` captures the global slope, $\\mathrm{Normal}(p_{1\\mu}, p_{1\\sigma})$.\n",
1459+
"* The `(1 + x | g)` term captures group specific random effects for the intercept and slope.\n",
1460+
" * `1 | g` captures the group specific intercept $\\vec{\\beta_0}[g_i]$ parameters.\n",
1461+
" * `x | g` captures the group specific slope $\\vec{\\beta_1}[g_i]$ parameters.\n",
1462+
":::"
1463+
]
1464+
},
14261465
{
14271466
"cell_type": "markdown",
14281467
"metadata": {},
@@ -1478,6 +1517,30 @@
14781517
" pm.Normal(\"y\", mu=μ, sigma=sigma, observed=data.y, dims=\"obs_id\")"
14791518
]
14801519
},
1520+
{
1521+
"cell_type": "code",
1522+
"execution_count": null,
1523+
"metadata": {},
1524+
"outputs": [],
1525+
"source": [
1526+
"with pm.Model(coords=coords) as model3:\n",
1527+
" # Define priors\n",
1528+
" intercept_mu = pm.Normal(\"intercept_mu\", 0, 1)\n",
1529+
" slope_mu = pm.Normal(\"slope_mu\", 0, 1)\n",
1530+
" intercept_sigma = pm.Gamma(\"intercept_sigma\", 2, 2)\n",
1531+
" slope_sigma = pm.Gamma(\"slope_sigma\", 2, 2)\n",
1532+
" sigma = pm.Gamma(\"sigma\", 2, 2)\n",
1533+
" β0 = pm.Normal(\"β0\", intercept_mu, intercept_sigma, dims=\"group\")\n",
1534+
" β1 = pm.Normal(\"β1\", slope_mu, slope_sigma, dims=\"group\")\n",
1535+
" # Data\n",
1536+
" x = pm.Data(\"x\", data.x, dims=\"obs_id\")\n",
1537+
" g = pm.Data(\"g\", data.group_idx, dims=\"obs_id\")\n",
1538+
" # Linear model\n",
1539+
" μ = pm.Deterministic(\"μ\", β0[g] + β1[g] * x, dims=\"obs_id\")\n",
1540+
" # Define likelihood\n",
1541+
" pm.Normal(\"y\", mu=μ, sigma=sigma, observed=data.y, dims=\"obs_id\")"
1542+
]
1543+
},
14811544
{
14821545
"cell_type": "markdown",
14831546
"metadata": {},

examples/generalized_linear_models/GLM-simpsons-paradox.myst.md

Lines changed: 49 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -119,6 +119,15 @@ $$
119119

120120
+++
121121

122+
:::{info}
123+
We can also express Model 1 in Wilkinson notation as `y ~ 1 + x` which is equivalent to `y ~ x` as the intercept is included by default.
124+
125+
* The `1` term corresponds to the intercept term $\beta_0$.
126+
* The `x` term corresponds to the slope term $\beta_1$.
127+
:::
128+
129+
+++
130+
122131
### Build model
123132

124133
```{code-cell} ipython3
@@ -274,6 +283,15 @@ Where $g_i$ is the group index for observation $i$. So the parameters $\beta_0$
274283

275284
+++
276285

286+
:::{info}
287+
We can also express this Model 2 in Wilkinson notation as `y ~ g + x:g`.
288+
289+
* The `g` term captures the group specific intercept $\beta_0[g_i]$ parameters.
290+
* The `x:g` term captures group specific slope $\beta_1[g_i]$ parameters.
291+
:::
292+
293+
+++
294+
277295
### Build model
278296

279297
```{code-cell} ipython3
@@ -432,6 +450,18 @@ This model could also be called a partial pooling model.
432450

433451
+++
434452

453+
:::{info}
454+
We can also express this Model 3 in Wilkinson notation as `1 + x + (1 + x | g)`.
455+
456+
* The `1` captures the global intercept, $\mathrm{Normal}(p_{0\mu}, p_{0\sigma})$.
457+
* The `x` captures the global slope, $\mathrm{Normal}(p_{1\mu}, p_{1\sigma})$.
458+
* The `(1 + x | g)` term captures group specific random effects for the intercept and slope.
459+
* `1 | g` captures the group specific intercept $\vec{\beta_0}[g_i]$ parameters.
460+
* `x | g` captures the group specific slope $\vec{\beta_1}[g_i]$ parameters.
461+
:::
462+
463+
+++
464+
435465
:::{note}
436466
The hierarchical model we are considering contains a simplification in that the population level slope and intercept are assumed to be independent. It is possible to relax this assumption and model any correlation between these parameters by using a multivariate normal distribution.
437467

@@ -474,6 +504,25 @@ with pm.Model(coords=coords) as model3:
474504
pm.Normal("y", mu=μ, sigma=sigma, observed=data.y, dims="obs_id")
475505
```
476506

507+
```{code-cell} ipython3
508+
with pm.Model(coords=coords) as model3:
509+
# Define priors
510+
intercept_mu = pm.Normal("intercept_mu", 0, 1)
511+
slope_mu = pm.Normal("slope_mu", 0, 1)
512+
intercept_sigma = pm.Gamma("intercept_sigma", 2, 2)
513+
slope_sigma = pm.Gamma("slope_sigma", 2, 2)
514+
sigma = pm.Gamma("sigma", 2, 2)
515+
β0 = pm.Normal("β0", intercept_mu, intercept_sigma, dims="group")
516+
β1 = pm.Normal("β1", slope_mu, slope_sigma, dims="group")
517+
# Data
518+
x = pm.Data("x", data.x, dims="obs_id")
519+
g = pm.Data("g", data.group_idx, dims="obs_id")
520+
# Linear model
521+
μ = pm.Deterministic("μ", β0[g] + β1[g] * x, dims="obs_id")
522+
# Define likelihood
523+
pm.Normal("y", mu=μ, sigma=sigma, observed=data.y, dims="obs_id")
524+
```
525+
477526
Plotting the DAG now makes it clear that the group-level intercept and slope parameters are drawn from a population level distributions. That is, we have hyper-priors for the slopes and intercept parameters. This particular model does not have a hyper-prior for the measurement error - this is just left as one parameter per group, as in the previous model.
478527

479528
```{code-cell} ipython3

0 commit comments

Comments
 (0)