examples/causal_inference/moderation_analysis.myst.md
This notebook covers Bayesian [moderation analysis](https://en.wikipedia.org/wiki/Moderation_(statistics)).
This is not intended as a one-stop solution to a wide variety of data analysis problems; rather, it is an educational exposition showing how moderation analysis works and how to conduct Bayesian parameter estimation in PyMC. This notebook focusses on observational methods and does not explore experimental interventions.
Moderation analysis has been framed in a variety of ways:

* Statistical approaches: It is entirely possible to approach moderation analysis from a purely statistical perspective. In this approach we might build a linear model (for example) whose aim is purely to _describe_ the data we have while making no claims about causality.
* Path analysis: This approach asserts that the variables in the model are causally related and is exemplified in {cite:t}`hayes2017introduction`, for example. This approach cannot be considered as 'fully causal' as it lacks a variety of the concepts present in the causal approach.
* Causal inference: This approach builds upon the path analysis approach in that there is a claim of causal relationships between the variables. But it goes further in that there are additional causal concepts which can be brought to bear.
+++
:::{attention}
Note that moderation is sometimes mixed up with [mediation analysis](https://en.wikipedia.org/wiki/Mediation_(statistics)). Mediation analysis is appropriate when we believe the effect of a predictor variable upon an outcome variable is (partially, or fully) mediated through a 3rd mediating variable. Readers are referred to the textbook by {cite:t}`hayes2017introduction` as a comprehensive (albeit Frequentist) guide to moderation and related models as well as the PyMC example {ref}`mediation_analysis`.
:::
```{code-cell} ipython3
We can see that the mean $y$ is simply a multiple linear regression with an interaction term.
We can get some insight into why this is the case by thinking about this as a multiple linear regression with $x$ and $m$ as predictor variables, but where the value of $m$ influences the relationship between $x$ and $y$. This is achieved by making the regression coefficient for $x$ a function of $m$:
$$
\mathbb{E}(y) = \beta_0 + f(m) \cdot x + \beta_3 \cdot m
$$
and if we define that as a linear function, $f(m) = \beta_1 + \beta_2 \cdot m$, we get
$$
\mathbb{E}(y) = \beta_0 + (\beta_1 + \beta_2 \cdot m) \cdot x + \beta_3 \cdot m
$$
which multiplies out to
$$
\mathbb{E}(y) = \beta_0 + \beta_1 \cdot x + \beta_2 \cdot x \cdot m + \beta_3 \cdot m
$$
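The expansion above is easy to check numerically. Here is a quick sketch (the coefficient values and data are arbitrary, chosen only to verify the algebra):

```python
import numpy as np

rng = np.random.default_rng(42)
# arbitrary example coefficients, not estimates from the notebook
b0, b1, b2, b3 = 1.0, 2.0, -0.5, 0.3
x = rng.normal(size=100)
m = rng.normal(size=100)

# varying-slope form: f(m) = b1 + b2 * m multiplies x
varying_slope = b0 + (b1 + b2 * m) * x + b3 * m
# expanded form with an explicit interaction term
expanded = b0 + b1 * x + b2 * x * m + b3 * m

# the two forms agree elementwise
assert np.allclose(varying_slope, expanded)
```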
:::{note}
We can use $f(m) = \beta_1 + \beta_2 \cdot m$ later to visualise the moderation effect in a so-called spotlight graph.
:::

$$
\begin{aligned}
\mu &= \beta_0 + \beta_1 \cdot x + \beta_2 \cdot x \cdot m + \beta_3 \cdot m \\
y &\sim \mathrm{Normal}(\mu, \sigma^2)
\end{aligned}
$$
```{code-cell} ipython3
def model_factory(x, m, y):
    with pm.Model() as model:
        x = pm.Data("x", x)
        m = pm.Data("m", m)
        # priors
        β = pm.Normal("β", mu=0, sigma=10, size=4)
        σ = pm.HalfCauchy("σ", 1)
        # likelihood
        pm.Normal("y", mu=β[0] + (β[1] * x) + (β[2] * x * m) + (β[3] * m), sigma=σ, observed=y)
    return model
```
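As a sanity check on the model specification (not part of the original notebook), we can simulate data with a known interaction and confirm that an ordinary least-squares fit on the same design matrix recovers the coefficients. All numbers below are made up for the demonstration:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
beta = np.array([1.0, 2.0, -1.5, 0.5])  # hypothetical true values of b0..b3
sigma = 0.3

x = rng.normal(size=n)
m = rng.normal(size=n)
mu = beta[0] + beta[1] * x + beta[2] * x * m + beta[3] * m
y = rng.normal(mu, sigma)

# design matrix with columns [1, x, x*m, m], matching the model above
X = np.column_stack([np.ones(n), x, x * m, m])
beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
```

With `n = 500` and modest noise, `beta_hat` lands close to the generating coefficients, which is a useful smoke test before fitting the Bayesian model.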
### Spotlight graph
We can also visualise the moderation effect by plotting $\beta_1 + \beta_2 \cdot m$ as a function of $m$. This is known as a spotlight graph; see {cite:t}`spiller2013spotlights` and {cite:t}`mcclelland2017multicollinearity`.
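The quantities behind such a graph can be sketched as follows. The posterior draws here are simulated stand-ins (hypothetical values, not the notebook's `result` object), just to show the shape of the computation:

```python
import numpy as np

rng = np.random.default_rng(1)
# stand-ins for posterior draws of beta1 and beta2 (hypothetical values)
beta1_draws = rng.normal(2.0, 0.1, size=1000)
beta2_draws = rng.normal(-1.5, 0.1, size=1000)

# evaluate the slope of y on x, f(m) = beta1 + beta2 * m, over a grid of m
m_grid = np.linspace(-2, 2, 50)
slope = beta1_draws[:, None] + beta2_draws[:, None] * m_grid[None, :]

# posterior mean and 95% credible band of the slope at each m
slope_mean = slope.mean(axis=0)
lower, upper = np.quantile(slope, [0.025, 0.975], axis=0)
```

Plotting `slope_mean` with the `lower`/`upper` band against `m_grid` gives the spotlight graph: it shows for which values of the moderator the effect of $x$ on $y$ is credibly positive, negative, or near zero.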
```{code-cell} ipython3
fig, ax = plt.subplots(1, 2, figsize=(10, 5))
plot_moderation_effect(result, m, m_quantiles, ax[0])