---
title: "Causal inference is not just a statistics problem"
author: "Lucy D'Agostino McGowan"
institute: "Wake Forest University"
subtitle: "2023-04-12 (updated: `r Sys.Date()`)"
format: kakashi-revealjs
knitr:
  opts_chunk:
    eval: false
fig-cap-location: bottom
---

```{r}
#| include: false
options(
  tibble.max_extra_cols = 7,
  tibble.width = 60
)
```

# Causal Inference is not a statistics problem {background-color="#23373B"}

# Causal Inference is not *just* a statistics problem {background-color="#23373B"}

## *The problem*


### We have measured variables; what should we adjust for?

. . .

**exposure** | **outcome** | **covariate**
---|---|---
0.49 | 1.71 | 2.24
0.07 | 0.68 | 0.92
0.40 | -1.60 | -0.10
. | . | .
. | . | .
. | . | .
0.55 | -1.73 | -2.34

## *A bit more info*

:::: columns

::: column
```{r}
#| echo: false
#| eval: true
#| message: false
#| warning: false
library(quartets)
library(tidyverse)
ggplot(causal_confounding, aes(x = exposure, y = outcome)) +
  geom_point() +
  geom_smooth(method = "lm", formula = "y ~ x")
```

**A one-unit increase in the exposure yields an average one-unit increase in the outcome**
:::

::: column

```{r}
#| echo: true
cor(exposure, covariate)
```

```{r}
#| echo: false
#| eval: true
cor(causal_confounding$exposure, causal_confounding$covariate) |>
  round(digits = 2)
```

**The exposure and measured factor are positively correlated**
:::

::::

##

:::: columns

::: column


:::

::: column

### To adjust or not to adjust? That is the question.
:::
::::

## *Causal Quartet*

:::: columns

::: column

{width=75%}
**Collider**
{width=75%}
**Confounder**
:::

::: column
{width=75%}
**Mediator**

{width=75%}
**M-bias**

:::

::::

##

## *Your turn*

::: small
::: nonincremental
* Install the `quartets` package: `install.packages("quartets")`
* For each of the following 4 datasets, look at the correlation between `exposure` and `covariate`: `causal_collider`, `causal_confounding`, `causal_mediator`, `causal_m_bias`
* For each of the above 4 datasets, create a scatterplot of the relationship between `exposure` and `outcome`
* For each of the above 4 datasets, fit a linear model to examine the relationship between `exposure` and `outcome`
:::
:::

```{r}
#| echo: false
#| eval: true
countdown::countdown(10)
```

| 140 | +## *Relationship between exposure and outcome* |
| 141 | + |
| 142 | +```{r} |
| 143 | +#| echo: false |
| 144 | +#| eval: true |
| 145 | +
|
| 146 | +ggplot(causal_quartet, aes(x = exposure, y = outcome)) + |
| 147 | + geom_point() + |
| 148 | + geom_smooth(method = "lm", formula = "y ~ x") + |
| 149 | + facet_wrap(~ dataset) |
| 150 | +``` |
| 151 | + |
| 152 | +## *Relationship between exposure and covariate* |
| 153 | + |
| 154 | +```{r} |
| 155 | +#| eval: true |
| 156 | +causal_quartet |> |
| 157 | + group_by(dataset) |> |
| 158 | + summarise(cor(exposure, covariate)) |
| 159 | +``` |
| 160 | + |
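The model-fitting step of the exercise can be sketched like this (one possible approach using base R; any equivalent `lm()` call per dataset works):

```{r}
# Fit an unadjusted model of outcome on exposure within each
# of the four causal quartet datasets, then extract the
# exposure coefficient from each fit
library(quartets)

fits <- lapply(
  split(causal_quartet, causal_quartet$dataset),
  function(d) lm(outcome ~ exposure, data = d)
)
sapply(fits, function(fit) round(coef(fit)["exposure"], digits = 2))
```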
## Correct effects



::: tiny
D'Agostino McGowan L, Gerke T, Barrett M (2023).
Causal inference is not a statistical problem. Preprint arXiv:2304.02683v1.
:::

## Observed effects



::: tiny
D'Agostino McGowan L, Gerke T, Barrett M (2023).
Causal inference is not a statistical problem. Preprint arXiv:2304.02683v1.
:::

## The solution

:::: columns

::: column

{width=75%}
{width=75%}
:::

::: column
{width=75%}
{width=75%}

:::

::::

## The *partial* solution

::: small
```{r}
#| echo: false
#| eval: true
causal_collider_time
```

:::

. . .

*Time-varying data*

## Time-varying DAG

{width=80%}

**True causal effect**: 1
**Estimated causal effect**: 0.55

## Time-varying DAG



. . .

**True causal effect**: 1
**Estimated causal effect**: 1

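A minimal sketch of the model behind this corrected estimate, assuming the time-varying collider data shown earlier (adjusting for the covariate measured at baseline, before the exposure):

```{r}
# Regress the follow-up outcome on the baseline exposure,
# adjusting for the baseline (pre-exposure) covariate
library(quartets)

lm(
  outcome_followup ~ exposure_baseline + covariate_baseline,
  data = causal_collider_time
)
```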
# `outcome_followup ~ exposure_baseline + covariate_baseline` {.tiny background-color="#23373B"}

## The *partial* solution



::: tiny
D'Agostino McGowan L, Gerke T, Barrett M (2023).
Causal inference is not a statistical problem. Preprint arXiv:2304.02683v1.
:::

## *On M-bias*

::: small
* The associations between Z and the unmeasured confounders need to be quite strong for M-bias to matter (Liu et al. 2012)
* “To obsess about the possibility of [M-bias] generates bad practical advice in all but the most unusual circumstances” (Rubin 2009)
* There are (almost) no true zeros (Gelman 2011)
* Asymptotic theory shows that the induction of M-bias is quite sensitive to deviations from the exact M-structure (Ding and Miratrix 2014)
:::

## *Your turn*

::: nonincremental
* For each of the following 4 datasets, fit a linear model examining the relationship between `outcome_followup` and `exposure_baseline`, adjusting for `covariate_baseline`: `causal_collider_time`, `causal_confounding_time`, `causal_mediator_time`, `causal_m_bias_time`
:::

```{r}
#| echo: false
#| eval: true
countdown::countdown(10)
```

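One possible solution sketch for this exercise, applying the same adjusted model across the four time-varying datasets (names as listed above):

```{r}
# Fit the baseline-adjusted model in each time-varying dataset
# and pull out the exposure coefficient
library(quartets)

datasets <- list(
  causal_collider_time = causal_collider_time,
  causal_confounding_time = causal_confounding_time,
  causal_mediator_time = causal_mediator_time,
  causal_m_bias_time = causal_m_bias_time
)

sapply(datasets, function(d) {
  fit <- lm(outcome_followup ~ exposure_baseline + covariate_baseline, data = d)
  round(coef(fit)["exposure_baseline"], digits = 2)
})
```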