Commit 9f09149

add link to repo
1 parent 8aabf4b commit 9f09149

80 files changed: +6919 -93 lines

slides/raw/13-bonus-selection-bias.qmd

Lines changed: 1 addition & 1 deletion
@@ -1,5 +1,5 @@
 ---
-title: "G-Computation"
+title: "Bonus: Selection bias"
 author: "Malcolm Barrett"
 date: "2021-09-01 (updated: `r Sys.Date()`)"
 format: "kakashi-revealjs"

slides/raw/14-bonus-continuous-pscores.html

Lines changed: 1057 additions & 0 deletions
Large diffs are not rendered by default.

slides/raw/10-continuous-exposures.qmd renamed to slides/raw/14-bonus-continuous-pscores.qmd

Lines changed: 63 additions & 92 deletions
@@ -1,5 +1,5 @@
 ---
-title: "Propensity scores for continuous exposures"
+title: "Continuous exposures with propensity scores"
 author: "Malcolm Barrett"
 format: "kakashi-revealjs"
 date: "2021-09-01 (updated: `r Sys.Date()`)"
@@ -17,10 +17,14 @@ library(tidyverse)
 library(broom)
 library(causaldata)
 library(touringplans)
+library(propensity)

 set.seed(1000)
 ```

+## {background-color="#23373B" .huge .center}
+### *Warning!* Propensity score weights are sensitive to positivity violations for continuous exposures.
+
 ## {background-color="#23373B" .huge .center}
 ### **The story so far**

@@ -34,11 +38,11 @@ set.seed(1000)

 ## Continous exposures {background-color="#23373B"}

-1. Use a model like `lm(x ~ z)` for the propensity score model
-2. Scale weights to probability-like scale using `dnorm(true_value, fitted_value, estimated_sd)`
+1. Use a model like `lm(x ~ z)` for the propensity score model.
+2. Use `wt_ate()` with `.fitted` and `.sigma`; transforms using `dnorm()` to get on probability-like scale.
 3. Apply the weights to the outcome model as normal!

-## Alternative: quantile binning {background-color="#23373B"}
+## Alternative: quantile binning {background-color="#23373B" .small}

 1. Bin the continuous exposure into quantiles and use categorical regression like a multinomial model to calculate probabilities.
 2. Calculate the weights where the propensity score is the probability you fall into the quantile you actually fell into. Same as the binary ATE!
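The quantile-binning alternative is described in this hunk but never shown in code in the deck. A minimal sketch of the idea, using a hypothetical data frame `df` with exposure `x` and confounders `z1` and `z2` (none of these names are from the commit):

```r
library(dplyr)
library(nnet) # multinom() for the categorical propensity model

# 1. Bin the continuous exposure into quintiles
df <- df |>
  mutate(x_quantile = ntile(x, 5))

# 2. Fit a multinomial model for which quantile each observation lands in
quantile_model <- multinom(factor(x_quantile) ~ z1 + z2, data = df)

# 3. The propensity score is the predicted probability of the quantile
#    you actually fell into; the weight is its inverse, as in the binary ATE
p_all <- predict(quantile_model, type = "probs")
df <- df |>
  mutate(
    p_actual = p_all[cbind(row_number(), x_quantile)],
    wts = 1 / p_actual
  )
```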
@@ -54,17 +58,18 @@ model <- lm(
 )
 ```

-## 2. Calculate the weights with `dnorm()`
+## 2. Calculate the weights with `wt_ate()`

 ```{r}
 #| eval: false
-#| code-line-numbers: "|3-7"
+#| code-line-numbers: "|3-8"
 model |>
   augment(data = df) |>
-  mutate(denominator = dnorm(
+  mutate(wts = wt_ate(
     exposure,
-    mean = .fitted,
-    sd = mean(.sigma, na.rm = TRUE)
+    .fitted,
+    # .sigma is from augment()
+    .sigma = .sigma
   ))
 ```
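For readers tracking the change: the deleted `dnorm()` call computed only the denominator density; the weight itself was its inverse (the removed plotting code further down takes `wts = 1 / denominator`). A sketch of the full arithmetic that `wt_ate()` now wraps, reusing the `model`, `df`, and `exposure` placeholders from this hunk (the package internals may differ):

```r
library(broom) # augment()
library(dplyr)

model |>
  augment(data = df) |>
  mutate(
    # conditional density of the observed exposure under the model
    denominator = dnorm(
      exposure,
      mean = .fitted,
      sd = mean(.sigma, na.rm = TRUE)
    ),
    # unstabilized inverse probability weight
    wts = 1 / denominator
  )
```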

@@ -79,7 +84,7 @@ nhefs_light_smokers <- nhefs_complete |>

 ```{r}
 #| code-line-numbers: "|1-2|3-6"
-nhefs_denominator_model <- lm(
+nhefs_model <- lm(
   smkintensity82_71 ~ sex + race + age + I(age^2) +
     education + smokeintensity + I(smokeintensity^2) +
     smokeyrs + I(smokeyrs^2) + exercise + active +
@@ -88,25 +93,24 @@ nhefs_denominator_model <- lm(
 )
 ```

-## 2. Calculate the weights with `dnorm()`
+## 2. Calculate the weights with `wt_ate()`

 ```{r}
-#| code-line-numbers: "|3-6"
-nhefs_denominators <- nhefs_denominator_model |>
+#| code-line-numbers: "|3-7"
+nhefs_wts <- nhefs_model |>
   augment(data = nhefs_light_smokers) |>
-  mutate(denominator = dnorm(
+  mutate(wts = wt_ate(
     smkintensity82_71,
     .fitted,
-    mean(.sigma, na.rm = TRUE)
-  )) |>
-  select(id, denominator)
+    .sigma = .sigma
+  ))
 ```


-## 2. Calculate the weights with `dnorm()`
+## 2. Calculate the weights with `wt_ate()`

 ```{r}
-nhefs_denominators
+nhefs_wts
 ```

 ## Do *posted* wait times at 8 am affect *actual* wait times at 9 am?
@@ -190,7 +194,7 @@ dagify(

 ### Fit a model using `lm()` with `avg_spostmin` as the outcome and the confounders identified in the DAG.
 ### Use `augment()` to add model predictions to the data frame
-### In `dnorm()`, use `.fitted` as the mean and the mean of `.sigma` as the SD to calculate the propensity score for the denominator.
+### In `wt_ate()`, calculate the weights using `avg_postmin`, `.fitted`, and `.sigma`

 `r countdown::countdown(minutes = 5)`

@@ -212,7 +216,7 @@ wait_times <- eight |>
 ```

 ```{r}
-denominator_model <- lm(
+post_time_model <- lm(
   avg_spostmin ~
     close + extra_magic_morning +
     weather_wdwhigh + wdw_ticket_season,
@@ -223,22 +227,18 @@ denominator_model <- lm(
 ## *Your Turn 1*

 ```{r}
-denominators <- denominator_model |>
+wait_times_wts <- post_time_model |>
   augment(data = wait_times) |>
-  mutate(
-    denominator = dnorm(
-      avg_spostmin, .fitted, mean(.sigma, na.rm = TRUE)
-    )
-  ) |>
-  select(date, denominator)
+  mutate(wts = wt_ate(
+    avg_spostmin, .fitted, .sigma = .sigma
+  ))
 ```

 ## *Stabilizing extreme weights*

 ```{r}
 #| echo: false
-nhefs_denominators |>
-  mutate(wts = 1 / denominator) |>
+nhefs_wts |>
   ggplot(aes(wts)) +
   geom_density(col = "#E69F00", fill = "#E69F0095") +
   scale_x_log10() +
@@ -248,49 +248,29 @@ nhefs_denominators |>

 ## Stabilizing extreme weights {background-color="#23373B"}

-1. Fit an intercept-only model (e.g. `lm(x ~ 1)`)
-2. Calculate weights from this model
-3. Divide these weights by the propensity score weights
-
-## 1. Fit an intercept-only model
-
-```{r}
-#| code-line-numbers: "|2"
-nhefs_numerator_model <- lm(
-  smkintensity82_71 ~ 1,
-  data = nhefs_light_smokers
-)
-```
+1. Fit an intercept-only model (e.g. `lm(x ~ 1)`) or use mean and SD of `x`
+2. Calculate weights from this model.
+3. Divide these weights by the propensity score weights. `wt_ate(.., stabilize = TRUE)` does this all!

-## 2. Calculate weights from this model
+## Calculate stabilized weights

 ```{r}
-#| code-line-numbers: "|1"
-nhefs_numerators <- nhefs_numerator_model |>
+#| code-line-numbers: "|7"
+nhefs_swts <- nhefs_model |>
   augment(data = nhefs_light_smokers) |>
-  mutate(numerator = dnorm(
+  mutate(swts = wt_ate(
     smkintensity82_71,
-    mean = .fitted,
-    sd = mean(.sigma, na.rm = TRUE))
-  ) |>
-  select(id, numerator)
-```
-
-## 3. Divide these weights by the propensity score weights
-
-```{r}
-#| code-line-numbers: "|4"
-nhefs_light_smokers <- nhefs_light_smokers |>
-  left_join(nhefs_numerators, by = "id") |>
-  left_join(nhefs_denominators, by = "id") |>
-  mutate(swts = numerator / denominator)
+    .fitted,
+    .sigma = .sigma,
+    stabilize = TRUE
+  ))
 ```

 ## Stabilizing extreme weights

 ```{r}
 #| echo: false
-ggplot(nhefs_light_smokers, aes(swts)) +
+ggplot(nhefs_swts, aes(swts)) +
   geom_density(col = "#E69F00", fill = "#E69F0095") +
   scale_x_log10() +
   theme_minimal(base_size = 20) +
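Because `stabilize = TRUE` now replaces three manual steps, here is the equivalent hand calculation, pieced together from the code this hunk deletes. The intercept-only model's `.fitted` is just the mean of the exposure, so the marginal mean and SD stand in for it below (a sketch; `wt_ate()`'s exact internals may differ):

```r
library(broom)
library(dplyr)

nhefs_model |>
  augment(data = nhefs_light_smokers) |>
  mutate(
    # numerator: marginal density of the exposure (the intercept-only model)
    numerator = dnorm(
      smkintensity82_71,
      mean = mean(smkintensity82_71),
      sd = sd(smkintensity82_71)
    ),
    # denominator: conditional density from the propensity score model
    denominator = dnorm(
      smkintensity82_71,
      mean = .fitted,
      sd = mean(.sigma, na.rm = TRUE)
    ),
    # stabilized weight
    swts = numerator / denominator
  )
```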
@@ -299,42 +279,23 @@ ggplot(nhefs_light_smokers, aes(swts)) +

 ## *Your Turn 2*

-### Fit an intercept-only model of posted weight times to use as the numerator model
-### Calculate the numerator weights using `dnorm()` as above.
-### Finally, calculate the stabilized weights, `swts`, using the `numerator` and `denominator` weights
+### Re-fit the above using stabilized weights

-`r countdown::countdown(minutes = 5)`
+`r countdown::countdown(minutes = 3)`

 ## *Your Turn 2*

 ```{r}
-numerator_model <- lm(
-  avg_spostmin ~ 1,
-  data = wait_times
-)
-```
-
----
-
-## Your Turn 2
-
-```{r}
-numerators <- numerator_model |>
+wait_times_swts <- post_time_model |>
   augment(data = wait_times) |>
-  mutate(
-    numerator = dnorm(
-      avg_spostmin, .fitted, mean(.sigma, na.rm = TRUE)
-    )
-  ) |>
-  select(date, numerator)
-
-wait_times_wts <- wait_times |>
-  left_join(numerators, by = "date") |>
-  left_join(denominators, by = "date") |>
-  mutate(swts = numerator / denominator)
+  mutate(swts = wt_ate(
+    avg_spostmin,
+    .fitted,
+    .sigma = .sigma,
+    stabilize = TRUE
+  ))
 ```

-
 ## Fitting the outcome model {background-color="#23373B"}

 1. Use the stabilized weights in the outcome model. Nothing new here!
@@ -346,7 +307,7 @@ wait_times_wts <- wait_times |>
 lm(
   wt82_71 ~ smkintensity82_71,
   weights = swts,
-  data = nhefs_light_smokers
+  data = nhefs_swts
 ) |>
   tidy() |>
   filter(term == "smkintensity82_71") |>
@@ -365,10 +326,20 @@ lm(
 lm(
   avg_sactmin ~ avg_spostmin,
   weights = swts,
-  data = wait_times_wts
+  data = wait_times_swts
 ) |>
   tidy() |>
   filter(term == "avg_spostmin") |>
   mutate(estimate = estimate * 10)
 ```

+
+## Diagnosing issues {background-color="#23373B"}
+
+1. Extreme weights even after stabilization
+2. Bootstrap: non-normal distribution
+3. Bootstrap: estimate different from original model
+
+## More info {background-color="#23373B"}
+
+### https://github.com/LucyMcGowan/writing-positivity-continous-ps
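The new "Diagnosing issues" slide leans on the bootstrap but ships no code. A sketch of one way to run that check with `rsample`, reusing the wait-times pipeline from this file; the `fit_ipw()` helper is an assumption for illustration, not part of the commit:

```r
library(broom)
library(dplyr)
library(ggplot2)
library(propensity)
library(purrr)
library(rsample)

fit_ipw <- function(split) {
  .df <- analysis(split)

  # refit the propensity model and weights on the resample
  ps_model <- lm(
    avg_spostmin ~ close + extra_magic_morning +
      weather_wdwhigh + wdw_ticket_season,
    data = .df
  )
  .df <- ps_model |>
    augment(data = .df) |>
    mutate(swts = wt_ate(
      avg_spostmin, .fitted, .sigma = .sigma, stabilize = TRUE
    ))

  # refit the outcome model and keep the exposure coefficient
  lm(avg_sactmin ~ avg_spostmin, weights = swts, data = .df) |>
    tidy() |>
    filter(term == "avg_spostmin")
}

estimates <- bootstraps(wait_times, times = 1000) |>
  mutate(results = map(splits, fit_ipw)) |>
  pull(results) |>
  bind_rows()

# check: is the distribution roughly normal, and does its center
# sit near the original estimate?
ggplot(estimates, aes(estimate)) + geom_histogram()
```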
Three image files changed (121 KB, 165 KB, 360 KB); previews not rendered.

slides/raw/14-bonus-continuous-pscores_files/libs/clipboard/clipboard.min.js

Lines changed: 7 additions & 0 deletions
Some generated files are not rendered by default.
