exercises/15-bonus-ml-for-causal-exercises.qmd
15 additions, 9 deletions
@@ -179,7 +179,7 @@ ate_gcomp
 
 ## Your Turn 1
 
-1. First, create a character vector `sl_library` that specifies the following algorithms: "SL.glm", "SL.ranger", "SL.xgboost", "SL.gam". Then, Fit a SuperLearner for the exposure model using the `SuperLearner` package. The predictors for this model should be the confounders identified in the DAG: `park_ticket_season`, `park_close`, and `park_temperature_high`. The outcome is `park_extra_magic_morning`.
+1. First, create a character vector `sl_library` that specifies the following algorithms: "SL.glm", "SL.ranger", "SL.gam". Then, fit a SuperLearner for the exposure model using the `SuperLearner` package. The predictors for this model should be the confounders identified in the DAG: `park_ticket_season`, `park_close`, and `park_temperature_high`. The outcome is `park_extra_magic_morning`.
 
 2. Fit a SuperLearner for the outcome model using the `SuperLearner` package. The predictors for this model should be the confounders plus the exposure: `park_extra_magic_morning`, `park_ticket_season`, `park_close`, and `park_temperature_high`. The outcome is `wait_minutes_posted_avg`.
 
 3. Inspect the fitted SuperLearner objects.
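For orientation, a hedged sketch of what steps 1–3 might look like. The object names `exposure_fit` and `outcome_fit` are illustrative, and it assumes the `seven_dwarfs` data from earlier in the exercises, with predictors in a form the chosen learners accept; it is not the official solution.

```r
library(SuperLearner)
library(dplyr)

sl_library <- c("SL.glm", "SL.ranger", "SL.gam")

# Step 1: exposure model (binary exposure, so family = binomial())
exposure_fit <- SuperLearner(
  Y = seven_dwarfs$park_extra_magic_morning,
  X = seven_dwarfs |>
    select(park_ticket_season, park_close, park_temperature_high),
  family = binomial(),
  SL.library = sl_library
)

# Step 2: outcome model (continuous outcome, so family = gaussian())
outcome_fit <- SuperLearner(
  Y = seven_dwarfs$wait_minutes_posted_avg,
  X = seven_dwarfs |>
    select(park_extra_magic_morning, park_ticket_season,
           park_close, park_temperature_high),
  family = gaussian(),
  SL.library = sl_library
)

# Step 3: printing a SuperLearner object shows each learner's
# cross-validated risk and its weight in the ensemble
exposure_fit
outcome_fit
```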
@@ -251,7 +251,6 @@ outcome_rmse
 sl_library_extended <- c(
   "SL.glm",
   "SL.ranger",
-  "SL.xgboost",
   "SL.earth",
   "SL.gam",
   "SL.glm.interaction",
@@ -310,16 +309,23 @@ tidy(ipw_model) |>
 ```{r}
 # G-computation with SuperLearner outcome model
 # Step 1: Create counterfactual datasets
-seven_dwarfs_clone <- seven_dwarfs |>
-  mutate(park_close = as.numeric(park_close))
+# For SuperLearner prediction, we need only the columns used in the model
 
 # Dataset where everyone is treated, `park_extra_magic_morning` = 1
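As a hedged sketch of how the counterfactual step above might continue (it assumes an `outcome_fit` SuperLearner for the outcome model; the object names are illustrative):

```r
library(dplyr)

# Keep only the columns the outcome model was fit on
model_cols <- c("park_extra_magic_morning", "park_ticket_season",
                "park_close", "park_temperature_high")

# Everyone treated vs. everyone untreated
treated <- seven_dwarfs |>
  select(all_of(model_cols)) |>
  mutate(park_extra_magic_morning = 1)
untreated <- treated |>
  mutate(park_extra_magic_morning = 0)

# predict.SuperLearner() returns a list; $pred is the ensemble prediction
pred_1 <- predict(outcome_fit, newdata = treated)$pred
pred_0 <- predict(outcome_fit, newdata = untreated)$pred

# G-computation ATE: mean difference in predicted counterfactual outcomes
mean(pred_1) - mean(pred_0)
```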
@@ … @@
 - In **IPW** and **G-computation**, we estimate the average treatment effect (ATE) using predictions from the exposure and outcome models. But these algorithms optimize for the predictions, not the ATE.
 - In **TMLE**, we adjust the predictions to specifically target the ATE. We change the bias-variance tradeoff to focus on the ATE rather than just minimizing prediction error. This is a debiasing step that also improves the efficiency of the estimate!
-- Targeting is a general technique that can be applied to many problems, not just causal ones
 
 ## Targeted Learning: valid statistical inference
-- In **IPW** and **G-computation**, we can using ML algorithms to make predictions, but we cannot easily get valid confidence intervals. Bootstrapping is often used, but it can be computationally intensive and not always valid.
+- In **IPW** and **G-computation**, we cannot easily get valid confidence intervals with ML. Bootstrapping is often used, but it can be computationally intensive and not always valid.
 - In **TMLE**, we can use the influence curve to get valid confidence intervals. The influence curve is a way to estimate the variance of the TMLE estimate, even when using complex ML algorithms.
 
 ## The TMLE Algorithm {background-color="#23373B"}
 
 1. Start with SuperLearner predictions for the outcome
 2. Calculate the propensity scores using SuperLearner
 3. Create the clever covariate using the propensity scores
+
+## The TMLE Algorithm {background-color="#23373B"}
+
 4. Fit the fluctuation model to learn how much to adjust the outcome predictions
 5. Update the predictions with the targeted adjustment
 6. Calculate the TMLE estimate and standard error using the influence curve
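The six steps above can also be run end to end with the `tmle` package, which wraps SuperLearner for both models. A hedged sketch, assuming the variable names from these exercises; the package's defaults (e.g. bounding of propensity scores) may differ from a by-hand implementation:

```r
library(tmle)
library(dplyr)

tmle_fit <- tmle(
  Y = seven_dwarfs$wait_minutes_posted_avg,   # outcome
  A = seven_dwarfs$park_extra_magic_morning,  # exposure
  W = seven_dwarfs |>
    select(park_ticket_season, park_close, park_temperature_high),
  Q.SL.library = c("SL.glm", "SL.ranger", "SL.gam"),  # outcome learners
  g.SL.library = c("SL.glm", "SL.ranger", "SL.gam")   # propensity learners
)

# ATE point estimate and influence-curve-based 95% confidence interval
tmle_fit$estimates$ATE$psi
tmle_fit$estimates$ATE$CI
```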