r-package/grf/vignettes/diagnostics.Rmd
Lines changed: 5 additions & 16 deletions
@@ -64,27 +64,16 @@ The forest summary function [test_calibration](https://grf-labs.github.io/grf/re
test_calibration(cf)
```
-Another heuristic for testing for heterogeneity involves grouping observations into a high and low CATE group, then estimating average treatment effects in each subgroup. The function [average_treatment_effect](https://grf-labs.github.io/grf/reference/average_treatment_effect.html) estimates ATEs using a double robust approach:
+This exercise and function is motivated by earlier developments in the econometrics literature. A more intuitive exercise is to look at subgroup ATEs where the subgroups are formed according to low or high CATE predictions (Athey & Wager, 2019).
+While this approach may give some qualitative insight into heterogeneity, the grouping is naive, because the doubly robust scores used to determine subgroups are not independent of the scores used to estimate those group ATEs.
For another way to assess heterogeneity, see the function [rank_average_treatment_effect](https://grf-labs.github.io/grf/reference/rank_average_treatment_effect.html) and the accompanying [vignette](https://grf-labs.github.io/grf/articles/rate.html).
+The [RATE](https://grf-labs.github.io/grf/reference/rank_average_treatment_effect.html) function automates this exercise over all possible subgroups using the quantiles of the CATE predictions. If we use separate data to fit CATE models and estimate RATE metrics, we obtain a test statistic with expectation zero under no heterogeneity, which can be used to construct confidence intervals for the presence of treatment effect heterogeneity. For more details on this preferred approach, please see [this vignette](https://grf-labs.github.io/grf/articles/rate.html).
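The held-out workflow described in the added line can be sketched roughly as follows. This is a minimal illustration, not the vignette's own code: the simulated data, the 50/50 split, and the names `cf.train`, `cf.eval`, and `priority` are assumptions made for the example.

```r
library(grf)

# Simulated data (hypothetical): the treatment effect depends on the first covariate only.
set.seed(42)
n <- 2000
d <- 5
X <- matrix(rnorm(n * d), n, d)
W <- rbinom(n, 1, 0.5)
tau <- pmax(X[, 1], 0)
Y <- tau * W + X[, 2] + rnorm(n)

# Fit the CATE model on one half of the data.
train <- sample(n, n / 2)
cf.train <- causal_forest(X[train, ], Y[train], W[train])

# Rank the held-out units by their predicted CATE.
priority <- predict(cf.train, X[-train, ])$predictions

# Estimate the RATE on the held-out half with a separate evaluation forest.
cf.eval <- causal_forest(X[-train, ], Y[-train], W[-train])
rate <- rank_average_treatment_effect(cf.eval, priority)

# Approximate 95% confidence interval; an interval excluding zero suggests heterogeneity.
rate$estimate + c(-1.96, 1.96) * rate$std.err
```

Because the priorities come from data that the evaluation forest never sees, the ranking is independent of the scores used to estimate the metric, which is the property the naive high/low grouping lacks.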
Athey et al. (2017) suggest a bias measure to gauge how much work the propensity and outcome models have to do to get an unbiased estimate, relative to looking at a simple difference-in-means: $bias(x) = (e(x) - p) \times (p (\mu(0, x) - \mu_0) + (1 - p) (\mu(1, x) - \mu_1))$.
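To make the formula concrete, here is a small base R sketch that evaluates it on made-up nuisance functions. Everything in it is an assumption for illustration: the functional forms of $e(x)$, $\mu(0, x)$ and $\mu(1, x)$, the choice of $p$ as the average propensity, and the choice of $\mu_0$ and $\mu_1$ as sample averages of the conditional responses. The helper `bias_measure` is hypothetical and not part of grf.

```r
# Hypothetical nuisance functions on a single covariate x.
set.seed(1)
n <- 1000
x <- runif(n)
e.x <- 0.25 + 0.5 * x   # assumed propensity e(x)
mu0.x <- 2 * x          # assumed control response mu(0, x)
mu1.x <- 2 * x + 1      # assumed treated response mu(1, x)

# Evaluate bias(x) = (e(x) - p) * (p * (mu(0, x) - mu_0) + (1 - p) * (mu(1, x) - mu_1)),
# taking mu_0 and mu_1 to be the sample means of the conditional responses (an assumption).
bias_measure <- function(e.x, mu0.x, mu1.x, p) {
  mu0 <- mean(mu0.x)
  mu1 <- mean(mu1.x)
  (e.x - p) * (p * (mu0.x - mu0) + (1 - p) * (mu1.x - mu1))
}

p <- mean(e.x)  # assumed overall treatment probability
bias <- bias_measure(e.x, mu0.x, mu1.x, p)
summary(bias)

# With a constant propensity e(x) = p, as in a randomized trial, the first factor vanishes.
bias.rct <- bias_measure(rep(p, n), mu0.x, mu1.x, p)
all(bias.rct == 0)
```

In this sketch the measure is large only where the propensity and the baseline response both deviate from their averages in the same units, which matches the reading of the formula as the amount of confounding the models must correct.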