|
22 | 22 | #' weights have been stabilized, trimmed, or truncated. |
23 | 23 | #' |
24 | 24 | #' @details |
| 25 | +#' ## Theoretical Background |
| 26 | +#' |
| 27 | +#' Propensity score weighting is a method for estimating causal effects by |
| 28 | +#' creating a pseudo-population where the exposure is independent of measured |
| 29 | +#' confounders. The propensity score, \eqn{e(X)}, is the probability of receiving |
| 30 | +#' treatment given observed covariates \eqn{X}. By weighting observations inversely |
| 31 | +#' proportional to their propensity scores, we can balance the distribution of |
| 32 | +#' covariates between treatment groups. Other weights allow for different target populations. |
| 33 | +#' |
| 34 | +#' ## Mathematical Formulas |
| 35 | +#' |
| 36 | +#' ### Binary Exposures |
| 37 | +#' |
| 38 | +#' For binary treatments (\eqn{A = 0} or \eqn{1}), the weights are: |
| 39 | +#' |
| 40 | +#' - **ATE**: \eqn{w = \frac{A}{e(X)} + \frac{1-A}{1-e(X)}} |
| 41 | +#' - **ATT**: \eqn{w = A + \frac{(1-A) \cdot e(X)}{1-e(X)}} |
| 42 | +#' - **ATU**: \eqn{w = \frac{A \cdot (1-e(X))}{e(X)} + (1-A)} |
| 43 | +#' - **ATM**: \eqn{w = \frac{\min(e(X), 1-e(X))}{A \cdot e(X) + (1-A) \cdot (1-e(X))}} |
| 44 | +#' - **ATO**: \eqn{w = A \cdot (1-e(X)) + (1-A) \cdot e(X)} |
| 45 | +#' - **Entropy**: \eqn{w = \frac{h(e(X))}{A \cdot e(X) + (1-A) \cdot (1-e(X))}}, where \eqn{h(e) = -[e \cdot \log(e) + (1-e) \cdot \log(1-e)]} |
| 46 | +#' |
| 47 | +#' ### Continuous Exposures |
| 48 | +#' |
| 49 | +#' For continuous treatments, weights use the density ratio: |
| 50 | +#' \eqn{w = \frac{f_A(A)}{f_{A|X}(A|X)}}, where \eqn{f_A} is the marginal density of \eqn{A} |
| 51 | +#' and \eqn{f_{A|X}} is the conditional density given \eqn{X}. |
| 52 | +#' |
25 | 53 | #' ## Exposure Types |
26 | 54 | #' |
27 | | -#' The functions support different types of exposures: |
| 55 | +#' The functions support different types of exposures: |
28 | 56 | #' |
29 | 57 | #' - **`binary`**: For dichotomous treatments (e.g. 0/1). |
30 | 58 | #' - **`continuous`**: For numeric exposures. Here, weights are calculated via the normal density using |
31 | 59 | #' `dnorm()`. |
32 | 60 | #' - **`categorical`**: Currently not supported (an error will be raised). |
33 | 61 | #' - **`auto`**: Automatically detects the exposure type based on `.exposure`. |
34 | 62 | #' |
35 | | -#' ## Stabilization |
| 63 | +#' ## Stabilization |
| 64 | +#' |
| 65 | +#' For ATE weights, stabilization can improve the performance of the estimator |
| 66 | +#' by reducing variance. When `stabilize` is `TRUE` and no |
| 67 | +#' `stabilization_score` is provided, the weights are multiplied by the mean |
| 68 | +#' of `.exposure`. Alternatively, if a `stabilization_score` is provided, it |
| 69 | +#' is used as the multiplier. Stabilized weights have the form: |
| 70 | +#' \eqn{w_s = f_A(A) \times w}, where \eqn{f_A(A)} is the marginal probability or density. |
36 | 71 | #' |
37 | | -#' For ATE weights, stabilization can improve the performance of the estimator |
38 | | -#' by reducing variance. When `stabilize` is `TRUE` and no |
39 | | -#' `stabilization_score` is provided, the weights are multiplied by the mean |
40 | | -#' of `.exposure`. Alternatively, if a `stabilization_score` is provided, it |
41 | | -#' is used as the multiplier. |
| 72 | +#' ## Weight Properties and Diagnostics |
42 | 73 | #' |
43 | | -#' ## Trimmed and Truncated Weights |
| 74 | +#' Extreme weights can indicate: |
| 75 | +#' - Positivity violations (propensity scores near 0 or 1) |
| 76 | +#' - Poor model specification |
| 77 | +#' - Lack of overlap between treatment groups |
| 78 | +#' |
| 79 | +#' See the halfmoon package for tools to diagnose and visualize weights. |
| 80 | +#' |
| 81 | +#' Extreme weights can be addressed in several ways. One is to change the target population |
| 82 | +#' through trimming, truncation, or an alternative estimand (ATM, ATO, entropy). |
| 83 | +#' Another is stabilization, which reduces the variance of the weights. |
44 | 84 | #' |
45 | | -#' In addition to the standard weight functions, versions exist for trimmed |
46 | | -#' and truncated propensity score weights created by [ps_trim()], |
47 | | -#' [ps_trunc()], and [ps_refit()]. These variants calculate the weights using |
48 | | -#' modified propensity scores (trimmed or truncated) and update the estimand |
49 | | -#' attribute accordingly. |
| 85 | +#' ## Trimmed and Truncated Weights |
50 | 86 | #' |
51 | | -#' The main functions (`wt_ate`, `wt_att`, `wt_atu`, `wt_atm`, and `wt_ato`) |
52 | | -#' dispatch on the class of `.propensity`. For binary exposures, the weights |
53 | | -#' are computed using inverse probability formulas. For continuous exposures |
54 | | -#' (supported only for ATE), weights are computed as the inverse of the |
55 | | -#' density function evaluated at the observed exposure. |
| 87 | +#' In addition to the standard weight functions, versions exist for trimmed |
| 88 | +#' and truncated propensity score weights created by [ps_trim()], |
| 89 | +#' [ps_trunc()], and [ps_refit()]. These variants calculate the weights using |
| 90 | +#' modified propensity scores (trimmed or truncated) and update the estimand |
| 91 | +#' attribute accordingly. |
56 | 92 | #' |
57 | 93 | #' @param .propensity Either a numeric vector of predicted probabilities or a |
58 | 94 | #' `data.frame` where each column corresponds to a level of the exposure. |
|
83 | 119 | #' - **truncated**: A logical flag indicating if the weights are based on truncated propensity scores. |
84 | 120 | #' |
85 | 121 | #' @examples |
86 | | -#' ## ATE Weights with a Binary Exposure |
| 122 | +#' ## Basic Usage with Binary Exposures |
| 123 | +#' |
| 124 | +#' # Simulate a simple dataset |
| 125 | +#' set.seed(123) |
| 126 | +#' n <- 100 |
| 127 | +#' propensity_scores <- runif(n, 0.1, 0.9) |
| 128 | +#' treatment <- rbinom(n, 1, propensity_scores) |
| 129 | +#' |
| 130 | +#' # Calculate different weight types |
| 131 | +#' weights_ate <- wt_ate(propensity_scores, treatment) |
| 132 | +#' weights_att <- wt_att(propensity_scores, treatment) |
| 133 | +#' weights_atu <- wt_atu(propensity_scores, treatment) |
| 134 | +#' weights_atm <- wt_atm(propensity_scores, treatment) |
| 135 | +#' weights_ato <- wt_ato(propensity_scores, treatment) |
| 136 | +#' weights_entropy <- wt_entropy(propensity_scores, treatment) |
| 137 | +#' |
| 138 | +#' # Compare weight distributions |
| 139 | +#' summary(weights_ate) |
| 140 | +#' summary(weights_ato) # Often more stable than ATE |
| 141 | +#' |
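| | +#' ## Checking the Formulas by Hand |
| | +#' |
| | +#' # A minimal base R sketch of the binary-exposure formulas in Details. |
| | +#' # It is for illustration only and is not the package's internal code; |
| | +#' # the names `e`, `a`, and `w_*` are hypothetical. |
| | +#' e <- c(0.2, 0.7, 0.5, 0.8) # propensity scores e(X) |
| | +#' a <- c(0, 1, 0, 1) # binary exposure A |
| | +#' |
| | +#' w_ate <- a / e + (1 - a) / (1 - e) |
| | +#' w_att <- a + (1 - a) * e / (1 - e) |
| | +#' w_atu <- a * (1 - e) / e + (1 - a) |
| | +#' w_atm <- pmin(e, 1 - e) / (a * e + (1 - a) * (1 - e)) |
| | +#' w_ato <- a * (1 - e) + (1 - a) * e |
| | +#' |
| | +#' # Side-by-side view of the hand-computed weights |
| | +#' cbind(w_ate, w_att, w_atu, w_atm, w_ato) |
| | +#' |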
| 142 | +#' ## Stabilized Weights |
| 143 | +#' |
| 144 | +#' # Stabilization reduces variance |
| 145 | +#' weights_ate_stab <- wt_ate(propensity_scores, treatment, stabilize = TRUE) |
| 146 | +#' |
| 147 | +#' # Compare the spread; sd/mean is unchanged when weights are scaled by a constant |
| 148 | +#' sd(weights_ate) # Unstabilized |
| 149 | +#' sd(weights_ate_stab) # Stabilized (smaller) |
| 150 | +#' |
| 151 | +#' ## Handling Extreme Propensity Scores |
| 152 | +#' |
| 153 | +#' # Create data with positivity violations |
| 154 | +#' ps_extreme <- c(0.01, 0.02, 0.98, 0.99, rep(0.5, 4)) |
| 155 | +#' trt_extreme <- c(0, 0, 1, 1, 0, 1, 0, 1) |
| 156 | +#' |
| 157 | +#' # Standard ATE weights can be extreme |
| 158 | +#' wt_extreme <- wt_ate(ps_extreme, trt_extreme) |
| 159 | +#' # Very large! |
| 160 | +#' max(wt_extreme) |
| 161 | +#' |
| 162 | +#' # ATO weights are bounded |
| 163 | +#' wt_extreme_ato <- wt_ato(ps_extreme, trt_extreme) |
| 164 | +#' # Much more reasonable |
| 165 | +#' max(wt_extreme_ato) |
| 166 | +#' # but they target a different population |
| 167 | +#' estimand(wt_extreme_ato) # "ato" |
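| | +#' |
| | +#' ## Density-Ratio Weights for a Continuous Exposure (by Hand) |
| | +#' |
| | +#' # A minimal base R sketch of the density ratio f_A(A) / f_{A|X}(A|X) from |
| | +#' # Details, assuming both densities are normal. It is not the package's |
| | +#' # internal code; `x`, `dose`, `fit`, and `w_cont` are hypothetical names. |
| | +#' set.seed(1) |
| | +#' x <- rnorm(200) # a single measured confounder |
| | +#' dose <- 0.5 * x + rnorm(200) # continuous exposure that depends on x |
| | +#' fit <- lm(dose ~ x) # exposure model |
| | +#' |
| | +#' # Conditional density of the observed exposure given x |
| | +#' f_cond <- dnorm(dose, mean = fitted(fit), sd = sigma(fit)) |
| | +#' # Marginal density of the observed exposure |
| | +#' f_marg <- dnorm(dose, mean = mean(dose), sd = sd(dose)) |
| | +#' |
| | +#' w_cont <- f_marg / f_cond |
| | +#' summary(w_cont) |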
| 168 | +#' |
| 169 | +#' @references |
| 170 | +#' |
| 171 | +#' For detailed guidance on causal inference in R, see [*Causal Inference in R*](https://www.r-causal.org/) |
| 172 | +#' by Malcolm Barrett, Lucy D'Agostino McGowan, and Travis Gerke. |
| 173 | +#' |
| 174 | +#' ## Foundational Papers |
| 175 | +#' |
| 176 | +#' Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity |
| 177 | +#' score in observational studies for causal effects. *Biometrika*, 70(1), 41-55. |
| 178 | +#' |
| 179 | +#' ## Estimand-Specific Methods |
| 180 | +#' |
| 181 | +#' Li, L., & Greene, T. (2013). A weighting analogue to pair matching in |
| 182 | +#' propensity score analysis. *The International Journal of Biostatistics*, 9(2), |
| 183 | +#' 215-234. (ATM weights) |
87 | 184 | #' |
88 | | -#' # Simulate a binary treatment and corresponding propensity scores |
89 | | -#' propensity_scores <- c(0.2, 0.7, 0.5, 0.8) |
90 | | -#' treatment <- c(0, 1, 0, 1) |
| 185 | +#' Li, F., Morgan, K. L., & Zaslavsky, A. M. (2018). Balancing covariates via |
| 186 | +#' propensity score weighting. *Journal of the American Statistical Association*, |
| 187 | +#' 113(521), 390-400. (ATO weights) |
91 | 188 | #' |
92 | | -#' # Compute ATE weights (unstabilized) |
93 | | -#' weights_ate <- wt_ate(propensity_scores, .exposure = treatment) |
94 | | -#' weights_ate |
| 189 | +#' Zhou, Y., Matsouaka, R. A., & Thomas, L. (2020). Propensity score weighting |
| 190 | +#' under limited overlap and model misspecification. *Statistical Methods in |
| 191 | +#' Medical Research*, 29(12), 3721-3756. (Entropy weights) |
95 | 192 | #' |
96 | | -#' # Compute ATE weights with stabilization using the mean of the exposure |
97 | | -#' weights_ate_stab <- wt_ate(propensity_scores, .exposure = treatment, stabilize = TRUE) |
98 | | -#' weights_ate_stab |
| 193 | +#' ## Continuous Exposures |
99 | 194 | #' |
100 | | -#' ## ATT Weights for a Binary Exposure |
| 195 | +#' Hirano, K., & Imbens, G. W. (2004). The propensity score with continuous |
| 196 | +#' treatments. In A. Gelman & X.-L. Meng (Eds.), *Applied Bayesian Modeling and Causal |
| 197 | +#' Inference from Incomplete-Data Perspectives* (pp. 73-84). Wiley. |
101 | 198 | #' |
102 | | -#' propensity_scores <- c(0.3, 0.6, 0.4, 0.7) |
103 | | -#' treatment <- c(1, 1, 0, 0) |
| 199 | +#' ## Practical Guidance |
104 | 200 | #' |
105 | | -#' # Compute ATT weights |
106 | | -#' weights_att <- wt_att(propensity_scores, .exposure = treatment) |
107 | | -#' weights_att |
| 201 | +#' Austin, P. C., & Stuart, E. A. (2015). Moving towards best practice when |
| 202 | +#' using inverse probability of treatment weighting (IPTW) using the propensity |
| 203 | +#' score to estimate causal treatment effects in observational studies. |
| 204 | +#' *Statistics in Medicine*, 34(28), 3661-3679. |
108 | 205 | #' |
109 | 206 | #' @seealso |
110 | 207 | #' - [psw()] for details on the structure of the returned weight objects. |
| 208 | +#' - [ps_trim()], [ps_trunc()], and [ps_refit()] for handling extreme weights. |
| 209 | +#' - [ps_calibrate()] for calibrating weights. |
111 | 210 | #' |
112 | 211 | #' @export |
113 | 212 | wt_ate <- function( |
|