I have observational data for a binary outcome variable (dispersed, not dispersed). My treatment is a continuous variable (breeding success, a proportion). I created a DAG to identify confounding variables and need to control for diet (proportion) and predator abundance (average counts). I want to estimate the effect of breeding success (BS) on dispersal (DP), acknowledging that there are heterogeneous effects across islands (HI).
From reading the grf documentation and other issues on GitHub, I understand that I need to orthogonalize my data prior to running the causal forest. My preliminary code is below, and I have several questions:
- Is my setup for adjusting for confounding and passing the orthogonalized data to the causal forest correct? Should I pass the residuals as Y and W (as below), or should I supply Y.hat and W.hat instead?
- Does this code give me my desired results: effect of breeding success on dispersal for each island after adjusting for confounding (diet, predators)?
Thanks for developing such an amazing package!
library(ranger)
library(grf)
#-------------------------
# STEP 1 — Adjust for confounding
#-------------------------
# Outcome model: DP ~ confounders
m_y <- ranger(DP ~ Diet + Predation,
              data = ATPU_ExtProspOth_BS_Summary_Complete, probability = TRUE)
# Treatment model: BS ~ confounders
m_t <- ranger(BS ~ Diet + Predation,
              data = ATPU_ExtProspOth_BS_Summary_Complete)
# Predicted nuisance components
e_hat <- predict(m_y,
                 ATPU_ExtProspOth_BS_Summary_Complete)$predictions[, 2]
# P(DP=1 | confounders)
p_hat <- predict(m_t,
                 ATPU_ExtProspOth_BS_Summary_Complete)$predictions
# E[BS | confounders]
# Orthogonalized residuals
# Note: the subtraction below assumes DP is numeric 0/1; a factor DP must be converted first
res_y <- ATPU_ExtProspOth_BS_Summary_Complete$DP - e_hat
# res_y represents variation in DP that is not explained by diet or predation
# aka orthogonalized outcome
res_t <- ATPU_ExtProspOth_BS_Summary_Complete$BS - p_hat
# res_t represents variation in BS that is not explained by diet or predation
# aka orthogonalized treatment
#-------------------------
# STEP 2 — Causal Forest on residuals
#-------------------------
# Use moderators (HI) for heterogeneity
# drop = FALSE keeps X_mat a matrix even if HI has only two levels
X_mat <- model.matrix(~ HI, ATPU_ExtProspOth_BS_Summary_Complete)[, -1, drop = FALSE] # removes intercept
# Estimates how BS affects DP, free from confounding
cf <- causal_forest(
  X = X_mat,
  Y = res_y,
  W = res_t
)
#-------------------------
# STEP 3 — CATE
#-------------------------
# CATE predictions
ATPU_ExtProspOth_BS_Summary_Complete$CATE_HI <- predict(cf)$predictions
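
For comparison, the alternative raised in the first question (supplying Y.hat and W.hat rather than pre-residualized Y and W) might look roughly like the sketch below. It runs on simulated data, so every column name and value here (diet, predation, island, bs, dp) is a made-up stand-in rather than the real dataset, and using grf's regression_forest for the nuisance models with the confounders included in X is an assumption made for illustration. Whether this setup is preferable to the residual approach above is the open question.

```r
library(grf)

set.seed(1)
n <- 500

# Simulated stand-ins for the real columns (all values are made up)
diet      <- runif(n)                          # confounder: diet proportion
predation <- rpois(n, lambda = 5)              # confounder: average predator count
island    <- factor(sample(c("A", "B", "C"), n, replace = TRUE))  # moderator
# Continuous treatment in (0, 1) that depends on the confounders
bs <- plogis(-0.5 + diet - 0.1 * predation + rnorm(n, sd = 0.5))
# Binary outcome stored as numeric 0/1
dp <- rbinom(n, 1, plogis(-1 + 2 * bs + 0.5 * diet))

# Confounders and the moderator together form the covariate matrix
X <- model.matrix(~ diet + predation + island)[, -1, drop = FALSE]

# Nuisance estimates from regression forests; predict() without new data
# returns out-of-bag predictions
Y.hat <- predict(regression_forest(X, dp))$predictions
W.hat <- predict(regression_forest(X, bs))$predictions

# Raw Y and W go in; causal_forest centers them using Y.hat and W.hat
cf_alt <- causal_forest(X, Y = dp, W = bs, Y.hat = Y.hat, W.hat = W.hat)

# Out-of-bag CATE estimates, then their average within each island
tau_hat   <- predict(cf_alt)$predictions
by_island <- tapply(tau_hat, island, mean)
print(by_island)
```

With a continuous W, the estimates are per-unit effects of the treatment on the outcome scale, so the per-island averages here are a descriptive summary of the CATE predictions rather than a formal island-level estimate.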