Description
When simulating interventions with the do-sampler (df.causal.do), an error occurs if the intervention value falls outside the observed range of the treatment variable.
For example, if the observed values of sample_rate lie approximately in the range [0, 1], setting the intervention to 1.2 triggers an error. Clipping the intervention values to ≤ 1 avoids the error, but it prevents simulating interventions beyond the observed bounds, which is needed for counterfactual policy analysis. A similar inconsistency with GCM intervention samples is reported in #1241.
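To make the cost of clipping concrete, here is a minimal sketch in plain Python (it does not call DoWhy). The data values and the helper `clip_to_support` are hypothetical and only illustrate how a 20% increase gets truncated at the observed maximum.

```python
def clip_to_support(values, lower, upper):
    """Clip each proposed intervention value into [lower, upper].

    Returns the clipped values and the fraction of units that were altered.
    """
    clipped = [min(max(v, lower), upper) for v in values]
    n_changed = sum(1 for v, c in zip(values, clipped) if v != c)
    return clipped, n_changed / len(values)


# Hypothetical observed treatment values, roughly within [0, 1].
observed = [0.10, 0.45, 0.80, 0.95, 1.00]

# Heterogeneous policy: raise every unit's value by 20%.
proposed = [v * 1.20 for v in observed]

clipped, frac_changed = clip_to_support(proposed, min(observed), max(observed))
print(clipped)
print(f"fraction of units whose intervention was truncated: {frac_changed}")
```

Units already near the upper bound receive a smaller effective increase than the policy specifies, so the simulated policy is no longer a uniform 20% increase.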
Interventions
## Observed distribution of sample_rate: ~ [0, 1], increase by 20%
intervention = {"sample_rate": df["sample_rate"] * 1.20} # Works only if clipped <= 1
## OR
intervention = {"sample_rate": 1.20}
df_do2 = df.causal.do(
    x=intervention,
    outcome="",
    variable_types=variable_types,
    graph=causal_graph,  # networkx.DiGraph
    stateful=False,
)
Is clipping the recommended approach? Or, since the intervention here is heterogeneous (a policy simulation that scales each unit's value) rather than a single fixed value, should an alternative modeling strategy be used, e.g., a continuous parametric model that supports extrapolation? @amit-sharma
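For reference, this is roughly what I mean by a parametric alternative. It is a sketch in plain Python, not DoWhy's API, and it assumes a linear outcome model with a single observed confounder `z`; the data and all names are hypothetical. The treatment effect is estimated by partialling out the confounder (Frisch-Waugh-Lovell), and the fitted slope is then used to extrapolate to treatment values above the observed maximum.

```python
def slope_intercept(x, y):
    """Ordinary least squares fit of y = a + b * x; returns (b, a)."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return b, my - b * mx


def residuals(x, y):
    """Residuals of y after regressing it on x (with intercept)."""
    b, a = slope_intercept(x, y)
    return [yi - (a + b * xi) for xi, yi in zip(x, y)]


# Hypothetical data: confounder z, treatment t in [0, 1], outcome y.
z = [0, 1, 0, 1, 0, 1]
t = [0.1, 0.3, 0.5, 0.6, 0.8, 1.0]
y = [2 * ti + zi for ti, zi in zip(t, z)]

# Partial out z from both t and y, then regress residual on residual.
effect, _ = slope_intercept(residuals(z, t), residuals(z, y))

# Heterogeneous policy: +20% for every unit, with no clipping.
t_new = [ti * 1.20 for ti in t]

# Under the linearity assumption, shift each outcome by effect * change in t.
y_cf = [yi + effect * (tn - ti) for yi, ti, tn in zip(y, t, t_new)]
print(effect, y_cf)
```

The extrapolation beyond the observed range is only as credible as the linearity assumption, which the data cannot verify outside [0, 1].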