Error in do() operator for interventions outside original treatment distribution #1344

@ravulapreethi

When running interventions with the do() operator, an error occurs if the intervention value falls outside the observed distribution of the treatment variable.
For example, if the observed values of sample_rate lie approximately in the range [0, 1], setting the intervention to 1.2 triggers an error. Clipping the intervention values to ≤ 1 avoids the error, but it prevents simulating interventions beyond the observed bounds, which is needed for counterfactual policy analysis. A similar inconsistency with GCM intervention samples is reported in #1241.

Interventions

# Observed distribution of sample_rate: ~[0, 1]; increase by 20%
intervention = {"sample_rate": df["sample_rate"] * 1.20}  # works only if clipped to <= 1
# OR
intervention = {"failure_rate": 1.20}

df_do2 = df.causal.do(
    x=intervention,
    outcome="",
    variable_types=variable_types,
    graph=causal_graph,  # networkx.DiGraph
    stateful=False
)
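
For reference, a minimal sketch of the clipping workaround described above, written in plain Python so it runs without the dataset. The helper name `clip_to_support` and the toy values are hypothetical and not part of DoWhy. It also reports what fraction of rows the clipping altered, since those rows no longer receive the intended +20% policy.

```python
def clip_to_support(proposed, observed):
    """Clip each proposed intervention value to [min(observed), max(observed)].

    Returns the clipped values and the fraction of rows that were altered.
    """
    lo, hi = min(observed), max(observed)
    clipped = [min(max(v, lo), hi) for v in proposed]
    n_changed = sum(1 for v, c in zip(proposed, clipped) if v != c)
    return clipped, n_changed / len(proposed)


# Hypothetical observed treatment values in [0, 1] (stand-in for df["sample_rate"]).
observed = [0.10, 0.50, 0.90, 1.00]

# Heterogeneous policy: raise each unit's value by 20%.
proposed = [v * 1.20 for v in observed]

clipped, frac_changed = clip_to_support(proposed, observed)
print("clipped values:", clipped)
print("fraction of rows altered by clipping:", frac_changed)
```

With pandas, the same step would be `(df["sample_rate"] * 1.20).clip(upper=df["sample_rate"].max())`. A large altered fraction means the simulated policy differs substantially from the one intended.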

Is clipping the recommended approach? Our intervention is not a single fixed value but a heterogeneous, per-row one (policy simulation), so should we instead use an alternative modeling strategy, e.g., a continuous parametric model that supports extrapolation? @amit-sharma
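
To make the "continuous parametric model" alternative concrete, here is a rough sketch under strong assumptions: the outcome is linear in the treatment and in a single observed confounder `z`, and `z` is a valid backdoor adjustment set. None of this is DoWhy's API. The helper names and the noise-free toy data are hypothetical, and any estimate outside the observed range rests entirely on the assumed functional form.

```python
def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y ~ a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


def fit_linear_outcome(t, z, y):
    """Fit y ~ a + b_t*t + b_z*z by partialling z out of t and y
    (Frisch-Waugh-Lovell), then recovering a and b_z."""
    a_t, s_t = simple_ols(z, t)
    a_y, s_y = simple_ols(z, y)
    t_res = [ti - (a_t + s_t * zi) for ti, zi in zip(t, z)]
    y_res = [yi - (a_y + s_y * zi) for yi, zi in zip(y, z)]
    _, b_t = simple_ols(t_res, y_res)
    a, b_z = simple_ols(z, [yi - b_t * ti for yi, ti in zip(y, t)])
    return a, b_t, b_z


# Hypothetical toy data: treatment t in [0, 1], confounder z,
# outcome generated without noise as y = 1 + 2*t + 3*z.
t = [0.0, 0.2, 0.5, 0.7, 1.0, 0.4]
z = [1.0, 0.0, 2.0, 1.0, 3.0, 0.5]
y = [1 + 2 * ti + 3 * zi for ti, zi in zip(t, z)]

a, b_t, b_z = fit_linear_outcome(t, z, y)

# Heterogeneous intervention: each unit's treatment raised by 20%,
# which pushes some values above the observed maximum of 1.0.
t_new = [ti * 1.20 for ti in t]

# Mean outcome under the intervention, holding each unit's z fixed.
mean_do = sum(a + b_t * tn + b_z * zi for tn, zi in zip(t_new, z)) / len(t)
print("estimated treatment coefficient:", b_t)
print("mean outcome under +20% policy:", mean_do)
```

Because the fitted model is a continuous function of the treatment, it can be evaluated at 1.2 without error, but it gives no warning that the data contain no evidence about that region.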

Labels: bug
