Error in do() operator for interventions outside original treatment distribution

When running interventions with the do() operator, an error occurs if the intervention value falls outside the observed distribution of the original treatment variable.
For example, if the observed values of sample_rate lie approximately in the range [0, 1], setting an intervention to 1.2 triggers an error. Clipping the intervention values to ≤ 1 avoids the error, but it prevents simulating interventions beyond the observed bounds, which is needed for counterfactual policy analysis. A similar inconsistency with GCM intervention samples is reported in #1241.

<img width="1010" height="328" alt="Image" src="https://github.com/user-attachments/assets/519748c6-eced-4d37-a00c-8450955e1229" />

## Interventions

```
import dowhy.api  # registers the df.causal accessor on pandas DataFrames

# Observed distribution of sample_rate: ~ [0, 1]; increase it by 20%
intervention = {"sample_rate": df["sample_rate"] * 1.20}  # works only if clipped to <= 1
# OR
intervention = {"failure_rate": 1.20}

df_do2 = df.causal.do(
    x=intervention,
    outcome="",
    variable_types=variable_types,
    graph=causal_graph,  # networkx.DiGraph
    stateful=False
)
```
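
For reference, the clipping workaround mentioned above can be sketched roughly as follows. This is only an illustration on made-up numbers: the `observed` array is a hypothetical stand-in for `df["sample_rate"]`, and the bounds `lo` and `hi` are assumed to be the observed range [0, 1].

```python
import numpy as np

# Hypothetical stand-in for df["sample_rate"]; values lie in the assumed range [0, 1].
observed = np.array([0.10, 0.45, 0.80, 0.95])

# Assumed bounds of the observed treatment distribution.
lo, hi = 0.0, 1.0

# Heterogeneous intervention: raise every unit's value by 20%.
proposed = observed * 1.20

# Flag the units whose intervened value leaves the observed range.
out_of_support = (proposed < lo) | (proposed > hi)

# Workaround: clip back into the observed range before building the intervention.
clipped = np.clip(proposed, lo, hi)
```

With clipping, every unit flagged in `out_of_support` receives a smaller intervention than intended, which is exactly the limitation for policy simulation.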

Is clipping the recommended approach? Or, since the intervention here is heterogeneous (a policy simulation) rather than a single fixed value, should an alternative modeling strategy be used, e.g., a continuous parametric model that supports extrapolation? @amit-sharma
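
To make the question concrete, here is a minimal sketch of the kind of parametric alternative meant above. It does not use DoWhy's do-sampler at all. It assumes a linear outcome model fitted with plain NumPy on synthetic data, and the variables `z` (confounder), `t` (treatment), and `y` (outcome) are hypothetical. Any extrapolation beyond the observed range rests entirely on the assumed linear form.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data (hypothetical): confounder z, treatment t in [0, 1], outcome y.
z = rng.normal(size=n)
t = np.clip(0.5 + 0.2 * z + rng.normal(scale=0.1, size=n), 0.0, 1.0)
y = 2.0 * t + 1.0 * z + rng.normal(scale=0.1, size=n)

# Fit the parametric outcome model y ~ 1 + t + z by ordinary least squares.
X = np.column_stack([np.ones(n), t, z])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Heterogeneous intervention: scale each unit's treatment by 20%, without clipping.
t_new = t * 1.20

# Predict outcomes under the intervention, holding the confounder fixed.
X_do = np.column_stack([np.ones(n), t_new, z])
y_do = X_do @ beta

# Average change in the predicted outcome caused by the policy.
effect = (y_do - X @ beta).mean()
```

Because the model is linear in `t`, the estimated policy effect is just the treatment coefficient times the average change in treatment. A nonlinear true relationship would make this extrapolation unreliable.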

