|
13 | 13 | # limitations under the License. |
14 | 14 | """ |
15 | 15 | Interrupted Time Series Analysis |
| 16 | +
|
| 17 | +This module implements interrupted time series (ITS) analysis for causal inference, |
| 18 | +supporting both traditional scenarios where the intervention time is known and |
| 19 | +advanced scenarios where the intervention time must be inferred from the data. |
| 20 | +
|
| 21 | +Overview |
| 22 | +-------- |
| 23 | +Interrupted time series analysis is a quasi-experimental design used to evaluate |
| 24 | +the impact of an intervention by comparing time series data before and after the |
| 25 | +intervention occurs. This module provides a flexible framework that can handle: |
| 26 | +
|
| 27 | +1. **Known intervention times**: Traditional ITS where you specify exactly when |
| 28 | + the treatment occurred (e.g., policy implementation date) |
| 29 | +2. **Unknown intervention times**: Advanced ITS where the model infers when an |
| 30 | + intervention likely occurred based on observed changes in the data |
| 31 | +
|
| 32 | +Treatment Time Handler Architecture |
| 33 | +----------------------------------- |
| 34 | +The core design pattern in this module is the Strategy pattern implemented through |
| 35 | +the `TreatmentTimeHandler` hierarchy. This architecture was necessary because known |
| 36 | +and unknown treatment times require fundamentally different approaches: |
| 37 | +
|
| 38 | +**Why the Handler Architecture?** |
| 39 | +
|
| 40 | +- **Data Processing**: Known times require splitting data at a specific point; |
| 41 | + unknown times need the full dataset for inference |
| 42 | +- **Model Training**: Known times train only on pre-intervention data; unknown |
| 43 | + times train on all available data to detect the changepoint |
| 44 | +- **Uncertainty Handling**: Known times have deterministic splits; unknown times |
| 45 | + have probabilistic splits summarized by credible intervals (HDI) |
| 46 | +- **Visualization**: Different plotting strategies for certain vs. uncertain |
| 47 | + intervention times |
| 48 | +
|
| 49 | +**Handler Classes:** |
| 50 | +
|
| 51 | +1. **TreatmentTimeHandler (Abstract Base Class)** |
| 52 | +
|
| 53 | + - Defines the interface that all concrete handlers must implement |
| 54 | + - Ensures consistent API regardless of whether treatment time is known/unknown |
| 55 | + - Abstract methods: data_preprocessing, data_postprocessing, plot_intervention_line, |
| 56 | + plot_impact_cumulative |
| 57 | + - Optional method: plot_treated_counterfactual (only needed for unknown times) |
| 58 | +
|
| 59 | +2. **KnownTreatmentTimeHandler** |
| 60 | +
|
| 61 | + - Handles traditional ITS scenarios with predetermined intervention times |
| 62 | + - **Data Preprocessing**: Filters data to pre-intervention period only for training |
| 63 | + - **Data Postprocessing**: Creates clean pre/post splits at the known time point |
| 64 | + - **Plotting**: Draws single vertical line at the intervention time |
| 65 | + - **Use Case**: Policy evaluations, clinical trials, A/B tests with known start dates |
| 66 | +
|
| 67 | +3. **UnknownTreatmentTimeHandler** |
| 68 | +
|
| 69 | + - Handles advanced ITS scenarios where intervention time is inferred |
| 70 | + - **Data Preprocessing**: Uses full dataset and constrains model's search window |
| 71 | + - **Data Postprocessing**: Extracts inferred treatment time from posterior samples, |
| 72 | + creates probabilistic pre/post splits, handles uncertainty propagation |
| 73 | + - **Plotting**: Draws intervention line with uncertainty bands (HDI), shows |
| 74 | + "treated counterfactual" predictions |
| 75 | + - **Use Case**: Exploratory analysis, natural experiments, detecting unknown |
| 76 | + structural breaks |
| 77 | +
|
| 78 | +The handler pattern ensures that: |
| 79 | +
|
| 80 | +- The main `InterruptedTimeSeries` class maintains a clean, unified API |
| 81 | +- Different treatment time scenarios are handled with appropriate algorithms |
| 82 | +- New handler types can be easily added (e.g., multiple intervention times) |
| 83 | +- Code is maintainable and testable with clear separation of concerns |
| 84 | +
|
| 85 | +Usage Examples |
| 86 | +-------------- |
| 87 | +Known treatment time (traditional approach; assumes ``import causalpy as cp``, ``import pandas as pd``, and a date-indexed DataFrame ``df``): |
| 88 | +
|
| 89 | +>>> result = cp.InterruptedTimeSeries( |
| 90 | +... data=df, |
| 91 | +... treatment_time=pd.to_datetime("2017-01-01"), # Known intervention |
| 92 | +... formula="y ~ 1 + t + C(month)", |
| 93 | +... model=cp.pymc_models.LinearRegression(), |
| 94 | +... ) |
| 95 | +
|
| 96 | +Unknown treatment time (inference approach): |
| 97 | +
|
| 98 | +>>> model = cp.pymc_models.InterventionTimeEstimator(treatment_effect_type="level") |
| 99 | +>>> result = cp.InterruptedTimeSeries( |
| 100 | +... data=df, |
| 101 | +... treatment_time=None, # Let model infer the time |
| 102 | +... formula="y ~ 1 + t + C(month)", |
| 103 | +... model=model, |
| 104 | +... ) |
| 105 | +
|
| 106 | +The module selects the appropriate handler from the ``treatment_time`` argument |
| 107 | +and the model type, so both scenarios go through the same |
| 108 | +``InterruptedTimeSeries`` entry point. |
16 | 109 | """ |
17 | 110 |
|
18 | 111 | from abc import ABC, abstractmethod |
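The Strategy pattern described in the docstring can be illustrated with a minimal, standard-library-only sketch. The class names follow the docstring, but the method signature, the integer time index, and the `select_handler` helper are assumptions made for illustration; the real handlers also define postprocessing and plotting methods that are omitted here.

```python
from abc import ABC, abstractmethod
from typing import List, Optional, Sequence


class TreatmentTimeHandler(ABC):
    """Interface shared by both strategies (plotting methods omitted)."""

    @abstractmethod
    def data_preprocessing(
        self, times: Sequence[int], treatment_time: Optional[int]
    ) -> List[int]:
        """Return the time points the model should be trained on."""


class KnownTreatmentTimeHandler(TreatmentTimeHandler):
    def data_preprocessing(self, times, treatment_time):
        # Known time: train on the pre-intervention period only.
        return [t for t in times if t < treatment_time]


class UnknownTreatmentTimeHandler(TreatmentTimeHandler):
    def data_preprocessing(self, times, treatment_time):
        # Unknown time: keep the full series so the model can locate the change.
        return list(times)


def select_handler(treatment_time: Optional[int]) -> TreatmentTimeHandler:
    """Hypothetical selector: None means the time must be inferred."""
    if treatment_time is None:
        return UnknownTreatmentTimeHandler()
    return KnownTreatmentTimeHandler()


times = list(range(10))
known = select_handler(6).data_preprocessing(times, 6)
unknown = select_handler(None).data_preprocessing(times, None)
print(known)         # [0, 1, 2, 3, 4, 5]
print(len(unknown))  # 10
```

The caller never branches on whether the time is known; it asks for a handler once and then uses the shared interface, which is what would let a further handler type be added without touching the calling code.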
|
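For the unknown-time case, the docstring mentions extracting the treatment time from posterior samples and reporting an HDI. The sketch below shows one common way to compute such a summary from raw draws; it is not taken from this module. The `hdi` helper, the 94% mass, and the simulated draws are all assumptions, and the narrowest-window approach is only appropriate for a unimodal posterior.

```python
import math
import random
from typing import Sequence, Tuple


def hdi(samples: Sequence[float], mass: float = 0.94) -> Tuple[float, float]:
    """Narrowest interval containing `mass` of the samples (unimodal case)."""
    ordered = sorted(samples)
    n = len(ordered)
    window = max(1, math.ceil(mass * n))  # number of samples inside the interval
    # Slide a window of fixed sample count and keep the narrowest one.
    best_start = min(
        range(n - window + 1),
        key=lambda i: ordered[i + window - 1] - ordered[i],
    )
    return ordered[best_start], ordered[best_start + window - 1]


rng = random.Random(0)
# Stand-in for posterior draws of the treatment time, expressed as a time index.
draws = [rng.gauss(50.0, 2.0) for _ in range(2000)]

point_estimate = sum(draws) / len(draws)  # posterior mean as the point estimate
lower, upper = hdi(draws)
print(f"inferred time ~ {point_estimate:.1f}, 94% HDI [{lower:.1f}, {upper:.1f}]")
```

An interval like this is what the uncertainty band around the intervention line would represent, in contrast to the single vertical line drawn when the time is known.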