Commit fc0f7a8

Merge pull request #128 from pymc-labs/ancova2
add ANCOVA example to integration tests + README
2 parents e7b4bfb + 50d143e commit fc0f7a8

5 files changed

+5519
-3
lines changed

README.md

Lines changed: 19 additions & 2 deletions
@@ -90,9 +90,26 @@ This is appropriate when you have multiple units, one of which is treated. You b
9090

9191
> The data (treated and untreated units), pre-treatment model fit, and counterfactual (i.e. the synthetic control) are plotted (top). The causal impact is shown as a blue shaded region. The Bayesian analysis shows shaded Bayesian credible regions of the model fit and counterfactual. Also shown is the causal impact (middle) and cumulative causal impact (bottom).
9292
93+
### ANCOVA
94+
95+
This is appropriate for non-equivalent group designs when you have a single pre-intervention and a single post-intervention measurement, and both a treatment and a control group.
96+
97+
| Group | pre | post |
98+
|------|---|-------|
99+
| 0 | $x_1$ | $y_1$ |
100+
| 0 | $x_2$ | $y_2$ |
101+
| 1 | $x_3$ | $y_3$ |
102+
| 1 | $x_4$ | $y_4$ |
103+
104+
| Frequentist | Bayesian |
105+
|--|--|
106+
| coming soon | ![](img/anova_pymc.svg) |
107+
108+
> The data from the control and treatment group are plotted, along with posterior predictive 94% credible intervals. The lower panel shows the estimated treatment effect.
109+
93110
### Difference in Differences
94111

95-
This is appropriate when you have a single pre and post intervention measurement and have a treament and a control group.
112+
This is appropriate for non-equivalent group designs when you have pre and post intervention measurements and both a treatment and a control group. Unlike the ANCOVA approach, difference in differences is appropriate when there are multiple pre and/or post treatment measurements.
96113

97114
Data is expected to be in the following form. Shown are just two units - one in the treated group (`group=1`) and one in the untreated group (`group=0`), but there can of course be multiple units per group. This is panel data (also known as repeated measures) where each unit is measured at 2 time points.
98115

@@ -107,7 +124,7 @@ Data is expected to be in the following form. Shown are just two units - one in
107124
|--|--|
108125
| ![](img/difference_in_differences_skl.svg) | ![](img/difference_in_differences_pymc.svg) |
109126

110-
The data, model fit, and counterfactual are plotted. Frequentist model fits result in points estimates, but the Bayesian analysis results in posterior distributions, represented by the violin plots. The causal impact is the difference between the counterfactual prediction (treated group, post treatment) and the observed values for the treated group, post treatment.
127+
> The data, model fit, and counterfactual are plotted. Frequentist model fits result in point estimates, but the Bayesian analysis results in posterior distributions, represented by the violin plots. The causal impact is the difference between the counterfactual prediction (treated group, post treatment) and the observed values for the treated group, post treatment.
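
The causal impact described in the caption can be illustrated with a small arithmetic sketch. This is not CausalPy code: the function name `did_causal_impact`, the sign convention (observed minus counterfactual), and the numbers are all assumptions made for illustration, and the counterfactual is built under the usual parallel-trends assumption from group means only.

```python
from statistics import mean
from typing import Sequence, Tuple


def did_causal_impact(
    control_pre: Sequence[float],
    control_post: Sequence[float],
    treated_pre: Sequence[float],
    treated_post: Sequence[float],
) -> Tuple[float, float]:
    """Return (counterfactual, causal_impact) for a 2x2 difference in differences.

    Hypothetical helper, not part of CausalPy.
    """
    # Change over time in the untreated group.
    control_change = mean(control_post) - mean(control_pre)
    # Counterfactual: where the treated group would have ended up
    # had it followed the same change as the control group.
    counterfactual = mean(treated_pre) + control_change
    # Assumed sign convention: observed minus counterfactual.
    causal_impact = mean(treated_post) - counterfactual
    return counterfactual, causal_impact


# Made-up measurements for two units per group.
counterfactual, causal_impact = did_causal_impact(
    control_pre=[9.0, 11.0],
    control_post=[11.0, 13.0],
    treated_pre=[10.0, 12.0],
    treated_post=[15.0, 17.0],
)
print(counterfactual, causal_impact)
```

A point estimate like this corresponds to the frequentist panel; the Bayesian analysis would instead yield a posterior distribution over the same quantity.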
111128
112129
### Regression discontinuity designs
113130

causalpy/tests/test_integration_pymc_examples.py

Lines changed: 16 additions & 0 deletions
@@ -218,3 +218,19 @@ def test_sc_brexit():
218218
len(result.prediction_model.idata.posterior.coords["draw"])
219219
== sample_kwargs["draws"]
220220
)
221+
222+
223+
@pytest.mark.integration
224+
def test_ancova():
225+
df = cp.load_data("anova1")
226+
result = cp.pymc_experiments.PrePostNEGD(
227+
df,
228+
formula="post ~ 1 + C(group) + pre",
229+
group_variable_name="group",
230+
pretreatment_variable_name="pre",
231+
prediction_model=cp.pymc_models.LinearRegression(sample_kwargs=sample_kwargs),
232+
)
233+
assert isinstance(df, pd.DataFrame)
234+
assert isinstance(result, cp.pymc_experiments.PrePostNEGD)
235+
assert len(result.idata.posterior.coords["chain"]) == sample_kwargs["chains"]
236+
assert len(result.idata.posterior.coords["draw"]) == sample_kwargs["draws"]
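
The formula used in the test above, `post ~ 1 + C(group) + pre`, models the post-treatment measurement as an intercept plus a group offset plus a slope on the pre-treatment measurement. As a rough illustration of what the group coefficient means, here is a minimal least-squares sketch in plain Python. It is an assumption-laden stand-in, not the Bayesian `PrePostNEGD` analysis from the commit: the helper names `solve_linear_system` and `fit_ancova` and the noise-free data are hypothetical.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a small dense system a @ x = b by Gaussian elimination."""
    n = len(b)
    # Augmented matrix so row operations update both sides together.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        # Partial pivoting: move the largest entry in this column to the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ancova(
    group: Sequence[int], pre: Sequence[float], post: Sequence[float]
) -> List[float]:
    """Least-squares fit of post = b0 + b1 * group + b2 * pre; returns [b0, b1, b2]."""
    design = [[1.0, float(g), float(x)] for g, x in zip(group, pre)]
    k = 3
    # Normal equations: (X^T X) beta = X^T y.
    xtx = [[sum(row[i] * row[j] for row in design) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * y for row, y in zip(design, post)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical noise-free data generated from post = 1 + 2 * group + 0.5 * pre.
group = [0, 0, 0, 1, 1, 1]
pre = [1.0, 2.0, 3.0, 1.5, 2.5, 4.0]
post = [1.0 + 2.0 * g + 0.5 * x for g, x in zip(group, pre)]

intercept, treatment_effect, pre_slope = fit_ancova(group, pre, post)
print(f"estimated treatment effect: {treatment_effect:.3f}")
```

The coefficient on `group` is the estimated treatment effect after adjusting for the pre-treatment measurement. The Bayesian model in the test estimates the same kind of quantity but returns posterior draws rather than a single number.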

docs/index.rst

Lines changed: 8 additions & 1 deletion
@@ -69,10 +69,17 @@ This is appropriate when you have multiple units, one of which is treated. You b
6969

7070
.. image:: ../img/synthetic_control_pymc.svg
7171

72+
ANCOVA
73+
""""""
74+
75+
This is appropriate when you have a single pre-intervention and a single post-intervention measurement, and both a treatment and a control group.
76+
77+
.. image:: ../img/anova_pymc.svg
78+
7279
Difference in differences
7380
"""""""""""""""""""""""""
7481

75-
This is appropriate when you have a single pre and post intervention measurement and have a treament and a control group.
82+
This is appropriate when you have pre and post intervention measurement(s) and both a treatment and a control group.
7683

7784
.. image:: ../img/difference_in_differences_pymc.svg
7885

docs/notebooks/generate_plots.ipynb

Lines changed: 303 additions & 0 deletions
Large diffs are not rendered by default.
