Commit fc5e6ce

Clarify HAC standard errors in docs and code
Expanded documentation and code comments to better explain HAC (Newey-West) standard errors, their purpose, and the hac_maxlags parameter. Added a detailed explanation and citation in the notebook, and improved docstrings and print output in transfer_function_its.py. Added the Newey-West reference to references.bib.
1 parent bd9a1ff commit fc5e6ce

3 files changed, 60 additions and 6 deletions


causalpy/experiments/transfer_function_its.py

Lines changed: 9 additions & 2 deletions
@@ -293,7 +293,11 @@ def with_estimated_transforms(
 coef_constraint : str, default="nonnegative"
     Constraint on treatment coefficient ("nonnegative" or "unconstrained").
 hac_maxlags : int, optional
-    Maximum lags for HAC standard errors. If None, uses rule of thumb.
+    Maximum lags for HAC (Newey-West) standard errors, which correct for
+    autocorrelation and heteroskedasticity in residuals. Higher values account
+    for longer-range dependencies but make the variance estimate noisier. If
+    None, uses the Newey-West rule of thumb: floor(4*(n/100)^(2/9)). For
+    example, with n=104 observations, the default is hac_maxlags=4.
 **estimation_kwargs
     Additional keyword arguments for the estimation method:

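The rule of thumb quoted in the docstring can be checked by hand. The sketch below is only an illustration of the formula floor(4*(n/100)^(2/9)); the function name `newey_west_maxlags` is hypothetical and is not taken from the CausalPy code base.

```python
import math


def newey_west_maxlags(n_obs: int) -> int:
    """Rule-of-thumb lag truncation: floor(4 * (n / 100) ** (2 / 9)).

    Hypothetical helper for illustration; not part of CausalPy.
    """
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return math.floor(4 * (n_obs / 100) ** (2 / 9))


# The default grows slowly with the sample size.
for n in (50, 104, 500, 1000):
    print(f"n={n}: hac_maxlags={newey_west_maxlags(n)}")
```

Because the exponent is 2/9, even a tenfold increase in the number of observations adds only a couple of lags, which is why the default stays at 4 for the 104-observation example.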
@@ -936,7 +940,10 @@ def summary(self, round_to: Optional[int] = None) -> None:
 print(f"Outcome variable: {self.y_column}")
 print(f"Number of observations: {len(self.y)}")
 print(f"R-squared: {round_num(self.score, round_to)}")
-print(f"HAC max lags: {self.hac_maxlags}")
+print(
+    f"HAC max lags: {self.hac_maxlags} "
+    f"(robust SEs accounting for {self.hac_maxlags} periods of autocorrelation)"
+)
 print("-" * 80)
 print("Baseline coefficients:")
 for label, coef, se in zip(
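To make "accounting for k periods of autocorrelation" concrete, here is a minimal sketch of the Newey-West idea for the simplest possible case, the standard error of a sample mean (an intercept-only regression). It is not the CausalPy implementation: the function names are hypothetical, no small-sample correction is applied, and the input series is artificial.

```python
import math
from typing import Sequence


def bartlett_weight(lag: int, maxlags: int) -> float:
    """Newey-West (Bartlett kernel) weight: declines linearly with the lag."""
    return 1.0 - lag / (maxlags + 1)


def autocovariance(resid: Sequence[float], lag: int) -> float:
    """Sample autocovariance of mean-zero residuals at the given lag."""
    n = len(resid)
    return sum(resid[t] * resid[t - lag] for t in range(lag, n)) / n


def hac_se_of_mean(y: Sequence[float], maxlags: int) -> float:
    """HAC standard error of the sample mean (hypothetical helper)."""
    n = len(y)
    mean = sum(y) / n
    resid = [v - mean for v in y]
    # Start from the ordinary variance, then add weighted autocovariances.
    long_run_var = autocovariance(resid, 0)
    for lag in range(1, maxlags + 1):
        long_run_var += 2.0 * bartlett_weight(lag, maxlags) * autocovariance(resid, lag)
    return math.sqrt(long_run_var / n)


# Artificial series with strong positive dependence: 52 ones, then 52 minus-ones.
y = [1.0] * 52 + [-1.0] * 52
naive_se = hac_se_of_mean(y, maxlags=0)   # zero lags reduces to the naive formula
robust_se = hac_se_of_mean(y, maxlags=4)  # accounts for 4 periods of autocorrelation
print(f"naive SE: {naive_se:.4f}")
print(f"HAC SE:   {robust_se:.4f}")
```

With `maxlags=0` the estimator ignores serial dependence entirely; each additional lag brings in one more autocovariance, down-weighted so that the resulting variance estimate cannot go negative.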

docs/source/notebooks/tfits_single_channel.ipynb

Lines changed: 40 additions & 4 deletions
@@ -30,7 +30,13 @@
 "Compare to related methods:\n",
 "- **Classic {term}`Interrupted Time Series`**: Binary on/off intervention (no dose-response modeling)\n",
 "- **{term}`Synthetic Control`**: Multiple control units available for comparison\n",
-"- **{term}`Difference in Differences`**: Panel data with treatment/control groups\n",
+"- **{term}`Difference in Differences`**: Panel data with treatment/control groups"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
 "\n",
 "## Example Scenario: Water Restrictions Policy\n",
 "\n",
@@ -57,13 +63,36 @@
 "source": [
 ":::{admonition} Implementation notes\n",
 ":class: warning\n",
-"This notebook demonstrates the **MVP (non-Bayesian) implementation** using:\n",
+"This notebook demonstrates the **non-Bayesian implementation** using:\n",
 "- OLS regression with HAC standard errors (fast, robust inference)\n",
-"- User-specified transform parameters (future: parameter estimation)\n",
+"- Automated transform parameter estimation via grid search or continuous optimization\n",
 "- Point estimates only (future: bootstrap confidence intervals, Bayesian uncertainty quantification)\n",
 ":::"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"::::{admonition} Understanding HAC Standard Errors\n",
+":class: note\n",
+"\n",
+"Time series data often violate the OLS assumption of independent, constant-variance errors because of:\n",
+"- **Autocorrelation**: Past values influence current values (e.g., yesterday's weather affects today's, habits persist over weeks)\n",
+"- **Heteroskedasticity**: Variance changes over time (e.g., more volatility in certain seasons)\n",
+"\n",
+"When these violations occur and the regressors are still exogenous, OLS **coefficient estimates remain unbiased**, but the usual **standard errors are incorrect**. With positive autocorrelation they are typically too small, leading to overconfident inference (narrow confidence intervals, artificially low p-values).\n",
+"\n",
+"**HAC (Heteroskedasticity and Autocorrelation Consistent) standard errors**, commonly computed with the **Newey-West estimator** {cite:p}`newey1987simple`, provide robust inference by correcting the standard errors for these violations. This gives asymptotically valid confidence intervals and hypothesis tests even when residuals are autocorrelated or heteroskedastic.\n",
+"\n",
+"**Key Parameter:**\n",
+"- `hac_maxlags`: Controls how many periods of autocorrelation to account for. When it is not set, CausalPy uses the Newey-West rule of thumb: `floor(4*(n/100)^(2/9))`. For our 104-week dataset, this gives `hac_maxlags=4`, accounting for up to 4 weeks of residual correlation.\n",
+"\n",
+"**Tradeoff:** HAC standard errors are usually wider (more conservative) than naive OLS standard errors when residuals are positively autocorrelated, but they give a more honest account of uncertainty for time series data.\n",
+"\n",
+"::::\n"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 1,
@@ -801,7 +830,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 11,
+"execution_count": 12,
 "metadata": {},
 "outputs": [
 {
@@ -878,6 +907,13 @@
 " \"reduction in consumption.\"\n",
 ")"
 ]
+},
+{
+"cell_type": "code",
+"execution_count": null,
+"metadata": {},
+"outputs": [],
+"source": []
 }
 ],
 "metadata": {

docs/source/references.bib

Lines changed: 11 additions & 0 deletions
@@ -204,3 +204,14 @@ @article{box1975intervention
   year={1975},
   publisher={Taylor \& Francis}
 }
+
+@article{newey1987simple,
+  title={A simple, positive semi-definite, heteroskedasticity and autocorrelation consistent covariance matrix},
+  author={Newey, Whitney K and West, Kenneth D},
+  journal={Econometrica},
+  volume={55},
+  number={3},
+  pages={703--708},
+  year={1987},
+  publisher={JSTOR}
+}
