|
30 | 30 | "Compare to related methods:\n", |
31 | 31 | "- **Classic {term}`Interrupted Time Series`**: Binary on/off intervention (no dose-response modeling)\n", |
32 | 32 | "- **{term}`Synthetic Control`**: Multiple control units available for comparison\n", |
33 | | - "- **{term}`Difference in Differences`**: Panel data with treatment/control groups\n", |
| 33 | + "- **{term}`Difference in Differences`**: Panel data with treatment/control groups" |
| 34 | + ] |
| 35 | + }, |
| 36 | + { |
| 37 | + "cell_type": "markdown", |
| 38 | + "metadata": {}, |
| 39 | + "source": [ |
34 | 40 | "\n", |
35 | 41 | "## Example Scenario: Water Restrictions Policy\n", |
36 | 42 | "\n", |
|
57 | 63 | "source": [ |
58 | 64 | ":::{admonition} Implementation notes\n", |
59 | 65 | ":class: warning\n", |
60 | | - "This notebook demonstrates the **MVP (non-Bayesian) implementation** using:\n", |
| 66 | + "This notebook demonstrates the **non-Bayesian implementation** using:\n", |
61 | 67 | "- OLS regression with HAC standard errors (fast, robust inference)\n", |
62 | | - "- User-specified transform parameters (future: parameter estimation)\n", |
| 68 | + "- Automated transform parameter estimation via grid search or continuous optimization\n", |
63 | 69 | "- Point estimates only (future: bootstrap confidence intervals, Bayesian uncertainty quantification)\n", |
64 | 70 | ":::" |
65 | 71 | ] |
66 | 72 | }, |
| 73 | + { |
| 74 | + "cell_type": "markdown", |
| 75 | + "metadata": {}, |
| 76 | + "source": [ |
| 77 | + "::::{admonition} Understanding HAC Standard Errors\n", |
| 78 | + ":class: note\n", |
| 79 | + "\n", |
| 80 | + "Time series data typically violates OLS assumptions because:\n", |
| 81 | + "- **Autocorrelation**: Past values influence current values (e.g., yesterday's weather affects today's, habits persist over weeks)\n", |
| 82 | + "- **Heteroskedasticity**: Variance changes over time (e.g., more volatility in certain seasons)\n", |
| 83 | + "\n", |
| 84 | + "When these violations occur, OLS **coefficient estimates remain unbiased**, but **standard errors are incorrect** — typically too small, leading to overconfident inference (narrow confidence intervals, artificially low p-values).\n", |
| 85 | + "\n", |
| 86 | + "**HAC (Heteroskedasticity and Autocorrelation Consistent) standard errors** — also known as **Newey-West standard errors** {cite:p}`newey1987simple` — provide robust inference by correcting standard errors for these violations. This gives reliable confidence intervals and hypothesis tests even when residuals are correlated.\n", |
| 87 | + "\n", |
| 88 | + "**Key Parameter:**\n", |
| 89 | + "- `hac_maxlags`: Controls how many periods of autocorrelation to account for. CausalPy uses the Newey-West rule of thumb: `floor(4*(n/100)^(2/9))`. For our 104-week dataset, this gives `hac_maxlags=4`, accounting for up to 4 weeks of residual correlation.\n", |
| 90 | + "\n", |
| 91 | + "**Tradeoff:** HAC standard errors are typically larger than naive OLS standard errors, so confidence intervals are wider (more conservative), but they provide honest uncertainty quantification for time series data.\n",
| 92 | + "\n", |
| 93 | + "::::\n" |
| 94 | + ] |
| 95 | + }, |
67 | 96 | { |
68 | 97 | "cell_type": "code", |
69 | 98 | "execution_count": 1, |
|
801 | 830 | }, |
802 | 831 | { |
803 | 832 | "cell_type": "code", |
804 | | - "execution_count": 11, |
| 833 | + "execution_count": 12, |
805 | 834 | "metadata": {}, |
806 | 835 | "outputs": [ |
807 | 836 | { |
|
878 | 907 | " \"reduction in consumption.\"\n", |
879 | 908 | ")" |
880 | 909 | ] |
| 910 | + }, |
| 911 | + { |
| 912 | + "cell_type": "code", |
| 913 | + "execution_count": null, |
| 914 | + "metadata": {}, |
| 915 | + "outputs": [], |
| 916 | + "source": [] |
881 | 917 | } |
882 | 918 | ], |
883 | 919 | "metadata": { |
|
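Note on the `hac_maxlags` rule of thumb quoted in the new admonition: the arithmetic can be checked with a minimal sketch. The helper name `newey_west_maxlags` is hypothetical and is not taken from CausalPy; it only evaluates the formula `floor(4*(n/100)^(2/9))` as written in the notebook text, and how CausalPy computes the value internally is not shown in this diff.

```python
import math


def newey_west_maxlags(n_obs: int) -> int:
    """Rule-of-thumb lag truncation: floor(4 * (n / 100) ** (2 / 9)).

    Hypothetical helper for illustration; not part of CausalPy's API.
    """
    if n_obs <= 0:
        raise ValueError("n_obs must be positive")
    return math.floor(4 * (n_obs / 100) ** (2 / 9))


# 104 weekly observations, matching the dataset size stated in the notebook
print(newey_west_maxlags(104))  # prints 4
```

The result agrees with the `hac_maxlags=4` stated for the 104-week dataset. Because the exponent is small, the rule grows slowly: 500 observations would give 5 lags under the same formula.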