
Commit b139865

Expand notes on AR error modeling in TransferFunctionLinearRegression
Added detailed documentation explaining the challenges of AR error modeling in PyMC, why standard approaches fail, and the rationale for using quasi-differencing in TransferFunctionARRegression. Also clarified the alternative latent AR component modeling approach, why it is not used, and when to use each model.
1 parent a1eea92 commit b139865

File tree

1 file changed: +74, -2 lines changed


causalpy/pymc_models.py

Lines changed: 74 additions & 2 deletions
@@ -959,8 +959,80 @@ class TransferFunctionLinearRegression(PyMCModel):
 
     Notes
     -----
-    The current implementation uses independent Normal errors. Future versions may
-    include AR(1) autocorrelation modeling for residuals.
+    The current implementation uses independent Normal errors.
+
+    **Autocorrelation in Errors: Implementation Challenges**
+
+    For time series with autocorrelated residuals, see ``TransferFunctionARRegression``,
+    which implements AR(1) errors via quasi-differencing. This note explains why AR
+    errors require special handling and the design decisions made.
+
+    *Why Standard PyMC AR Approaches Fail:*
+
+    The fundamental issue is that **observed data cannot depend on model parameters**
+    in PyMC's computational graph. For AR errors, we want:
+
+    .. math::
+        y[t] = \mu[t] + \epsilon[t]
+        \epsilon[t] = \rho \cdot \epsilon[t-1] + \nu[t]
+
+    But ``\epsilon[t] = y[t] - \mu[t]`` depends on ``\mu`` (which depends on ``\beta``,
+    ``\theta``, ``half_life``, etc.), so this fails:
+
+    >>> # This FAILS with TypeError
+    >>> residuals = y - mu  # depends on parameters!
+    >>> pm.AR("epsilon", rho, observed=residuals)  # ❌
+
+    *Implemented Solution (TransferFunctionARRegression):*
+
+    Uses **quasi-differencing** following Box & Tiao (1975):
+
+    .. math::
+        y[t] - \rho \cdot y[t-1] = \mu[t] - \rho \cdot \mu[t-1] + \nu[t]
+
+    This transforms to independent innovations ``\nu[t]``, allowing manual log-likelihood
+    via ``pm.Potential``. This is the theoretically correct Box & Tiao intervention
+    analysis model where AR(1) represents the noise structure itself.
+
+    *Alternative: AR as Latent Component (Not Implemented):*
+
+    An alternative that avoids the ``observed=`` issue would be:
+
+    .. math::
+        y[t] = \mu_{baseline}[t] + ar[t] + \epsilon_{obs}[t]
+        ar[t] = \rho \cdot ar[t-1] + \eta[t]
+        \epsilon_{obs}[t] \sim N(0, \sigma^2_{obs})
+
+    This could use PyMC's built-in ``pm.AR`` since ``ar[t]`` is latent (unobserved):
+
+    >>> # This WOULD work (but changes model interpretation)
+    >>> ar = pm.AR("ar", rho=[rho_param], sigma=sigma_ar, shape=n_obs)
+    >>> mu_total = mu_baseline + ar
+    >>> pm.Normal("y", mu=mu_total, sigma=sigma_obs, observed=y_data)  # ✅
+
+    **Why We Don't Use This Approach:**
+
+    1. **Different model class**: Represents AR + white noise, not pure AR errors.
+       Has two variance parameters (``\sigma_{ar}``, ``\sigma_{obs}``) vs. one (``\sigma``).
+
+    2. **Theoretical mismatch**: Box & Tiao's intervention analysis models autocorrelation
+       in the noise process itself, not as an additional latent component. The AR process
+       IS the residual structure, not a separate mean component.
+
+    3. **Identifiability concerns**: With both AR and white noise, parameters may be
+       poorly identified unless ``\sigma_{obs} \approx 0`` (which defeats the purpose).
+
+    4. **Interpretation**: The latent AR component would represent a time-varying offset
+       rather than residual autocorrelation, changing the causal interpretation.
+
+    **When to Use Each Model:**
+
+    - **TransferFunctionLinearRegression** (this class): When residuals show minimal
+      autocorrelation, or computational efficiency is critical.
+
+    - **TransferFunctionARRegression**: When residual diagnostics show significant
+      autocorrelation (e.g., ACF plots, Durbin-Watson test), and you want to follow
+      the classical Box & Tiao specification.
 
     **Prior Customization**:
 
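To make the quasi-differencing approach described in the added docstring concrete, here is a minimal, hypothetical PyMC sketch of AR(1) errors scored via ``pm.Potential``. It is not CausalPy's actual ``TransferFunctionARRegression`` implementation: the simple linear mean, the variable names (``beta``, ``rho``, ``sigma``), and the placeholder data are all illustrative.

    # Hypothetical sketch: AR(1) errors via quasi-differencing and pm.Potential.
    # The transfer-function mean of TransferFunctionARRegression is replaced by a
    # simple linear predictor; the data below are placeholders.
    import numpy as np
    import pymc as pm
    import pytensor.tensor as pt

    rng = np.random.default_rng(0)
    X = rng.normal(size=100)
    y = 2.0 * X + rng.normal(size=100)

    with pm.Model():
        beta = pm.Normal("beta", 0, 1)
        rho = pm.Uniform("rho", -1, 1)      # AR(1) coefficient of the error process
        sigma = pm.HalfNormal("sigma", 1)   # innovation standard deviation

        mu = beta * X                       # parameter-dependent mean
        y_t = pt.as_tensor_variable(y)

        # Quasi-difference both sides (Box & Tiao 1975):
        #   y[t] - rho*y[t-1] = mu[t] - rho*mu[t-1] + nu[t],  nu[t] ~ Normal(0, sigma)
        z = y_t[1:] - rho * y_t[:-1]
        m = mu[1:] - rho * mu[:-1]
        loglike = pm.logp(pm.Normal.dist(mu=m, sigma=sigma), z).sum()

        # First observation under the stationary AR(1) error distribution.
        loglike += pm.logp(
            pm.Normal.dist(mu=mu[0], sigma=sigma / pt.sqrt(1 - rho**2)), y_t[0]
        )

        # Hand-built log-likelihood added via Potential (no `observed=` needed).
        pm.Potential("ar1_loglike", loglike)

        idata = pm.sample()

Because the innovations are independent Normals after the transform, nothing in the likelihood is an ``observed=`` tensor that depends on parameters, which is exactly the constraint the docstring describes.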

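The latent-AR alternative that the note rejects can also be spelled out end to end. The sketch below is likewise illustrative (a toy linear baseline and made-up variable names, not CausalPy code), but it makes the structural difference visible: two noise scales (``sigma_ar``, ``sigma_obs``) instead of one.

    # Hypothetical sketch of the rejected alternative: a latent AR(1) component
    # plus independent observation noise (AR + white noise, not pure AR errors).
    import numpy as np
    import pymc as pm

    rng = np.random.default_rng(0)
    X = rng.normal(size=100)
    y = 2.0 * X + rng.normal(size=100)

    with pm.Model():
        beta = pm.Normal("beta", 0, 1)
        mu_baseline = beta * X

        rho = pm.Uniform("rho", -1, 1)
        sigma_ar = pm.HalfNormal("sigma_ar", 1)
        sigma_obs = pm.HalfNormal("sigma_obs", 1)

        # pm.AR is allowed here because the AR series is latent (not observed).
        ar = pm.AR(
            "ar",
            rho=[rho],
            sigma=sigma_ar,
            init_dist=pm.Normal.dist(0.0, 1.0),
            shape=len(y),
        )

        # Observation model: baseline mean + latent AR offset + white noise.
        pm.Normal("y_obs", mu=mu_baseline + ar, sigma=sigma_obs, observed=y)

        idata = pm.sample()

As point 3 of the note observes, ``sigma_ar`` and ``sigma_obs`` compete to explain the same residual variation, which is where the identifiability concern comes from.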
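Finally, the "when to use each model" guidance leans on residual diagnostics. A quick check along those lines might look like the following sketch, which assumes ``residuals`` is a 1-D array of posterior-mean residuals from a fitted model and uses ``statsmodels`` and ``matplotlib`` for the diagnostics.

    # Sketch: decide between the two models by checking residual autocorrelation.
    import numpy as np
    import matplotlib.pyplot as plt
    from statsmodels.graphics.tsaplots import plot_acf
    from statsmodels.stats.stattools import durbin_watson

    residuals = np.random.default_rng(0).normal(size=200)  # placeholder residuals

    # Durbin-Watson statistic: ~2 means little first-order autocorrelation;
    # values well below 2 indicate positive autocorrelation (favour the AR model).
    print("Durbin-Watson:", durbin_watson(residuals))

    # ACF plot: significant spikes at low lags also point toward AR(1) errors.
    plot_acf(residuals, lags=20)
    plt.show()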