Transfer-Function ITS: Graded Interventions with Saturation & Adstock Transforms #548
base: main
Conversation
causalpy/transforms.py
Outdated
```python
@dataclass
class Saturation:
```
We avoided this intentionally with pymc-marketing. Is this the only way to implement this?
Reference:
https://williambdean.github.io/blog/posts/2024/pymc-marketing-strategy-pattern/
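For readers unfamiliar with the pattern in that post, here is a minimal sketch of how transforms could be expressed as interchangeable strategies behind a shared interface (all class and function names here are hypothetical illustrations, not the PR's actual API):

```python
from typing import Protocol

import numpy as np


class TreatmentTransform(Protocol):
    """Strategy interface: any transform maps a treatment series to a new series."""

    def apply(self, x: np.ndarray) -> np.ndarray: ...


class GeometricAdstock:
    """Carry-over effect: each period retains a fraction `alpha` of the previous one."""

    def __init__(self, alpha: float) -> None:
        self.alpha = alpha

    def apply(self, x: np.ndarray) -> np.ndarray:
        out = np.empty_like(x, dtype=float)
        carry = 0.0
        for t, value in enumerate(x):
            carry = value + self.alpha * carry
            out[t] = carry
        return out


class HillSaturation:
    """Diminishing returns: x / (x + half_max), bounded in [0, 1)."""

    def __init__(self, half_max: float) -> None:
        self.half_max = half_max

    def apply(self, x: np.ndarray) -> np.ndarray:
        return x / (x + self.half_max)


def transform_treatment(x: np.ndarray, steps: list[TreatmentTransform]) -> np.ndarray:
    """Compose strategies in order, e.g. adstock first, then saturation."""
    for step in steps:
        x = step.apply(x)
    return x
```

The point of the pattern is that the experiment class only depends on the `apply` interface, so new transforms can be added without touching the estimation code.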
Thanks for flagging this. Very early days on this PR, will look into changing it
Hopefully resolved in 659b502
Refactored transform classes to use a strategy pattern with explicit Adstock, Saturation, and Lag implementations. Added transform_optimization.py for grid search and optimization of transform parameters. Updated TransferFunctionITS to support transform parameter estimation and metadata. Revised tests to use new transform classes and parameter estimation workflows.
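As a rough illustration of what grid search over transform parameters involves, the sketch below enumerates candidate (adstock, saturation) parameter pairs and keeps the pair whose OLS fit has the smallest residual sum of squares. This is a hypothetical stand-alone example, not the code in transform_optimization.py:

```python
from itertools import product

import numpy as np


def geometric_adstock(x: np.ndarray, alpha: float) -> np.ndarray:
    """Carry-over: out[t] = x[t] + alpha * out[t-1]."""
    out = np.empty_like(x, dtype=float)
    carry = 0.0
    for t, v in enumerate(x):
        carry = v + alpha * carry
        out[t] = carry
    return out


def hill_saturation(x: np.ndarray, half_max: float) -> np.ndarray:
    """Diminishing returns: x / (x + half_max)."""
    return x / (x + half_max)


def grid_search_transforms(x, y, alphas, half_maxes):
    """Return the (alpha, half_max) pair minimising the OLS residual sum of squares."""
    best, best_sse = None, np.inf
    for alpha, hm in product(alphas, half_maxes):
        z = hill_saturation(geometric_adstock(x, alpha), hm)
        X = np.column_stack([np.ones_like(z), z])  # intercept + transformed treatment
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        sse = float(np.sum((y - X @ beta) ** 2))
        if sse < best_sse:
            best, best_sse = (alpha, hm), sse
    return best, best_sse
```

Because the linear coefficients are profiled out by OLS at each grid point, only the nonlinear transform parameters need to be searched.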
Expanded and clarified docstrings in transfer_function_its.py to document the nested parameter estimation approach for saturation and adstock transforms. Updated the example and usage instructions to reflect the new estimation workflow. Revised the notebook to demonstrate transform parameter estimation via grid search, show parameter recovery, and clarify the distinction between grid search and continuous optimization. Removed the outdated and redundant test class for TransferFunctionITS in test_transfer_function_its.py.
Codecov Report ❌ Patch coverage is
Additional details and impacted files:

```
@@            Coverage Diff             @@
##             main     #548      +/-   ##
==========================================
+ Coverage   95.59%   95.73%    +0.13%
==========================================
  Files          29       34        +5
  Lines        2681     5108     +2427
==========================================
+ Hits         2563     4890     +2327
- Misses        118      218      +100
```
Expanded documentation and code comments to better explain HAC (Newey-West) standard errors, their purpose, and the hac_maxlags parameter. Added a detailed explanation and citation in the notebook, and improved docstrings and print output in transfer_function_its.py. Added the Newey-West reference to references.bib.
Expanded the TF-ITS notebook with a detailed explanation of autocorrelation in time series, its impact on causal inference, and the motivation for using HAC (Newey-West) standard errors. Updated the simulation to generate autocorrelated errors using an AR(1) process, and clarified the importance of robust inference in the context of time series interventions.
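Generating autocorrelated errors from an AR(1) process, as described above, can be sketched in a few lines (a minimal stand-alone example, not the notebook's exact simulation):

```python
import numpy as np


def simulate_ar1_errors(n: int, rho: float, sigma: float, seed: int = 0) -> np.ndarray:
    """AR(1) errors: e[t] = rho * e[t-1] + sigma * w[t], with w[t] ~ N(0, 1).

    Successive errors are correlated (lag-1 autocorrelation ~ rho), which is
    exactly the situation where plain OLS standard errors are too small.
    """
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(n)
    e = np.zeros(n)
    for t in range(1, n):
        e[t] = rho * e[t - 1] + sigma * w[t]
    return e
```

Adding these errors to a simulated outcome makes the case for HAC (or ARIMAX) inference concrete: the point estimates stay unbiased, but naive confidence intervals become overconfident.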
Extended TransferFunctionITS and transform optimization to support ARIMAX (ARIMA with exogenous variables) error models in addition to HAC standard errors. Updated model fitting, parameter estimation, and documentation to allow users to specify error_model ('hac' or 'arimax') and ARIMA order. Added comprehensive tests for ARIMAX functionality and updated the notebook to demonstrate ARIMAX usage and comparison with HAC.
Refactors GradedInterventionTimeSeries and TransferFunctionOLS to follow the standard CausalPy pattern: the experiment class now takes an unfitted model and handles transform parameter estimation, fitting, and result extraction. Removes the with_estimated_transforms factory method, updates all docstrings, and adapts tests and documentation to the new workflow. This enables more flexible and consistent usage for multi-treatment and advanced modeling scenarios.
Introduces new plotting methods to GradedInterventionTimeSeries, including plot_effect and plot_transforms, and renames diagnostics() to plot_diagnostics(). Updates tests to cover new plotting features. Enhances documentation and notebook explanations for model fitting and parameter estimation, and updates the interrogate badge.
Renamed 'tfits_single_channel.ipynb' to 'graded_intervention_time_series_single_channel_ols.ipynb' and updated the notebook title and references in both the notebook and the index.md file to reflect the new name and description.
Added detailed documentation explaining challenges of AR error modeling in PyMC, why standard approaches fail, and the rationale for using quasi-differencing in TransferFunctionARRegression. Also clarified alternative latent AR component modeling and why it is not used, providing guidance on when to use each model.
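The quasi-differencing idea can be summarized briefly: if the errors follow an AR(1) process with coefficient rho, subtracting rho times the lagged equation leaves i.i.d. innovations, so a standard regression applies to the transformed series. A hypothetical sketch (Cochrane-Orcutt style, not the code in TransferFunctionARRegression):

```python
import numpy as np


def quasi_difference(y: np.ndarray, X: np.ndarray, rho: float):
    """Quasi-difference a regression with AR(1) errors.

    If y[t] = X[t] @ beta + e[t] with e[t] = rho * e[t-1] + w[t] (w iid), then
        y[t] - rho * y[t-1] = (X[t] - rho * X[t-1]) @ beta + w[t],
    so OLS on the transformed series has uncorrelated errors and the same beta.
    """
    y_star = y[1:] - rho * y[:-1]
    X_star = X[1:] - rho * X[:-1]
    return y_star, X_star
```

The transformation drops one observation and requires rho (estimated or sampled), which is the trade-off relative to modeling a latent AR component directly.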
Introduces a separate 'ar_sample_kwargs' dictionary for Bayesian AR(1) model sampling in the transfer function test. Updates assertions to reference the new parameters, clarifying the need for increased sampling due to model complexity.
Added a citation for pymc-marketing to references.bib and updated the graded intervention time series notebook to explain the use of transformation functions from pymc-marketing for modeling temporal and intensity dynamics.
Clarified the rationale for HAC standard errors, improved the explanation of autocorrelation and heteroskedasticity, and streamlined the discussion of advantages and tradeoffs. Updated ARIMAX section reference for consistency.
I will take a look at this in the next few days 🙏
[ReviewNB] juanitorduz commented on 2025-11-12T17:48:06Z: After this cell, explain that you are going to generate data (and ideally hide this cell using
[ReviewNB] juanitorduz commented on 2025-11-12T17:48:07Z: Shall we add HDI uncertainty in predictions?
[ReviewNB] juanitorduz commented on 2025-11-12T17:48:08Z: Nice plot! (I suggest hiding the code)
[ReviewNB] juanitorduz commented on 2025-11-12T17:48:08Z: Same comment on uncertainty
[ReviewNB] juanitorduz commented on 2025-11-12T17:48:09Z: Can we put the legend outside, and bring the intervals (e.g. [-84, -55]) closer to the figure?
[ReviewNB] juanitorduz commented on 2025-11-12T17:48:10Z: It seems we have some divergences
juanitorduz
left a comment
This is an amazing PR! I had no idea about this literature. I left some initial comments around code style. In the next round I can go deeper into the logic 💪
Really cool stuff!
```python
    base_formula: str,
    saturation,
    adstock,
    lag=None,
```
missing type hint
```python
    adstock,
    lag=None,
    hac_maxlags: Optional[int] = None,
    error_model: str = "hac",
```
you can use the type Literal["hac", "arima"]
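For illustration, the suggested annotation would look like the sketch below (using the PR's `"hac"`/`"arimax"` values; the function name is hypothetical, and the runtime check is an optional extra, since `Literal` is only enforced by static type checkers such as mypy):

```python
from typing import Literal


def fit(error_model: Literal["hac", "arimax"] = "hac") -> str:
    """Static checkers flag call sites passing a value outside the allowed set."""
    # Literal is not enforced at runtime, so a guard is still useful:
    if error_model not in ("hac", "arimax"):
        raise ValueError(f"unknown error_model: {error_model!r}")
    return error_model
```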
```python
    saturation,
    adstock,
    lag=None,
    hac_maxlags: Optional[int] = None,
```
Better `int | None = None` (we don't want to use `Optional`; it only exists for Python 3.9 compatibility)
```python
# Fit OLS with HAC standard errors
if hac_maxlags is None:
    n = len(y)
    hac_maxlags = int(np.floor(4 * (n / 100) ** (2 / 9)))
```
where does this come from? any reference?
```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
```
I suggest we use https://github.com/py-econometrics/pyfixest :) It is way faster, and has all these HAC cov metrics.
```python
        return priors

    def build_model(self, X, y, coords, treatment_data):
```
add type hints
```python
    saturation_type: Optional[str] = None,
    adstock_config: Optional[Dict] = None,
    saturation_config: Optional[Dict] = None,
    coef_constraint: str = "unconstrained",
```
Remove Optionals
```python
        self.treatment_names = None
        self.n_treatments = None

    def priors_from_data(self, X, y) -> Dict[str, Any]:
```
Missing type hints (also, use dict)
```python
        return priors

    def build_model(self, X, y, coords, treatment_data):
```
Type hints
```python
from typing import Optional, Tuple

import numpy as np
import statsmodels.api as sm
```
Again, suggest using https://github.com/py-econometrics/pyfixest
This PR introduces Transfer-Function Interrupted Time Series (TF-ITS), a powerful new experiment class that extends CausalPy's causal inference capabilities to handle graded (non-binary) interventions in single-market time series data.
One reason why it's exciting is that it starts to create a bridge between CausalPy and pymc-marketing.
🎯 What This Adds
Unlike traditional ITS methods that focus on binary on/off treatments, TF-ITS enables practitioners to model treatments whose intensity varies over time.
This makes TF-ITS particularly valuable for marketing mix modeling, policy evaluation, and any scenario where treatment intensity varies over time.
🔧 Implementation Details
Architecture:
Dependencies: Adds pymc-marketing>=0.7.0
Breaking Changes: None
Future Work: multiple intervention channels (this may be close to working, but hasn't been worked through in an example), and another notebook focusing directly on a marketing-based case study.
📚 Documentation preview 📚: https://causalpy--548.org.readthedocs.build/en/548/