Add variable selection priors #568
base: main
Signed-off-by: Nathaniel <[email protected]>
:param vs_prior_type : str or None, default=None
    Type of variable selection prior: 'spike_and_slab', 'horseshoe', or None.
    If None, uses standard normal priors.
:param vs_hyperparams : dict, optional
This is sphinx format and not numpy
We should add that into AGENTS.md if it's not already there.
You got me. The docstrings were AI generated. Will fix.
Provides continuous shrinkage with heavy tails, allowing strong signals
to escape shrinkage while weak signals are dampened:

    β_j = τ · λ̃_j · β_j^raw
We should be able to add maths in here for nice rendering in the API docs
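As a numerical companion to the equation quoted above, here is a minimal NumPy sketch of the non-centered horseshoe construction. This is illustrative only, not the PR's implementation: it draws plain half-Cauchy local scales λ rather than the regularized λ̃ the docstring's notation suggests.

```python
import numpy as np

rng = np.random.default_rng(0)
p = 5  # number of coefficients

# Half-Cauchy draws via the absolute value of a standard Cauchy
tau = np.abs(rng.standard_cauchy())        # global shrinkage scale
lam = np.abs(rng.standard_cauchy(size=p))  # local, per-coefficient scales
beta_raw = rng.standard_normal(p)          # non-centered "raw" coefficients

# beta_j = tau * lambda_j * beta_raw_j
beta = tau * lam * beta_raw
```

The heavy Cauchy tails of lam let individual coefficients escape the global shrinkage tau imposes on the rest.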
Codecov Report
❌ Patch coverage is

@@ Coverage Diff @@
##             main     #568   +/-   ##
==========================================
+ Coverage   93.21%   93.32%   +0.11%
==========================================
  Files          35       37       +2
  Lines        5511     5785     +274
  Branches      358      375      +17
==========================================
+ Hits         5137     5399     +262
- Misses        246      252       +6
- Partials     128      134       +6
Marking this one as ready for review. There is still some work to be done on the notebook illustrating the functionality, but I think there is enough here that it's worth flagging the architecture choices for discussion. I've made the variable selection priors available as a module. Currently it's just integrated with the IV class, but in principle it can be dropped into all regression-based modules with coefficients. The pattern simply requires an if-else block to be used in e.g. the propensity score model, linear regression model, etc. What do you guys think?
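The if-else dispatch pattern described above might look roughly like the following. All names here (make_coefficient_prior, the hyperparameter keys) are hypothetical stand-ins: in the real module each branch would construct PyMC random variables rather than the plain dicts used in this sketch.

```python
def make_coefficient_prior(name, vs_prior_type=None, vs_hyperparams=None):
    """Stand-in prior factory illustrating the proposed dispatch pattern.

    In the actual module each branch would build PyMC distributions;
    here we just return a description of the chosen prior.
    """
    hp = vs_hyperparams or {}
    if vs_prior_type is None:
        # Default: standard normal priors on the coefficients
        return {"name": name, "prior": "normal", "sigma": hp.get("sigma", 1.0)}
    elif vs_prior_type == "spike_and_slab":
        return {
            "name": name,
            "prior": "spike_and_slab",
            "pi": hp.get("pi", 0.5),
            "temperature": hp.get("temperature", 0.5),
        }
    elif vs_prior_type == "horseshoe":
        return {"name": name, "prior": "horseshoe", "tau_scale": hp.get("tau_scale", 1.0)}
    raise ValueError(f"Unknown vs_prior_type: {vs_prior_type!r}")
```

Any regression-style model (propensity score, linear regression, IV equations) could then call this factory wherever it currently hard-codes its coefficient priors.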
Great @NathanielF! I think the notebooks need a bit more storyline and explanation 🙏
Cool, thanks @juanitorduz. I can take another pass at it this weekend.
Will take a look at this. But it's worth updating from
@@ -0,0 +1,2219 @@
It could be worth adding a graphviz DAG here to represent the DGP for the simulated data. I'd also be tempted to add a corresponding short paragraph to explain the data setup a bit. Could add a link to the knowledge base page you recently added.
Added a note and clarification.
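A DAG along the lines suggested can be emitted as Graphviz DOT source with the standard library alone. The node names here assume a generic IV setup (instrument Z, treatment T, outcome Y, unobserved confounder U); the notebook's actual DGP may differ.

```python
# Edges of a canonical IV DAG: Z affects Y only through T,
# while U confounds both T and Y.
edges = [("Z", "T"), ("T", "Y"), ("U", "T"), ("U", "Y")]

dot = "digraph IV {\n"
dot += "".join(f"  {a} -> {b};\n" for a, b in edges)
dot += "}\n"

print(dot)  # paste into Graphviz, or render with the `graphviz` package
```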
Can we add bibtex references (and citations in text) for the spike-and-slab and the horseshoe?
For the horseshoe, are the lambdas also in {0, 1}?
Clarified this too. Strictly speaking the lambdas are not in {0, 1}, and since we're using a relaxed spike-and-slab, the gammas aren't either. Also clarified this. Added a reference to the Kaplan book.
temperature and pi could more clearly map onto the equations given in the spike-and-slab and horseshoe sections above
Linked this back a bit. I didn't give a full breakdown in formulas of how the hyperparameters work... that seemed like overkill?
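For readers wondering how temperature and pi enter the relaxed spike-and-slab: a relaxed (binary Concrete / Gumbel-softmax style) inclusion variable can be sketched as below. This is an illustrative NumPy version, not the notebook's code. As temperature falls, draws concentrate near 0 and 1 without ever reaching them, which is why the gammas are not strictly in {0, 1}.

```python
import numpy as np

def relaxed_bernoulli(pi, temperature, size, rng):
    """Binary-Concrete sample: sigmoid((logit(pi) + logistic noise) / temperature)."""
    u = rng.uniform(1e-9, 1 - 1e-9, size=size)
    logistic_noise = np.log(u) - np.log1p(-u)          # standard logistic draws
    logits = (np.log(pi) - np.log1p(-pi) + logistic_noise) / temperature
    return 1.0 / (1.0 + np.exp(-logits))

rng = np.random.default_rng(1)
gamma = relaxed_bernoulli(pi=0.2, temperature=0.5, size=1000, rng=rng)
# gamma lies strictly inside (0, 1) -- near-binary, but never exactly 0 or 1
```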
Done
Can you add a bit of explanation? I'm still not over my cold, so I'm probably missing something, but my naive prediction would be that the parameters would be heaped at zero. Could you just add something in there to address whatever misunderstanding I have?
The general prediction parameters, and especially the instruments, will be heaped at zero, but the treatment effect is real as per the simulation, and we want the models to identify the true effect here, which they generally do.
Yes.
Done
Added some comments on the new notebook. Haven't looked at the code yet.
PR Summary: Introduces variable selection priors, integrates them into PyMC models and the IV workflow, and adds tests and docs.
Written by Cursor Bugbot for commit dbc8614.
the assumption of a simple IV experiment.
The coefficients should be interpreted appropriately."""
We will use the multivariate normal likelihood
for continuous treatment."""
Bug: Validation warning ignores binary_treatment flag setting
The input_validation method checks if the treatment variable has more than 2 unique values and warns that "We will use the multivariate normal likelihood for continuous treatment." However, this warning doesn't account for the new binary_treatment parameter. If a user sets binary_treatment=True while having continuous treatment data, the warning incorrectly suggests MVN will be used, when actually the Bernoulli likelihood will be applied (which would fail on non-binary data). The validation needs to cross-check the actual data against the self.binary_treatment flag.
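A minimal sketch of the cross-check the report asks for. Function and argument names here mirror the report's description (input validation against a binary_treatment flag) but are hypothetical, not the package's actual API.

```python
import warnings
import numpy as np

def validate_treatment(t, binary_treatment):
    """Check the treatment data against the user's likelihood choice."""
    n_unique = len(np.unique(t))
    if binary_treatment and n_unique > 2:
        # The Bernoulli likelihood cannot handle non-binary data: fail loudly
        raise ValueError(
            "binary_treatment=True but the treatment takes more than two "
            "values; the Bernoulli likelihood requires binary data."
        )
    if not binary_treatment and n_unique > 2:
        # Only now is the MVN-for-continuous-treatment warning accurate
        warnings.warn(
            "Treatment has more than two unique values; using the "
            "multivariate normal likelihood for continuous treatment."
        )
```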
mu=mu_outcome,
sigma=sigma_U,
observed=y.flatten(),
)
Bug: Binary treatment likelihood has double noise specification
In the binary treatment model, the outcome equation has a double noise issue. The error term U (derived from eps[:, 0]) already contains variance from sigma_U through the Cholesky transformation. However, the outcome likelihood then adds another sigma_U noise term via pm.Normal("likelihood_outcome", mu=mu_outcome, sigma=sigma_U, ...). This effectively doubles the noise variance, making the model statistically incorrect. The latent error U should either be omitted from mu_outcome if using a Normal likelihood with sigma_U, or the likelihood should be changed to reflect that U is already the error component.
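The double counting is easy to check with a quick simulation. This is a schematic NumPy version of the structure the report describes (U already scaled by sigma_U and folded into the mean, then a Normal likelihood adding sigma_U again), not the model code itself.

```python
import numpy as np

rng = np.random.default_rng(2)
n, sigma_U = 200_000, 1.0

U = sigma_U * rng.standard_normal(n)       # error term already carries sigma_U
mu_outcome = 0.0 + U                       # U folded into the outcome mean
y = mu_outcome + sigma_U * rng.standard_normal(n)  # likelihood adds sigma_U again

print(y.var())  # roughly 2 * sigma_U**2: twice the intended variance
```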
# Cholesky decomposition for correlated errors
inverse_rho = pm.math.sqrt(pm.math.maximum(1 - rho_clipped**2, 1e-12))
chol = pt.stack([[sigma_U, 0.0], [sigma_U * rho_clipped, inverse_rho]])
Bug: Cholesky decomposition has incorrect off-diagonal element
The Cholesky matrix for correlated errors in the binary treatment model is mathematically incorrect. The code constructs chol = [[sigma_U, 0], [sigma_U * rho_clipped, inverse_rho]], but for the intended covariance structure where the outcome error has variance sigma_U² and the treatment error has unit variance with correlation rho, the correct Cholesky should have [rho_clipped, inverse_rho] for the second row (not [sigma_U * rho_clipped, inverse_rho]). This produces an incorrect covariance matrix where the off-diagonal is sigma_U² * rho instead of sigma_U * rho, and the treatment variance is sigma_U² * rho² + 1 - rho² instead of 1.
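The claim checks out numerically: with the corrected second row [rho, sqrt(1 - rho**2)], chol @ chol.T recovers the intended covariance [[sigma_U**2, sigma_U*rho], [sigma_U*rho, 1]]. Plain NumPy, with arbitrary example values:

```python
import numpy as np

sigma_U, rho = 1.5, 0.6
inverse_rho = np.sqrt(1 - rho**2)

# As written in the PR (buggy) vs. the corrected second row
buggy = np.array([[sigma_U, 0.0], [sigma_U * rho, inverse_rho]])
fixed = np.array([[sigma_U, 0.0], [rho, inverse_rho]])

print(buggy @ buggy.T)  # off-diagonal sigma_U**2 * rho; treatment variance != 1
print(fixed @ fixed.T)  # [[sigma_U**2, sigma_U*rho], [sigma_U*rho, 1]]
```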
Took another pass @juanitorduz, should be more friendly now.
Just a draft for the minute. Working through some ideas.
📚 Documentation preview 📚: https://causalpy--568.org.readthedocs.build/en/568/