-
Notifications
You must be signed in to change notification settings - Fork 75
Refactor formula parsing #1118
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Refactor formula parsing #1118
Conversation
|
After first look at the code base: much better and much cleaner than before. No fundamental suggestions for improvement from my side. Thank you! |
|
Note to self: Look into allowing |
|
@leostimpfle fixed a bug in the bin() function + adjusted the tests. You can run them via 40 passed
2 failed
- tests/test_i.py:286 test_factor_x_factor[Y ~ i(f_str, i.g)-Y ~ i(f_str, g)]
- tests/test_i.py:304 test_factor_x_factor_with_fe[Y ~ i(f_str, i.g) | fe1-Y ~ i(f_str, g) | fe1]
E AssertionError: Name mismatch:
E py=['f_str::apple:g::X', 'f_str::apple:g::Y', 'f_str::apple:g::Z', 'f_str::banana:g::X', 'f_str::banana:g::Y', 'f_str::banana:g::Z', 'f_str::cherry:g::X', 'f_str::cherry:g::Y']
E r=['f_str::apple:g::Y', 'f_str::apple:g::Z', 'f_str::banana:g::X', 'f_str::banana:g::Y', 'f_str::banana:g::Z', 'f_str::cherry:g::X', 'f_str::cherry:g::Y', 'f_str::cherry:g::Z']
E assert ['f_str::appl...na:g::Z', ...] == ['f_str::appl...ry:g::X', ...]
E
E At index 0 diff: 'f_str::apple:g::X' != 'f_str::apple:g::Y' |
|
@s3alfisc I've gone over the PR once more today. The biggest change is that I have rewritten I've also let Claude improve the docstrings and renamed a few functions/attributes for purely aesthetic reasons. The PR is now ready from my side (although there are many other potential improvements around formula parsing and model matrix construction: #1125, #1126, #1127, #1130). |
|
Review in 4 steps:
|
parse
I committed a few changes, I hope all of these make sense to you @leostimpfle and are more or less self-explanatory by the commit message? |
Agreed that this is somewhat unintuitive. An alternative to changing the attribute names could be to include the encoded fixed effects directly in the formula. For example, instead of
This is a hangover from my early attempts to use
Not needed, and I have removed it
Yes, all good. Thanks @s3alfisc! |
This is a proof of concept for a refactor of PyFixest's formula parsing. The PR introduces a new module
parsethat refactors formula parsing from the ground up.The core logic is implemented in
pyfixest.estimation.formula.parse.parsewhich takes in a formula string and returns a collection of parsed formulas represented bypyfixest.estimation.formula.parse.Formula.All references to the old
FormulaParserare bypassed (mostly by renaming the oldFixestFormulausing imports of the formfrom pyfixest.estimation.formula.parse import Formula as FixestFormula)