Format generated Python parser module #347

kieran-ryan · 2025-01-04T22:00:09Z

🤔 What's changed?

Updated parser generation code with Python formatting as per style guidelines
Invoked no-operation Token.detach method in generated parser code
Migrated Python formatter from black to ruff
Applied refactorings to generated parser code
- Reorganised imports - I001 (`ruff check gherkin/parser.py --select=I001`)
- Dropped redundant `or False` logic from conditionals
- Replaced series of `or` statements with an `any` call
- Dropped `else` statements and nesting preceded by conditionals that return from the enclosing function - RET505 (`ruff check gherkin/parser.py --select=RET505`)
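
The last three refactorings can be illustrated with a minimal sketch. The names below (`classify_before`, `classify_after` and the `is_*` predicates) are hypothetical stand-ins for illustration and are not the generated parser's actual methods.

```python
# Hypothetical stand-ins for the generated match_* methods.
def is_empty(line: str) -> bool:
    return line.strip() == ""


def is_comment(line: str) -> bool:
    return line.lstrip().startswith("#")


def is_tag_line(line: str) -> bool:
    return line.lstrip().startswith("@")


def classify_before(line: str) -> str:
    # Chained `or` with a trailing `or False`, plus an `else` after `return`.
    if is_empty(line) or is_comment(line) or is_tag_line(line) or False:
        return "skippable"
    else:
        return "other"


def classify_after(line: str) -> str:
    # One `any` call over a comma-separated tuple; the `else` is dropped (RET505).
    if any(
        (
            is_empty(line),
            is_comment(line),
            is_tag_line(line),
        )
    ):
        return "skippable"
    return "other"
```

One design point worth noting: a tuple passed to `any` evaluates every predicate up front, whereas a chain of `or` short-circuits. The two forms give the same result when the predicates have no side effects, as assumed here.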

⚡️ What's your motivation?

Ensures entire Python implementation is formatted consistently - building on Apply black code formatter to the python codebase #286
- Improves code review, as pull requests are less susceptible to stylistic changes
- Simplifies configuration by removing the exclusion of non-compliant modules
- Improves readability
  - Bulk of change reduces the mix of 4 and 8 space indentation within `match_token_at_'x'` methods to just 4
    - The mix was difficult to read and was most likely a workaround in the generator code to emit code for two cases, inside conditionals (with 4 spaces) and outside conditionals (8 spaces), without having to handle each case within the generator; standardising to 4 spaces without such handling (which this pull request adds) would raise a SyntaxError
Migrating formatting to ruff has a number of advantages - including performance, and enabling standardising to a single toolset for linting and formatting
- ruff is monorepo-friendly, with hierarchical and cascading configuration - which is ideal for this repository: allowing configuration to be stored within the python directory while allowing pre-commit, IDEs and commands run from the root of the repository
- Aligns formatting development experience with pytest-bdd (Migrate black, flake8, isort, pyupgrade linters and formatters to ruff pytest-dev/pytest-bdd#758)
General refactorings to improve code quality and simplify code generation
- `or False` is redundant - assumed to have been added as a safeguard for the case where the preceding arguments are not present post generation (the expression defaults to just `False` in that case) - however, the unit tests protect against this, failing if the arguments aren't generated
- Easier to generate `any` calls against a comma-separated iterable than a series of `or` statements, where `or` can't be applied on every line
- A superfluous `else` is redundant (superfluous-else-return - RET505)
- Generated parser code currently contains a useless-expression (B018), `token.detach` - the updated code invokes the method; this doesn't alter behaviour (it is a no-operation method) but conforms to the intention and to the other parser implementations
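
To make the B018 point concrete, here is a small sketch. The `Token` class is a hypothetical stand-in; it counts calls only so the difference is observable, whereas the real `detach` is described above as a no-operation.

```python
class Token:
    """Hypothetical stand-in for the parser's token type."""

    def __init__(self) -> None:
        self.detach_calls = 0

    def detach(self) -> None:
        # Counting is for illustration only; the real method does nothing.
        self.detach_calls += 1


token = Token()

token.detach  # useless expression (B018): looks up the method, never calls it
calls_after_lookup = token.detach_calls

token.detach()  # actually invokes the method
calls_after_invocation = token.detach_calls
```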

🏷️ What kind of change is this?

🏦 Refactoring/debt/DX (improvement to code design, tooling, etc. without changing behaviour)

♻️ Anything particular you want feedback on?

Thoughts on whether a CHANGELOG entry is warranted?
To evaluate that the changes are formatted correctly:

Open a codespace (or, alternatively, work locally):

Regenerate all parser implementations:
```
docker build --tag berp-env .
docker run --rm --interactive --tty --volume ".:/app" berp-env
make clean-generate
make generate
```
From a separate terminal window, install the pre-commit hook and validate all Python code is formatted (including the generated parser code):
```
pip install pre-commit
pre-commit install
pre-commit run --all-files ruff-format
```
Run Python unit and acceptance tests to validate no regression in the generated parser implementation:
```
pip install pytest
pytest
cd python/
make acceptance
```
Have skipped formatting on the `state_comment` string
- Challenging to generate a formatted Python multiline string from C# - consider the exclusion pragmatic
Intend to update the GitHub Actions workflow in a follow-up pull request to validate formatting, or fully assess use of pre-commit-ci with this polyglot repo

📋 Checklist:

I agree to respect and uphold the Cucumber Community Code of Conduct
My change requires a change to the documentation.
- I have updated the documentation accordingly.
Users should know about my change
- I have added an entry to the "Unreleased" section of the CHANGELOG, linking to this pull request.

youtux · 2025-01-05T12:05:01Z

I’m 👎 for this change:

`self.ast_builder = ast_builder if ast_builder is not None else AstBuilder()` -> `self.ast_builder = ast_builder or AstBuilder()`

the reason is that you should be able to subclass AstParser and override __bool__ to return False, and the above code should still use your object.
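
A minimal sketch of the concern, using hypothetical stand-in classes (`AstBuilder` here is a bare placeholder, and `FalsyBuilder` and the two `pick_*` helpers are invented for illustration):

```python
class AstBuilder:
    """Hypothetical stand-in for the default builder."""


class FalsyBuilder(AstBuilder):
    """A custom builder whose instances evaluate as False."""

    def __bool__(self) -> bool:
        return False


def pick_with_or(ast_builder=None):
    # Truthiness check: a falsy custom builder is silently replaced.
    return ast_builder or AstBuilder()


def pick_with_none_check(ast_builder=None):
    # Identity check: only a missing argument triggers the default.
    return ast_builder if ast_builder is not None else AstBuilder()


custom = FalsyBuilder()
```

With these definitions, `pick_with_or(custom)` returns a fresh `AstBuilder` and discards the caller's object, while `pick_with_none_check(custom)` returns `custom` itself.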

python/gherkin/parser.py

- Removes configuring exclusion of any Python modules from formatting
- Ensure entire library is formatted correctly
- Drop redundant `or False` logic from conditionals
- Drop redundant `else` conditionals after `return` within `if` clause
- Skipped formatting on `state_comment` string
  - Challenging to generate formatted Python multiline string from C#

kieran-ryan · 2025-01-05T22:58:52Z

> I’m 👎 for this change:
>
> `self.ast_builder = ast_builder if ast_builder is not None else AstBuilder()` -> `self.ast_builder = ast_builder or AstBuilder()`
>
> the reason is that you should be able to subclass AstParser and override __bool__ to return False, and the above code should still use your object.

Makes sense - applied! Had been unsure about this for a somewhat related reason: assessing "truthiness" is more implicit than checking against the explicitly declared types, which could become a consideration down the line should users begin passing additional values. I'm accustomed to using `or` in a local context where this isn't a concern. Interesting, thank you.

luke-hill · 2025-02-08T15:47:00Z

Given this is a big refactor, how far along is it? Is it close to being merge-worthy, @kieran-ryan? We're doing a lot of other big changes in other flavours, so it would be good to get this in if we can.

mpkorstanje · 2025-06-30T11:16:43Z

I've fixed the remarks around not exiting from the loop early. The implementation is a bit awkward as python doesn't have a do-while construct, but this should work and seems to satisfy ruff.
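
For readers unfamiliar with the idiom, a common way to emulate do-while in Python is `while True` with the condition checked at the end of the body. The sketch below is a generic illustration with a hypothetical function name; it is not the code generated by this pull request.

```python
from collections import deque


def drain_until_marker(tokens: deque[str], marker: str) -> list[str]:
    """Read at least one token, then keep reading until `marker` is seen.

    Assumes `tokens` is non-empty on entry.
    """
    seen = []
    while True:
        token = tokens.popleft()  # the body always runs at least once
        seen.append(token)
        if token == marker or not tokens:  # the "while" condition, checked last
            break
    return seen
```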

kieran-ryan added the 🏦 debt Tech debt label Jan 4, 2025

kieran-ryan self-assigned this Jan 4, 2025

kieran-ryan mentioned this pull request Jan 4, 2025

ci: Pin .Net and code generation test workflows to ubuntu-22.04 #348

Merged

1 task

kieran-ryan force-pushed the debt/format-py-parser branch from c50567f to ab8bc7c Compare January 4, 2025 23:32

kieran-ryan marked this pull request as ready for review January 4, 2025 23:37

kieran-ryan requested review from youtux and jsa34 January 4, 2025 23:37

youtux reviewed Jan 5, 2025

python/gherkin/parser.py

kieran-ryan added 3 commits January 5, 2025 22:52

fix: Invoke no-operation detach Python method

060f2c0

debt: Migrate Python formatter from black to ruff

52864a5

- Faster tooling
- Better handling with linting rule conflicts
- Enables extending to hundreds of linting rules
- Aligns with pytest-bdd (pytest-dev/pytest-bdd#758)

kieran-ryan force-pushed the debt/format-py-parser branch from ab8bc7c to 52864a5 Compare January 5, 2025 22:53

mpkorstanje requested a review from youtux January 12, 2025 19:07

youtux reviewed Jan 12, 2025

python/gherkin/parser.py (outdated)

youtux reviewed Mar 12, 2025

python/gherkin-python.razor

Merge remote-tracking branch 'origin/main' into debt/format-py-parser

16a024f

# Conflicts: # python/gherkin/parser.py

mpkorstanje reviewed Jun 30, 2025

python/gherkin-python.razor (outdated)

mpkorstanje force-pushed the debt/format-py-parser branch 2 times, most recently from 8ae3c0d to c026fe6 Compare June 30, 2025 10:54

mpkorstanje added 2 commits June 30, 2025 13:01

Do Continue early

e488d3e

Do Continue early

370f944

mpkorstanje force-pushed the debt/format-py-parser branch from c026fe6 to 370f944 Compare June 30, 2025 11:11

mpkorstanje merged commit c1911b0 into main Jun 30, 2025
6 checks passed

mpkorstanje deleted the debt/format-py-parser branch June 30, 2025 11:15
