Introducing the ability to skip empty text nodes in `LexborNode`! by pygarap · Pull Request #187 · rushter/selectolax

pygarap · 2025-11-20T05:01:30Z

This pull request enhances the `LexborNode` API in selectolax by adding the ability to skip empty text nodes in the `text`, `iter`, and `traverse` methods, and by adding a property that checks whether a node is an empty text node. These changes give finer control over HTML parsing, especially when handling whitespace or empty nodes. The update also expands the documentation and adds tests to verify the new behavior.

API Enhancements for Skipping Empty Text Nodes:

Added a `skip_empty` parameter to the `text`, `iter`, and `traverse` methods in `LexborNode`, allowing users to exclude empty text nodes (as determined by `lxb_dom_node_is_empty`) from results.
Updated the C extension interface to expose the `lxb_dom_node_is_empty` function for use in Python code.
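To make the effect of a `skip_empty`-style flag concrete, here is a minimal sketch that uses only Python's standard `html.parser`, not selectolax or lexbor. The names `TextNodeCollector`, `collect_text_nodes`, and `is_whitespace_only` are hypothetical, and the sketch assumes that "empty" means the text consists solely of ASCII whitespace; the actual check in this PR is delegated to `lxb_dom_node_is_empty` and may differ in detail.

```python
from html.parser import HTMLParser

# Assumption: "empty" means only ASCII whitespace (space, tab, LF, FF, CR).
HTML_WHITESPACE = " \t\n\f\r"


def is_whitespace_only(data: str) -> bool:
    """Return True if every character of data is ASCII whitespace."""
    return all(ch in HTML_WHITESPACE for ch in data)


class TextNodeCollector(HTMLParser):
    """Collects text runs, optionally dropping whitespace-only ones."""

    def __init__(self, skip_empty: bool = False) -> None:
        super().__init__()
        self.skip_empty = skip_empty
        self.chunks: list[str] = []

    def handle_data(self, data: str) -> None:
        # Mirror the idea of skip_empty: ignore runs that carry no visible text.
        if self.skip_empty and is_whitespace_only(data):
            return
        self.chunks.append(data)


def collect_text_nodes(html: str, skip_empty: bool = False) -> list[str]:
    """Return the text runs found in html, in document order."""
    collector = TextNodeCollector(skip_empty=skip_empty)
    collector.feed(html)
    collector.close()
    return collector.chunks


html = "<div>\n  <p>Hello</p>\n  <p> </p>\n</div>"
print(collect_text_nodes(html))                   # ['\n  ', 'Hello', '\n  ', ' ', '\n']
print(collect_text_nodes(html, skip_empty=True))  # ['Hello']
```

With the flag off, the indentation between tags shows up as separate text runs; with it on, only the run that carries visible text remains. The default of `False` in the sketch follows the commit in this PR that sets the `skip_empty` default to False.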

New Property for Node Emptiness:

Introduced the `is_empty_text_node` property on `LexborNode`, providing a convenient way to check if a node is a text node and considered empty by the underlying DOM implementation.
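The property combines two conditions: the node must be a text node, and its content must count as empty. The following pure-Python sketch illustrates that logic on a stand-in class. `SketchNode` and its `data` field are hypothetical, the `"-text"` tag name for text nodes is an assumption, and whitespace-only content is assumed to count as empty; this is not the Cython implementation from the PR.

```python
from dataclasses import dataclass

# Assumption: "empty" means only ASCII whitespace (space, tab, LF, FF, CR).
HTML_WHITESPACE = " \t\n\f\r"


@dataclass
class SketchNode:
    """Hypothetical stand-in for a DOM node; not selectolax's LexborNode."""

    tag: str        # "-text" marks a text node in this sketch (assumed convention)
    data: str = ""  # character data, only meaningful for text nodes

    @property
    def is_empty_text_node(self) -> bool:
        # Elements are never empty text nodes, even when they have no content.
        if self.tag != "-text":
            return False
        return all(ch in HTML_WHITESPACE for ch in self.data)


print(SketchNode("-text", "\n  ").is_empty_text_node)   # True
print(SketchNode("-text", " hi ").is_empty_text_node)   # False
print(SketchNode("p").is_empty_text_node)               # False
```

Note the last case: an element with no content is still not an empty text node, because the property checks the node type first.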

Documentation and Usability Improvements:

Expanded docstrings for the `text`, `iter`, and `traverse` methods to clearly describe the new parameters and behaviors.
Added `__iter__` and `__next__` methods for better iterator protocol support.

Testing:

Added new tests to verify the correct behavior of the `skip_empty` flag in the `text`, `iter`, and `traverse` methods, and to check the `is_empty_text_node` property.

rushter · 2025-11-20T09:12:28Z

Thanks.

pygarap added 13 commits November 20, 2025 05:24

e0c850d · Add support for skipping empty text nodes in iterators and text methods (rushter#187)
71cff3a · Fix parameter naming and correct property method in `node` implementation
3d79261 · Refine docstrings for `text` method and `is_empty_text_node` property in `node` implementation
15358f6 · Refine docstrings for `iter` and `traverse` methods in `node` implementation
f1f502e · Extend `text`, `iter`, and `traverse` methods with `skip_empty` parameter; update docstrings accordingly. Add `is_empty_text_node` property.
8d88d13 · Add `lxb_dom_node_is_empty` binding and tests for `skip_empty` behavior in `text`, `iter`, and `traverse` methods
0716adc · Update `skip_empty` default to False in `text`, `iter`, and `traverse` methods; adjust logic and docstrings accordingly
a6dcda5 · Refine `test_is_empty_text_node_property` by updating node selection logic and adding text content assertions
72cfef1 · Refactor `is_empty_text_node` logic to evaluate parent nodes for text inspection; remove unused `Iterator` import
dec64fd · Refactor `test_lexbor.py` assertions for clarity; reformat method signatures in type stubs for consistency; add minor formatting adjustments
61049fa · Simplify `skip_empty` logic in iterators by merging conditions; remove redundant blank lines.
4e03d8d · Refactor `skip_empty` logic by introducing `_is_empty_text_node` helper; update related methods and tests for consistency.
0faf032 · Refactor `test_lexbor.py` assertions for readability and consistency; add minor formatting adjustments in `node.pxi`

rushter merged commit 9d27a52 into rushter:master Nov 20, 2025
7 checks passed

rushter merged 13 commits into rushter:master from pygarap:skip_empty_tags
