test: add 422+ unit tests across 19 modules #261
Open
jacob-cotten wants to merge 1 commit into developer0hye:main from
Adds comprehensive test coverage for pdfplumber-core and pdfplumber-parse:

- encoding.rs: 48 tests (glyph name resolution, StandardEncoding, FontEncoding, EncodingResolver)
- edges.rs: 33 tests (edge generation, orientation, degenerate paths)
- search.rs: 12 tests (regex, unicode, bbox union, anchored patterns)
- dedupe.rs: 14 tests (tolerance boundaries, font/size blocking, output ordering)
- bidi.rs: 14 tests (RTL/LTR detection, neutral chars, field preservation)
- shapes.rs: 18 tests (orientation, flip_y, rect construction, CTM transforms)
- annotation.rs: 11 tests (subtype roundtrip, equality, bbox preservation)
- struct_tree.rs: 9 tests (deep nesting, MCID extraction, child ordering)
- page_regions.rs: 10 tests (unicode masking, thresholds, custom margins)
- path.rs: 12 tests (builder reset, rectangle segments, clone independence)
- html.rs: 35 tests (headings, escaping, bold/italic detection, lists, median)
- color_space.rs: 23 tests (all color spaces, ICC delegation, indexed bounds)
- standard_fonts.rs: 14 tests (all 14 fonts, monospace invariant, bbox sanity)
- words.rs: 90+ tests (word split boundaries, vertical text, tolerance, CJK)
- text.rs: tests for CTM transforms and CJK detection
- layout.rs, table.rs, cmap.rs: additional coverage

Also fixes word-split tolerance check to use >= (matching Python pdfplumber semantics) and corrects 3 test expectations for vertical text ordering (non-upright chars sort x0 descending, all vertical chars get Ttb direction).

Signed-off-by: Jacob Cotten <jacob@stratesystems.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
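To illustrate why switching the word-split tolerance check from `>` to `>=` matters, here is a minimal sketch of the boundary case. The function names and the `gap` / `tolerance` parameters are hypothetical and are not taken from pdfplumber-rs; the actual check in words.rs may be structured differently.

```rust
/// Hypothetical helper (not the project's code): returns true when the
/// horizontal gap between two adjacent chars should start a new word,
/// using a strict comparison against the tolerance.
fn splits_strict(gap: f64, tolerance: f64) -> bool {
    gap > tolerance
}

/// Same hypothetical check with an inclusive comparison, so a gap that is
/// exactly equal to the tolerance also starts a new word.
fn splits_inclusive(gap: f64, tolerance: f64) -> bool {
    gap >= tolerance
}
```

The two variants agree everywhere except when the gap equals the tolerance exactly: with a tolerance of `3.0`, a gap of `3.0` keeps the chars in one word under the strict check and splits them under the inclusive one. That single point is what a tolerance-boundary test would exercise.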
Summary
Thank you for building pdfplumber-rs — it's an impressive pure-Rust port and we've been using it extensively. We wanted to give back by contributing comprehensive test coverage.
This PR adds 422+ unit tests across 19 modules in pdfplumber-core and pdfplumber-parse, covering edge cases, boundary conditions, and correctness invariants.
Bug fix included
Also fixes the word-split tolerance check from `>` to `>=` to match Python pdfplumber's semantics, and corrects 3 pre-existing test expectations for vertical text ordering.
Test plan
`cargo test -p pdfplumber-core -p pdfplumber-parse --lib`
Thank you again for your work on this project. 🙏
🤖 Generated with Claude Code