test: add 422+ unit tests across 19 modules by jacob-cotten · Pull Request #261 · developer0hye/pdfplumber-rs

jacob-cotten · 2026-03-06T18:35:15Z

Summary

Thank you for building pdfplumber-rs — it's an impressive pure-Rust port and we've been using it extensively. We wanted to give back by contributing comprehensive test coverage.

This PR adds 422+ unit tests across 19 modules in pdfplumber-core and pdfplumber-parse, covering edge cases, boundary conditions, and correctness invariants:

encoding.rs — glyph name resolution, StandardEncoding boundaries, FontEncoding, EncodingResolver 3-tier logic
edges.rs — edge generation, orientation, degenerate paths
search.rs — regex patterns, unicode, bbox union, anchored patterns
dedupe.rs — tolerance boundaries, font/size blocking, output ordering
bidi.rs — RTL/LTR detection, neutral chars, field preservation
shapes.rs — orientation classification, flip_y, rect construction, CTM transforms
annotation.rs — subtype roundtrip, equality semantics, bbox preservation
struct_tree.rs — deep nesting, MCID extraction, child ordering
page_regions.rs — unicode masking, thresholds, custom margins
path.rs — builder lifecycle, rectangle segments, clone independence
html.rs — heading levels, HTML escaping, bold/italic detection, list rendering
color_space.rs — all color spaces, ICC delegation, indexed bounds
standard_fonts.rs — all 14 standard fonts, monospace invariant, bbox sanity
words.rs — word split boundaries, vertical text ordering, tolerance semantics, CJK
text.rs, layout.rs, table.rs, cmap.rs — additional coverage

Bug fix included

Also changes the word-split tolerance comparison from > to >= to match Python pdfplumber's semantics, and corrects 3 pre-existing test expectations for vertical text ordering.
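To illustrate why the operator matters, here is a minimal sketch of the boundary case. The PR text does not show the operands of the check, so this assumes it compares the gap between two consecutive characters against a tolerance; the function names `splits_strict` and `splits_inclusive` are hypothetical and are not part of pdfplumber-rs.

```rust
/// Hypothetical helper (not the pdfplumber-rs API): strict comparison,
/// standing in for the check as it was before the change.
fn splits_strict(gap: f64, tolerance: f64) -> bool {
    gap > tolerance
}

/// Hypothetical helper (not the pdfplumber-rs API): inclusive comparison,
/// standing in for the check after the change.
fn splits_inclusive(gap: f64, tolerance: f64) -> bool {
    gap >= tolerance
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn only_the_exact_boundary_differs() {
        let tolerance = 3.0;

        // A gap exactly equal to the tolerance is where the two operators disagree.
        assert!(!splits_strict(3.0, tolerance));
        assert!(splits_inclusive(3.0, tolerance));

        // Below and above the boundary, both forms give the same answer.
        assert_eq!(splits_strict(2.5, tolerance), splits_inclusive(2.5, tolerance));
        assert_eq!(splits_strict(3.5, tolerance), splits_inclusive(3.5, tolerance));
    }
}
```

Under these assumptions the two operators disagree only when the gap equals the tolerance exactly, so a test that pins that value is what distinguishes the two behaviours.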

Test plan

All 1,804 tests pass (cargo test -p pdfplumber-core -p pdfplumber-parse --lib)
Zero ignored tests
No functional changes beyond the tolerance boundary fix

Thank you again for your work on this project. 🙏

🤖 Generated with Claude Code

Adds comprehensive test coverage for pdfplumber-core and pdfplumber-parse:

- encoding.rs: 48 tests (glyph name resolution, StandardEncoding, FontEncoding, EncodingResolver)
- edges.rs: 33 tests (edge generation, orientation, degenerate paths)
- search.rs: 12 tests (regex, unicode, bbox union, anchored patterns)
- dedupe.rs: 14 tests (tolerance boundaries, font/size blocking, output ordering)
- bidi.rs: 14 tests (RTL/LTR detection, neutral chars, field preservation)
- shapes.rs: 18 tests (orientation, flip_y, rect construction, CTM transforms)
- annotation.rs: 11 tests (subtype roundtrip, equality, bbox preservation)
- struct_tree.rs: 9 tests (deep nesting, MCID extraction, child ordering)
- page_regions.rs: 10 tests (unicode masking, thresholds, custom margins)
- path.rs: 12 tests (builder reset, rectangle segments, clone independence)
- html.rs: 35 tests (headings, escaping, bold/italic detection, lists, median)
- color_space.rs: 23 tests (all color spaces, ICC delegation, indexed bounds)
- standard_fonts.rs: 14 tests (all 14 fonts, monospace invariant, bbox sanity)
- words.rs: 90+ tests (word split boundaries, vertical text, tolerance, CJK)
- text.rs: tests for CTM transforms and CJK detection
- layout.rs, table.rs, cmap.rs: additional coverage

Also fixes word-split tolerance check to use >= (matching Python pdfplumber semantics) and corrects 3 test expectations for vertical text ordering (non-upright chars sort x0 descending, all vertical chars get Ttb direction).

Signed-off-by: Jacob Cotten <jacob@stratesystems.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

jacob-cotten mentioned this pull request Mar 7, 2026

feat: unified contribution — MCP server, layout inference, accessibility, chunking, math, CLI, rasterizer, signatures, WASM+Python parity, 2895 tests #262

Open

jacob-cotten wants to merge 1 commit into developer0hye:main from jacob-cotten:contrib/tests-gift
