feat(lane-10): PDF incremental writes — highlights, text annotations, link annotations #248
Open
jacob-cotten wants to merge 3 commits into developer0hye:main from
Full brief for 5 parallel lanes:
- Lane 1: Issue developer0hye#223 rotated table extraction (diagnosed, ready to fix)
- Lane 2: Issue developer0hye#220 tagged TrueType font gap
- Lane 3: Issue developer0hye#221 RTL word collapse + table grid
- Lane 4: Integration test expansion (300+ tests)
- Lane 5: Unit tests for core modules (400+ tests)

Includes: worktree map, PR procedure, known traps, session summary, cross-validation harness docs, and per-issue root cause analysis.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: jacob_cotten <jacobcotten@gmail.com>

Feature-gated behind `write` (adds lopdf as optional dep).

pdfplumber/src/write.rs:
- PdfWriter<'a>: builder pattern, collects mutations, writes one incremental update in PDF spec §7.5.6 format (appends to original bytes, never modifies them; forensically clean, preserves sigs)
- HighlightAnnotation: quad-point highlight with optional popup comment
- TextAnnotation: sticky note at arbitrary bbox
- LinkAnnotation: rectangular clickable region with URI
- MetadataUpdate: XMP /Author, /Title, /Subject, /Keywords
- write_incremental() → Vec<u8>: appends xref + trailer to original bytes
- write_full_rewrite() → Vec<u8>: full lopdf serialization (for complex changes)
- build_annotation_ap_stream(): correct AP stream with /BBox /Matrix /Resources
- AnnotationColor enum: Yellow/Green/Blue/Pink/Red with quad-point coordinates

pdfplumber/src/lib.rs: #[cfg(feature = "write")] pub mod write
pdfplumber/Cargo.toml: lopdf optional dep under [features] write

pdfplumber-cli/src/annotate_cmd.rs:
- `pdfplumber annotate <file> --highlight <page> <x0> <y0> <x1> <y1>`
- `pdfplumber annotate <file> --note <page> <x> <y> <text>`
- `pdfplumber annotate <file> --link <page> <x0> <y0> <x1> <y1> <uri>`
- --output <path> (default: <input>_annotated.pdf)
- --metadata title=T author=A subject=S keywords=K

Tests: 8 unit tests including incremental empty mutations (returns original), highlight serialization, link annotation structure, metadata update, annotation count propagation.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: jacob_cotten <jacobcotten@gmail.com>

- Precedence parens in pdfplumber-parse (3 sites)
- Lifetime elision in PagesIter
- Branch-specific clippy fixes as needed

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: jacob_cotten <jacobcotten@gmail.com>
Summary
Lane 10: PDF writing via incremental updates. Add annotations to existing PDFs without rewriting the file — preserving existing digital signatures.
Implementation
pdfplumber::write (feature: write):
- PdfWriter: collects page mutations, serialises as a valid PDF incremental update (§7.5.6)
- HighlightAnnotation: quad-point based highlight with color (Yellow/Cyan/Green/Pink/Custom RGB)
- TextAnnotation: sticky note popup with author, title, content, open state
- LinkAnnotation: URI link with border style options

Incremental update anatomy: original bytes untouched, only appended.
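As a rough illustration of that append-only layout, here is a minimal sketch using only the standard library. It is not the PR's `PdfWriter` (which, per the commit message, builds on lopdf); the function name `append_incremental_update` and all of its parameters are hypothetical. It assumes the caller already knows the catalog's object number, the previous xref offset, and the new `/Size`, and it does not show how a page's `/Annots` array would be updated.

```rust
/// Hypothetical helper (not the PR's API): appends one incremental update
/// section to `original` without modifying any existing byte.
fn append_incremental_update(
    original: &[u8],
    new_objects: &[(u32, &str)], // (object number, object body), generation 0
    root_obj: u32,               // object number of the existing /Root catalog
    prev_xref_offset: usize,     // byte offset of the previous xref section
    size: u32,                   // highest object number in the file, plus one
) -> Vec<u8> {
    // No mutations: hand back the original bytes unchanged.
    if new_objects.is_empty() {
        return original.to_vec();
    }

    let mut out = original.to_vec();
    if !out.ends_with(b"\n") {
        out.push(b'\n');
    }

    // 1. Append the new objects, recording the byte offset of each one.
    let mut offsets = Vec::with_capacity(new_objects.len());
    for (num, body) in new_objects {
        offsets.push((*num, out.len()));
        out.extend_from_slice(format!("{} 0 obj\n{}\nendobj\n", num, body).as_bytes());
    }

    // 2. Append an xref section that lists only the new objects.
    //    One subsection per object; every entry is exactly 20 bytes.
    let xref_offset = out.len();
    out.extend_from_slice(b"xref\n");
    for (num, offset) in &offsets {
        out.extend_from_slice(format!("{} 1\n{:010} {:05} n \n", num, offset, 0).as_bytes());
    }

    // 3. Append a trailer whose /Prev points at the previous xref section,
    //    then the new startxref offset and the end-of-file marker.
    out.extend_from_slice(
        format!(
            "trailer\n<< /Size {} /Root {} 0 R /Prev {} >>\nstartxref\n{}\n%%EOF\n",
            size, root_obj, prev_xref_offset, xref_offset
        )
        .as_bytes(),
    );
    out
}
```

Because the output always begins with the original bytes, any byte range covered by an existing signature is left intact; whether a given validator still accepts the document after the update is a separate question that this sketch does not address.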
- Pdf::write_incremental_bytes(): produces updated byte vector.
- Pdf::write_bytes(): full rewrite for changes incompatible with incremental updates.
- pdfplumber-cli annotate: new subcommand for adding annotations from the command line.

Key design decision
PDF coordinates are bottom-left origin in the spec; pdfplumber uses top-left. The writer converts automatically:
pdf_y = page_height - top_left_y

Callers use pdfplumber's coordinate system throughout.

🤖 Generated with Claude Code
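The coordinate flip described under "Key design decision" can be sketched as follows. The type `TopLeftBBox` and the functions `to_pdf_y` and `to_pdf_rect` are hypothetical names chosen for illustration, not identifiers from this PR. The sketch assumes an unrotated page whose MediaBox origin is at (0, 0).

```rust
/// Bounding box in top-left-origin coordinates (points), where `top` is the
/// distance from the top edge of the page and `bottom > top`.
/// Hypothetical type for illustration only.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TopLeftBBox {
    x0: f64,
    top: f64,
    x1: f64,
    bottom: f64,
}

/// Converts a top-left-origin y value to a bottom-left-origin PDF y value:
/// pdf_y = page_height - top_left_y.
fn to_pdf_y(page_height: f64, top_left_y: f64) -> f64 {
    page_height - top_left_y
}

/// Builds a PDF rectangle [llx, lly, urx, ury] from a top-left-origin bbox.
/// The lower edge comes from `bottom` and the upper edge from `top`,
/// because the vertical axis is flipped.
fn to_pdf_rect(page_height: f64, bbox: TopLeftBBox) -> [f64; 4] {
    [
        bbox.x0,
        to_pdf_y(page_height, bbox.bottom),
        bbox.x1,
        to_pdf_y(page_height, bbox.top),
    ]
}
```

For example, on a 792 pt tall page, a box with `top = 100` and `bottom = 120` maps to a PDF rectangle whose lower edge is at y = 672 and upper edge at y = 692; the x values pass through unchanged.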