Add structured PDF link support and MuPDF URI parsing#199
JustForFun88 wants to merge 3 commits into messense:main
Pull request overview
This PR adds comprehensive structured PDF link support to the mupdf-rs crate, enabling bidirectional conversion between MuPDF's URI strings and typed Rust structs. The implementation provides a high-level API for extracting and creating PDF link annotations while maintaining full compatibility with MuPDF's internal link representation.
Changes:
- Introduces a new `pdf::links` module with types `PdfLink`, `PdfAction`, `PdfDestination`, and `FileSpec` to represent PDF link annotations and actions
- Implements URI parsing (`parse_external_link`) to convert MuPDF URI strings into structured `PdfAction` variants
- Implements URI formatting (`Display` for `PdfAction`/`DestinationKind`) to serialize actions back to MuPDF-compatible URI strings
- Adds `PdfPage::pdf_links()` iterator for extracting typed link data from pages and `PdfPage::add_links()` for inserting link annotations
- Refactors `Matrix::invert()` to return `Option<Matrix>` instead of falling back to identity (based on PR #198)
- Moves `encode_into` from `Destination` to `DestinationKind` and adds `Default` implementation
- Fixes memory leak in `LinkIter` by adding `Drop` implementation (based on PR #200)
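
The last item, giving an iterator that owns a native list a `Drop` implementation, follows a common FFI pattern. Here is a minimal sketch of that pattern, not the crate's code: `Node`, `NodeIter`, `build_list`, and `live_nodes` are hypothetical names, and a `Box`-allocated list stands in for the list MuPDF hands back.

```rust
use std::ptr;
use std::sync::atomic::{AtomicUsize, Ordering};

// Counts nodes that are currently allocated, so the cleanup can be observed.
static LIVE_NODES: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for a natively allocated singly linked list node.
struct Node {
    value: u32,
    next: *mut Node,
}

impl Drop for Node {
    fn drop(&mut self) {
        LIVE_NODES.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Builds a heap-allocated list and returns the owning head pointer.
fn build_list(values: &[u32]) -> *mut Node {
    let mut head: *mut Node = ptr::null_mut();
    for &value in values.iter().rev() {
        LIVE_NODES.fetch_add(1, Ordering::SeqCst);
        head = Box::into_raw(Box::new(Node { value, next: head }));
    }
    head
}

/// Number of nodes that have been allocated and not yet freed.
fn live_nodes() -> usize {
    LIVE_NODES.load(Ordering::SeqCst)
}

/// Iterator that owns the whole list: `head` is kept for cleanup,
/// `cursor` tracks the iteration position.
struct NodeIter {
    head: *mut Node,
    cursor: *mut Node,
}

impl NodeIter {
    fn new(head: *mut Node) -> Self {
        Self { head, cursor: head }
    }
}

impl Iterator for NodeIter {
    type Item = u32;

    fn next(&mut self) -> Option<u32> {
        if self.cursor.is_null() {
            return None;
        }
        // SAFETY: `cursor` points at a live node owned by this iterator.
        let node = unsafe { &*self.cursor };
        self.cursor = node.next;
        Some(node.value)
    }
}

impl Drop for NodeIter {
    fn drop(&mut self) {
        // Free from `head`, not `cursor`, so a partially consumed
        // iterator still releases every node.
        let mut current = self.head;
        while !current.is_null() {
            // SAFETY: each node came from `Box::into_raw` and is freed once here.
            let node = unsafe { Box::from_raw(current) };
            current = node.next;
        }
    }
}
```

Without the `Drop` implementation, dropping a `NodeIter` would discard the pointers and leak every node, which is the kind of leak the bullet describes.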
Reviewed changes
Copilot reviewed 15 out of 15 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| src/pdf/links/mod.rs | Module entry point defining public types (PdfLink, PdfAction, PdfDestination, FileSpec) with Display implementation for URI formatting |
| src/pdf/links/extraction.rs | URI parsing logic to convert MuPDF URI strings to structured PdfAction types |
| src/pdf/links/build.rs | Annotation building logic to construct PDF link annotation dictionaries from PdfLink structs |
| src/pdf/links/tests_format.rs | Comprehensive tests for URI formatting (Display implementation) |
| src/pdf/links/tests_extraction.rs | Comprehensive tests for URI parsing with edge cases |
| src/pdf/links/tests_build.rs | Round-trip tests verifying write/read symmetry and MuPDF compatibility |
| src/pdf/page.rs | Page-level API with pdf_links() iterator and add_links() methods, including PdfLinkIter implementation |
| src/pdf/object.rs | Adds array_push_ref and dict_put_ref helpers and refactors dict_put to delegate to dict_put_ref |
| src/pdf/document.rs | Adds new_array_with_capacity, new_dict_with_capacity, load_pdf_page helpers and updates existing code to handle Matrix::invert() returning Option |
| src/rect.rs | Adds encode_into method for encoding rectangles into PDF arrays |
| src/destination.rs | Moves encode_into from Destination to DestinationKind, adds Default impl, and adds Display impl for URI fragment formatting |
| src/matrix.rs | Changes invert() to return Option<Matrix> instead of Matrix::IDENTITY on singular matrices |
| src/page.rs | Adds Drop implementation for LinkIter to fix memory leak |
| Cargo.toml | Adds percent-encoding = "2.3.1" dependency for URI encoding/decoding |
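
To illustrate the `src/matrix.rs` row, here is a hedged sketch of an affine-matrix inverse that reports a singular matrix with `None` instead of silently returning identity. The `Matrix` type below is a self-contained stand-in, not the crate's definition. The field layout (`x' = a*x + c*y + e`, `y' = b*x + d*y + f`) and the epsilon threshold are assumptions.

```rust
/// Stand-in 2D affine matrix with six coefficients.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Matrix {
    a: f32,
    b: f32,
    c: f32,
    d: f32,
    e: f32,
    f: f32,
}

impl Matrix {
    const IDENTITY: Matrix = Matrix { a: 1.0, b: 0.0, c: 0.0, d: 1.0, e: 0.0, f: 0.0 };

    /// Returns the inverse, or `None` when the determinant is (near) zero.
    fn invert(&self) -> Option<Matrix> {
        let det = self.a * self.d - self.b * self.c;
        if det.abs() < f32::EPSILON {
            return None; // singular: no inverse exists
        }
        let inv_det = 1.0 / det;
        let a = self.d * inv_det;
        let b = -self.b * inv_det;
        let c = -self.c * inv_det;
        let d = self.a * inv_det;
        Some(Matrix {
            a,
            b,
            c,
            d,
            // The translation is the negated original translation
            // pushed through the inverted linear part.
            e: -self.e * a - self.f * c,
            f: -self.e * b - self.f * d,
        })
    }
}
```

A caller that still wants the old fallback can write `m.invert().unwrap_or(Matrix::IDENTITY)`, which keeps that choice visible at the call site.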
```rust
impl Default for DestinationKind {
    fn default() -> Self {
        // This analogue of MuPDF's `fz_make_link_dest_none` function
```

Grammar: "This analogue" should be "This is an analogue" or "Analogue".

Suggested change:

```diff
-        // This analogue of MuPDF's `fz_make_link_dest_none` function
+        // This is an analogue of MuPDF's `fz_make_link_dest_none` function
```
```rust
pub(super) fn is_pdf_path(file_name: &str) -> bool {
    file_name
        .get(file_name.len().saturating_sub(4)..)
        .is_some_and(|extention| extention.eq_ignore_ascii_case(".pdf"))
```

Typo in variable name: "extention" should be "extension".

Suggested change:

```diff
-        .is_some_and(|extention| extention.eq_ignore_ascii_case(".pdf"))
+        .is_some_and(|extension| extension.eq_ignore_ascii_case(".pdf"))
```
```rust
// https://github.com/ArtifexSoftware/mupdf/blob/60bf95d09f496ab67a5e4ea872bdd37a74b745fe/source/pdf/pdf-link.c#L1325
dest.array_push_ref(dest_page_obj)?;

// MuPDF uses inv_ctm to transform coodinates
```

Typo in comment: "coodinates" should be "coordinates".

Suggested change:

```diff
-// MuPDF uses inv_ctm to transform coodinates
+// MuPDF uses inv_ctm to transform coordinates
```
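
The code comment under review refers to mapping Fitz page coordinates (origin top-left, y growing downward) back into PDF user space (origin bottom-left, y growing upward) with the inverse of the page transform. Below is a rough sketch of that mapping for a rectangle. The `Matrix` and `Rect` types and the `transform_rect` helper are stand-ins, not the crate's API. The page is assumed to be unrotated, unscaled, and 792 points tall.

```rust
/// Stand-in affine matrix: x' = a*x + c*y + e, y' = b*x + d*y + f.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Matrix {
    a: f32,
    b: f32,
    c: f32,
    d: f32,
    e: f32,
    f: f32,
}

/// Stand-in axis-aligned rectangle.
#[derive(Debug, Clone, Copy, PartialEq)]
struct Rect {
    x0: f32,
    y0: f32,
    x1: f32,
    y1: f32,
}

impl Matrix {
    fn transform_point(&self, x: f32, y: f32) -> (f32, f32) {
        (
            x * self.a + y * self.c + self.e,
            x * self.b + y * self.d + self.f,
        )
    }
}

/// Transforms all four corners and returns their bounding box, so the
/// result stays normalized (x0 <= x1, y0 <= y1) even when the matrix flips an axis.
fn transform_rect(rect: Rect, m: &Matrix) -> Rect {
    let corners = [
        m.transform_point(rect.x0, rect.y0),
        m.transform_point(rect.x1, rect.y0),
        m.transform_point(rect.x0, rect.y1),
        m.transform_point(rect.x1, rect.y1),
    ];
    let mut out = Rect { x0: corners[0].0, y0: corners[0].1, x1: corners[0].0, y1: corners[0].1 };
    for &(x, y) in &corners[1..] {
        out.x0 = out.x0.min(x);
        out.y0 = out.y0.min(y);
        out.x1 = out.x1.max(x);
        out.y1 = out.y1.max(y);
    }
    out
}

/// Inverse page transform for the assumed 792 pt page. A pure vertical
/// flip is its own inverse, so this equals the forward transform.
fn assumed_inv_ctm() -> Matrix {
    Matrix { a: 1.0, b: 0.0, c: 0.0, d: -1.0, e: 0.0, f: 792.0 }
}
```

Under these assumptions, a Fitz rectangle near the top of the page ends up near the top of PDF space as well, with y-values close to 792.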
This PR adds structured PDF link support, enabling parsing of MuPDF URI strings into typed Rust structs and the construction of PDF annotation dictionaries.
Based on #198 and #200.
What's new:
- `pdf::links` module with types `PdfLink`, `PdfAction`, `PdfDestination`, and `FileSpec`
- `parse_external_link`: converts MuPDF's URI strings back into structured action types
- `Display` for `PdfAction`/`DestinationKind`: serializes actions back to MuPDF-compatible URI strings
- `build_link_annotation`: constructs PDF link annotation dictionaries from `PdfLink` structs
- `PdfPage::pdf_links()` iterator for extracting typed link data from pages
- `PdfPage::insert_links()` for adding link annotations with Fitz-to-PDF coordinate transforms
- Moved `encode_into` from `Destination` to `DestinationKind` and added `Default` implementation

Testing:
- `PdfAction` string formatting.
- Round trip: `PdfAction` -> writing it into a PDF -> reading back via MuPDF -> verifying that:
  - the read-back `PdfAction` matches the original
  - the read-back link matches the `Display` output of the original `PdfAction`

P.S. Although the PR looks large (~4000 lines changed), it's mostly tests (60%) and docs. The actual logic changes are relatively small.
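
The format/parse symmetry these tests check can be sketched with a deliberately tiny model. `LinkAction` and `parse_link` are hypothetical names, not the PR's `PdfAction` or `parse_external_link`. The sketch also assumes a simplified `#page=N` fragment form with a one-based page number, and the real URI grammar handled by the PR is richer than this.

```rust
use std::fmt;

/// Hypothetical two-variant action type used only for this illustration.
#[derive(Debug, Clone, PartialEq)]
enum LinkAction {
    /// Jump to a page of the same document; the number is one-based,
    /// exactly as it appears in the fragment.
    GoToPage(u32),
    /// Anything that is not a page fragment is kept as an opaque URI.
    Uri(String),
}

impl fmt::Display for LinkAction {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            LinkAction::GoToPage(number) => write!(f, "#page={}", number),
            LinkAction::Uri(uri) => f.write_str(uri),
        }
    }
}

/// Parses the simplified string form back into an action.
/// Returns `None` for a page fragment whose number is not a valid `u32`.
fn parse_link(uri: &str) -> Option<LinkAction> {
    match uri.strip_prefix("#page=") {
        Some(number) => number.parse().ok().map(LinkAction::GoToPage),
        None => Some(LinkAction::Uri(uri.to_string())),
    }
}
```

A round-trip check then formats an action with `to_string()`, parses the result, and compares it with the original value. In this toy model a `Uri` whose text itself starts with `#page=` would not survive the trip, which is the kind of edge case such tests are meant to surface.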