Open
Description
Summary
Add an optional PDF-focused enhancement to the GUI wrapper so PDF conversions can preserve extracted images, preview them before save, and optionally use a PyMuPDF-based PDF pipeline while keeping MarkItDown as the default behavior.
Scope
- Keep the GUI as a wrapper of MarkItDown by default.
- Add a PDF pipeline toggle: `markitdown` or `pymupdf`.
- Add optional PDF image preservation in Markdown.
- Support asset layouts `separate` and `single`, independent from combined/separate Markdown save mode.
- Preserve preview/save flows and batch conversion behavior.
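A minimal sketch of how the pipeline toggle and asset layout could be routed, to make the scope above concrete. All names here (`PdfPipeline`, `AssetLayout`, `choose_pipeline`, `asset_dir`, and the folder names) are hypothetical and not taken from the existing code. The sketch also assumes that `separate` means one asset folder per source document and `single` means one shared folder, which the issue does not spell out.

```python
from enum import Enum
from pathlib import Path


class PdfPipeline(Enum):
    MARKITDOWN = "markitdown"  # default behavior
    PYMUPDF = "pymupdf"        # optional PDF-only pipeline


class AssetLayout(Enum):
    SEPARATE = "separate"  # assumed: one asset folder per source document
    SINGLE = "single"      # assumed: one shared asset folder for the batch


def choose_pipeline(source: Path, pdf_pipeline: PdfPipeline) -> PdfPipeline:
    """Route non-PDF inputs to MarkItDown; the toggle only affects PDFs."""
    if source.suffix.lower() != ".pdf":
        return PdfPipeline.MARKITDOWN
    return pdf_pipeline


def asset_dir(output_root: Path, source: Path, layout: AssetLayout) -> Path:
    """Pick the folder for extracted images (folder names are placeholders)."""
    if layout is AssetLayout.SINGLE:
        return output_root / "assets"
    return output_root / f"{source.stem}_assets"
```

Keeping the asset layout as its own setting, separate from the Markdown save mode, mirrors the requirement that the two choices stay independent.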
Acceptance Criteria
- Non-PDF formats keep existing behavior.
- PDFs keep existing behavior when image preservation is disabled.
- PDFs can preserve extracted images as files and link them from Markdown.
- The rendered preview can resolve extracted PDF image assets before save.
- The `pymupdf` pipeline can place extracted images near the closest preceding text block on a best-effort basis.
- Combined and separate save modes both rewrite image links correctly.
- Test coverage includes conversion routing, runtime UI paths, real generated PDFs, and packaging smoke coverage.
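The "closest preceding text block" criterion could be approximated as follows. This is a sketch only: `TextBlock`, `ImageRef`, and `interleave` are hypothetical names, and it assumes each text block and image on a page carries a top `y` coordinate (smaller means higher on the page), as a PDF layout extractor would typically provide. It does not call PyMuPDF itself.

```python
from dataclasses import dataclass


@dataclass
class TextBlock:
    y: float   # top coordinate of the block on the page
    text: str


@dataclass
class ImageRef:
    y: float   # top coordinate of the image on the page
    path: str  # relative link to the extracted image file


def interleave(blocks: list[TextBlock], images: list[ImageRef]) -> str:
    """Emit Markdown with each image after the closest preceding text block."""
    ordered = sorted(blocks, key=lambda b: b.y)
    # Index i collects images that follow block i; -1 means "before any text".
    after: dict[int, list[ImageRef]] = {}
    for img in sorted(images, key=lambda i: i.y):
        idx = -1
        for i, block in enumerate(ordered):
            if block.y <= img.y:
                idx = i
            else:
                break
        after.setdefault(idx, []).append(img)

    parts = [f"![]({img.path})" for img in after.get(-1, [])]
    for i, block in enumerate(ordered):
        parts.append(block.text)
        parts.extend(f"![]({img.path})" for img in after.get(i, []))
    return "\n\n".join(parts)
```

Images above the first text block fall back to the top of the output, which is one reasonable reading of "best-effort"; multi-column layouts would need a smarter ordering than a single `y` sort.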