v1.2.0 - Content Ordering Based on Y-Coordinates

@shtse8 released this 31 Oct 18:11
· 46 commits to main since this release

Features

Content Ordering: Preserve exact text and image order based on Y-coordinates

  • Content items within each page are now sorted by their vertical position
  • Enables AI to see content in the same order as it appears in the PDF
  • Text and images are interleaved based on document layout
  • Example: page 1 may be returned as [text, image, text, image, image, text]
  • Uses PDF.js transform matrices to extract Y-coordinates
  • Automatically groups text items on the same line
  • Returns ordered content parts for optimal AI consumption
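
The ordering described above can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the ContentItem types, the orderContent and toTextContent helpers, and the 2-unit line tolerance are all assumptions made for the example. It relies only on the fact that a PDF.js text item exposes a six-element transform whose last two entries are the X and Y translation, and that PDF user space normally has its origin at the bottom-left, so a larger Y is higher on the page. Image positions are assumed to have been resolved already.

```typescript
// Hypothetical item shapes for illustration; the real PageContentItem may differ.
type TextContent = { type: "text"; x: number; y: number; text: string };
type ImageContent = { type: "image"; x: number; y: number; name: string };
type ContentItem = TextContent | ImageContent;

// Minimal view of a PDF.js text item: transform is [a, b, c, d, e, f],
// where e and f are the X and Y translation.
interface TextItemLike {
  str: string;
  transform: number[];
}

function toTextContent(item: TextItemLike): TextContent {
  return {
    type: "text",
    x: item.transform[4] ?? 0,
    y: item.transform[5] ?? 0,
    text: item.str,
  };
}

// Merge the text fragments of one line, left to right.
// Joining with a single space is a simplification.
function mergeLine(line: TextContent[]): TextContent {
  const sorted = [...line].sort((a, b) => a.x - b.x);
  return {
    type: "text",
    x: sorted[0]?.x ?? 0,
    y: sorted[0]?.y ?? 0,
    text: sorted.map((t) => t.text).join(" "),
  };
}

// Sort top to bottom, grouping text items whose Y values lie within
// lineTolerance of the first item on the current line (assumed threshold).
function orderContent(items: ContentItem[], lineTolerance = 2): ContentItem[] {
  const sorted = [...items].sort((a, b) => b.y - a.y); // larger Y first
  const ordered: ContentItem[] = [];
  let line: TextContent[] = [];
  let lineY = 0;

  for (const item of sorted) {
    const sameLine =
      item.type === "text" &&
      line.length > 0 &&
      Math.abs(lineY - item.y) <= lineTolerance;

    // Close the current line before starting a new line or emitting an image.
    if (!sameLine && line.length > 0) {
      ordered.push(mergeLine(line));
      line = [];
    }

    if (item.type === "image") {
      ordered.push(item);
    } else {
      if (line.length === 0) lineY = item.y;
      line.push(item);
    }
  }
  if (line.length > 0) ordered.push(mergeLine(line));
  return ordered;
}
```

In this sketch an image always ends the current text line, so two text fragments at the same height on either side of an image are emitted as separate items. Whether the release behaves the same way is not stated in these notes.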

Internal Changes

  • New extractPageContent() function combines text and image extraction with positioning
  • New PageContentItem interface tracks content type, position, and data
  • Handler updated to generate content parts in document-reading order
  • Improved error handling to return descriptive error messages as text content
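
To make the handler changes concrete, here is a hedged sketch of how already-ordered items might be turned into content parts, with failures reported as text. The field names on PageContentItem (yPosition, textContent, imageData), the helper names, and the error message wording are assumptions; the release notes only say that the interface tracks content type, position, and data. The part shapes follow the text and image content types used in MCP tool results.

```typescript
// Hypothetical shape: only "type, position, and data" is confirmed by the notes.
interface PageContentItem {
  type: "text" | "image";
  yPosition: number;
  textContent?: string;
  imageData?: { data: string; mimeType: string }; // base64-encoded payload
}

// Text and image parts as they appear in an MCP tool result.
type ContentPart =
  | { type: "text"; text: string }
  | { type: "image"; data: string; mimeType: string };

// Map items, assumed to be in reading order already, to content parts.
// Items without usable data are skipped.
function toContentParts(items: PageContentItem[]): ContentPart[] {
  const parts: ContentPart[] = [];
  for (const item of items) {
    if (item.type === "text" && item.textContent) {
      parts.push({ type: "text", text: item.textContent });
    } else if (item.type === "image" && item.imageData) {
      parts.push({
        type: "image",
        data: item.imageData.data,
        mimeType: item.imageData.mimeType,
      });
    }
  }
  return parts;
}

// Wrap extraction so a failure comes back as a descriptive text part
// instead of an unhandled exception. loadItems stands in for the real extractor.
function safeContentParts(loadItems: () => PageContentItem[]): ContentPart[] {
  try {
    return toContentParts(loadItems());
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    return [{ type: "text", text: `Error reading PDF: ${message}` }];
  }
}
```

Returning the error as a text part keeps the response shape uniform, so a client that only iterates over content parts still sees what went wrong.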

Code Quality

  • All tests passing (91 tests)
  • Coverage maintained at 97.76% statements, 90.95% branches
  • TypeScript strict mode compliance
  • Zero linting errors

Install

npm install -g @sylphx/pdf-reader-mcp
# or
pnpm add -g @sylphx/pdf-reader-mcp