v0.0.24

@ali6parmak released this 11 Aug 11:13 · 62 commits to main since this release

What's Changed

Support for PDF-to-markdown and PDF-to-HTML:

  • Different sizes of titles

  • Superscripts/Subscripts

  • Bold/Italic text

  • Tables in HTML format

  • Formulas in LaTeX format

  • List items with different indentations

  • Hyperlinks

  • In-document references

  • Pictures

  • Table of contents information (optional with extract_toc parameter)

Other changes:

  • Restructured and refactored the whole project to follow a clean architecture

  • Updated the formula extraction model to a better one

  • Updated the table extraction model to a better and much faster one

Full Changelog: v0.0.23...v0.0.24