Releases: ArtifexSoftware/pdf2docx

v0.5.9

14 Feb 03:57

  • Fixed bug when extracting tables with empty cells.
  • Updated requirements.txt to require PyMuPDF>=1.26.7
  • Updated setup.py python_requires to >=3.10

v0.5.8

22 Jan 17:23

v0.5.7

07 Jan 11:06

  • Added support for in-memory files such as BytesIO when reading a PDF or writing a docx: #108, #177, #223
  • Fixed tabulation issue when multiple tabs exist: #157
  • Fixed the "cannot find builtin font with name 'Arial'" error: #216, #235, #237, #241
  • Fixed import issue caused by python-docx 1.0.0: #233, #234
  • Fixed font name encoding issue: #194, #246
  • Fixed duplicated columns (xml) issue: #245
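
The in-memory support listed for this release can be sketched as below. This is a minimal sketch, not an official example: it assumes that `Converter` accepts the PDF content through a `stream` keyword and that `convert` accepts a file-like object in place of a docx path, so check both against the installed version. The helper name `pdf_bytes_to_docx_bytes` is hypothetical.

```python
import io


def pdf_bytes_to_docx_bytes(pdf_bytes: bytes) -> bytes:
    """Convert PDF content held in memory to docx content held in memory.

    Hypothetical helper; the `stream` keyword and the file-like target
    are assumptions based on the release note above.
    """
    # Imported lazily so this module still loads where pdf2docx is not installed.
    from pdf2docx import Converter

    out = io.BytesIO()
    cv = Converter(stream=pdf_bytes)  # read the PDF from memory instead of a path
    try:
        cv.convert(out)               # write the docx into the in-memory buffer
    finally:
        cv.close()
    return out.getvalue()
```

A caller would typically pass bytes received from an upload or a database and store or return the resulting docx bytes, without touching the file system.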

v0.5.6

11 Aug 03:06

  • Workaround for UnicodeDecodeError issue: #144, #155
  • Fixed table parsing (small images as dashed borders) issue: #138, #158
  • Improved table parsing (allow one-cell-table with shading): #149

v0.5.5

10 Jul 15:42

  • Fixed closePath issue for PyMuPDF 1.20+: #146, #147

v0.5.4

22 Jun 10:25

  • Ignore hidden text: #132
  • Fixed property names deprecated by PyMuPDF 1.20+: #139, #141, #143

v0.5.3

21 Feb 03:44

  • Enhanced section parsing
  • Enhanced table parsing: #105, #107
  • Enhanced vector graphic parsing: recursive XY cut
  • Upgraded to PyMuPDF>=1.19.0
  • Added support for Python 3.10
  • Fixed image parsing issues: #110, #125, #123
  • Fixed GUI import issue on non-GUI platforms: #118, #121
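
The recursive XY cut named in this release is a general page-segmentation technique: split a set of bounding boxes along any axis where a clear gap exists, then repeat inside each part. The following is a minimal standard-library sketch of that idea, not pdf2docx's implementation; the names `xy_cut`, `_split`, and `min_gap` are hypothetical.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1)


def _split(boxes: List[Box], axis: int, min_gap: float) -> List[List[Box]]:
    """Group boxes separated by a gap of at least min_gap along axis (0 = x, 1 = y)."""
    ordered = sorted(boxes, key=lambda b: b[axis])
    groups = [[ordered[0]]]
    reach = ordered[0][axis + 2]  # far edge covered by the current group
    for box in ordered[1:]:
        if box[axis] - reach >= min_gap:
            groups.append([box])  # clear gap: start a new group
            reach = box[axis + 2]
        else:
            groups[-1].append(box)
            reach = max(reach, box[axis + 2])
    return groups


def xy_cut(boxes: List[Box], min_gap: float = 1.0) -> List[List[Box]]:
    """Recursively split boxes into groups that no further gap can separate."""
    if len(boxes) <= 1:
        return [list(boxes)] if boxes else []
    for axis in (1, 0):  # try splitting along y first, then along x
        groups = _split(boxes, axis, min_gap)
        if len(groups) > 1:
            result: List[List[Box]] = []
            for group in groups:
                result.extend(xy_cut(group, min_gap))
            return result
    return [list(boxes)]  # no gap on either axis: one indivisible group


a, b, c, d = (0, 0, 10, 10), (20, 0, 30, 10), (0, 20, 30, 30), (5, 5, 12, 12)
regions = xy_cut([a, b, c, d])
# regions == [[a, d], [b], [c]]: a and d overlap, b sits to their right, c below.
```

Each recursive call works on a strictly smaller group, so the recursion always terminates.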

v0.5.2

30 May 16:38

  • New layout structure: Page -> Section -> Column -> blocks
  • Extended parsing scope: document -> page -> section
  • Corrected font names with font-tools
  • Changed from exact line spacing to relative line spacing
  • Added support for encrypted PDFs: #86
  • Added user interface: #88
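
The Page -> Section -> Column -> blocks hierarchy listed for this release can be pictured as nested containers. The dataclasses below are a hypothetical illustration of that nesting only; they are not pdf2docx's actual classes, and every name and field is an assumption.

```python
from dataclasses import dataclass, field
from typing import Iterator, List


@dataclass
class Block:
    text: str  # stand-in for a text, image, or table block


@dataclass
class Column:
    blocks: List[Block] = field(default_factory=list)


@dataclass
class Section:
    columns: List[Column] = field(default_factory=list)


@dataclass
class Page:
    sections: List[Section] = field(default_factory=list)

    def iter_blocks(self) -> Iterator[Block]:
        """Yield blocks in reading order: section by section, column by column."""
        for section in self.sections:
            for column in section.columns:
                yield from column.blocks


# A page with a one-column heading section followed by a two-column section.
page = Page(sections=[
    Section(columns=[Column([Block("Title")])]),
    Section(columns=[
        Column([Block("left 1"), Block("left 2")]),
        Column([Block("right 1")]),
    ]),
])
reading_order = [block.text for block in page.iter_blocks()]
```

Grouping columns under sections lets a single page mix one-column and multi-column regions while keeping a well-defined reading order.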

v0.5.1

15 Jan 09:23

v0.5.0

31 Dec 18:00

  • Extract PDF paths with the PyMuPDF (>=1.18.0) API
  • Support floating pictures
  • Enhanced paragraph alignment and vertical spacing
  • Added global settings for page parsing
  • Enhanced CLI commands