Releases: PrinsFrank/pdfparser

v2.3.1 Minor parsing bugfixes, gz decode fallback

10 Aug 20:42
4133a51

What's Changed

Full Changelog: v2.3.0...v2.3.1

v2.3.0 More extensive image support: JBIG2, TIFF, CMYK, several other formats and bugfixes

09 Aug 14:23

What's Changed

  • DecodeParams can be array, Pattern can be string by @PrinsFrank in #159
  • Add support for array filters in images by @PrinsFrank in #160
  • Accept space after byte offset last crossReference source by @PrinsFrank in #161
  • Accept space after start crossReference source by @PrinsFrank in #162
  • Accept reference value for MediaBox and CropBox by @PrinsFrank in #163
  • Scale and font size can both be negative by @PrinsFrank in #165
  • Increase logging for incorrect matrix transformation by @PrinsFrank in #166
  • Add support for padding and extra whitespace in reference array values by @PrinsFrank in #167
  • Add support for TIFF images (CCITTFaxDecode) by @PrinsFrank in #169
  • DecodeParams cannot be plain array by @PrinsFrank in #170
  • Accept subdictionary for decodeParams by @PrinsFrank in #171
  • Dictionary arrays can contain null by @PrinsFrank in #173
  • Properly support array filter types when retrieving image content by @PrinsFrank in #174
  • Properly handle newlines in CIDFonts widths by @PrinsFrank in #175
  • Add support for JBIG2 images by @PrinsFrank in #176
  • Add FillSignData as valid TypeNameValue by @PrinsFrank in #182
  • Accept scale values that are floats as well as ints by @PrinsFrank in #184
  • Recover from invalid byte offset last cross reference section by looking for stream/table markers at the end of the document instead by @PrinsFrank in #186
  • Add support for PNG predictor algorithm None by @PrinsFrank in #187
  • Add support for CMYK rasterized images by @PrinsFrank in #189
  • Handle spaces between components of matrix transformation in graphics state operator by @PrinsFrank in #192
  • CONTENTS can be a ReferenceValueArray, not a simple ArrayValue by @PrinsFrank in #193
  • Don't parse characters in resource names as operators by @PrinsFrank in #195
  • Array values can be multiple resource names not separated by spaces by @PrinsFrank in #196

Full Changelog: v2.2.1...v2.3.0

v2.2.1 Several small parsing related bugfixes, add JPEG2000 support

21 Jul 18:29
82c5e25

What's Changed

  • Simplify working with underlying positioned text elements by moving retrieval of fonts to PositionedTextElement and presorting positionedTextElements by @PrinsFrank in #136
  • Properly handle newline between dictionary key and value by @PrinsFrank in #141
  • Use the length from a crossReferenceStream when it is available by @PrinsFrank in #143
  • Don't try to decode values with JPX_DECODE filter by @PrinsFrank in #145

Full Changelog: v2.2.0...v2.2.1

v2.2.0 Add support for extraction of Rasterized Images

20 Jul 14:09
1d44968

What's Changed

  • Delete stale workflow to prevent issues from being marked as stale by @PrinsFrank in #130
  • Implement rasterized image extraction by @PrinsFrank in #132
  • Add support for other color spaces that are not DeviceColorSpaces but are stored as lookup tables (LUTs) in objects by @PrinsFrank in #134

Full Changelog: v2.1.1...v2.2.0

v2.1.1 Fixes bug when parsing PDFs where trailer marker is not followed by newline

13 Jun 18:20
41724b1

What's Changed

Full Changelog: v2.1.0...v2.1.1

v2.1.0 Official PDF 2.0 support, classes now non-final to allow extension and mocking

28 May 17:36
6e345cb

What's Changed

Full Changelog: v2.0.0...v2.1.0

v2.0.0 Multiline/positional text extraction, image extraction, many bugfixes

19 May 18:22
921efdd

What's Changed

  • More strictly parse dictionary arrays by @PrinsFrank in #53
  • Add new PDF2.0 tabs nameValues by @PrinsFrank in #54
  • Fix issue where nested arrays in dictionaries were closed too early by @PrinsFrank in #56
  • Fix comment state in dictionary parsing by @PrinsFrank in #58
  • Fix nesting issues in text operator parsing by @PrinsFrank in #60
  • Correctly parse dictionary array values where items are separated over several lines by @PrinsFrank in #62
  • Add missing public key security handlers to FilterNameValue by @PrinsFrank in #65
  • Add support for multiple codespaceranges by @PrinsFrank in #67
  • Remove extra decoding in uncompressed object by @PrinsFrank in #68
  • Allow dashes in font names by @PrinsFrank in #70
  • Fix issues where font operator contains font names by @PrinsFrank in #72
  • Rename TextObjects and collection to contentStream to better reflect expected content by @PrinsFrank in #73
  • Tc operator sets char space not char size by @PrinsFrank in #74
  • Keep track of content stream commands outside text objects by @PrinsFrank in #75
  • Remove dependency to internal phpunit method and ignore non-testing tests by @PrinsFrank in #76
  • Organize content stream classes by @PrinsFrank in #77
  • Organize text operators by @PrinsFrank in #78
  • Parse textObjects to intermediate PositionedTextElement by @PrinsFrank in #79
  • Implement remaining content stream operators by @PrinsFrank in #81
  • Fix false positives on content stream commands by @PrinsFrank in #82
  • Implement a transformation state stack to fix text lines appearing out of order by @PrinsFrank in #83
  • Fix a typo in CONTRIBUTING by @szepeviktor in #84
  • Fix content stream unit test after matrix multiplication fix resulting in correct x and y offsets by @PrinsFrank in #85
  • Add test for compressed byte offsets by @PrinsFrank in #87
  • Parse CIDFontWidths in dictionary values by @PrinsFrank in #89
  • Implement space insertion based on text width by @PrinsFrank in #90
  • Update samples dependency and increase space insertion threshold by @PrinsFrank in #91
  • Don't try to parse EMCs and other content stream content as dictionaries by @PrinsFrank in #92
  • Don't continue matching operators when in dictionary key or escaped string by @PrinsFrank in #94
  • Implement encryption detection by @PrinsFrank in #97
  • Use object length instead of searching for end marker when set in dictionary by @PrinsFrank in #95
  • Add missing tests for inMemoryStream by @PrinsFrank in #98
  • Implement missing features for literal string escape sequences by @PrinsFrank in #104
  • Implement toUnicodeCMap unicode mappings for string literals by @PrinsFrank in #106
  • Fix issue where string literals ended up as part of multibyte character groups by @PrinsFrank in #110
  • Fix issue when text state is set outside of text object by @PrinsFrank in #112
  • When in a string literal, don't keep track of other nesting levels as that is not possible, fixes several array delimiter issues in string literals by @PrinsFrank in #113
  • Add missing resource TypeNameValue by @PrinsFrank in #114
  • Fix: mb_convert_encoding can output false by @PrinsFrank in #115
  • Implement image extraction by @PrinsFrank in #118

New Contributors

Full Changelog: v1.1.0...v2.0.0

v2.0.0 Alpha 5

16 May 19:35
921efdd

Pre-release

What's Changed

Full Changelog: v2.0.0-alpha.4...v2.0.0-alpha.5

v2.0.0 Alpha 4

15 May 19:25
6fddbb3

Pre-release

What's Changed

  • Fix issue where string literals ended up as part of multibyte character groups by @PrinsFrank in #110
  • Fix issue when text state is set outside of text object by @PrinsFrank in #112
  • When in a string literal, don't keep track of other nesting levels as that is not possible, fixes several array delimiter issues in string literals by @PrinsFrank in #113
  • Add missing resource TypeNameValue by @PrinsFrank in #114
  • Fix: mb_convert_encoding can output false by @PrinsFrank in #115

Full Changelog: v2.0.0-alpha.3...v2.0.0-alpha.4

v2.0.0 Alpha 3

07 May 18:59
38e2a46

Pre-release

What's Changed

  • Implement toUnicodeCMap unicode mappings for string literals by @PrinsFrank in #106

Full Changelog: v2.0.0-alpha.2...v2.0.0-alpha.3