v2.0.0 Multiline/positional text extraction, image extraction, many bugfixes
What's Changed
- More strictly parse dictionary arrays by @PrinsFrank in #53
- Add new PDF 2.0 tabs nameValues by @PrinsFrank in #54
- Fix issue where nested arrays in dictionaries were closed too early by @PrinsFrank in #56
- Fix comment state in dictionary parsing by @PrinsFrank in #58
- Fix nesting issues in text operator parsing by @PrinsFrank in #60
- Correctly parse dictionary array values where items are separated over several lines by @PrinsFrank in #62
- Add missing public key security handlers to FilterNameValue by @PrinsFrank in #65
- Add support for multiple codespaceranges by @PrinsFrank in #67
- Remove extra decoding in uncompressed object by @PrinsFrank in #68
- Allow dashes in font names by @PrinsFrank in #70
- Fix issues where font operator contains font names by @PrinsFrank in #72
- Rename TextObjects and collection to contentStream to better reflect expected content by @PrinsFrank in #73
- Tc operator sets char space not char size by @PrinsFrank in #74
- Keep track of content stream commands outside text objects by @PrinsFrank in #75
- Remove dependency on internal PHPUnit method and ignore non-testing tests by @PrinsFrank in #76
- Organize content stream classes by @PrinsFrank in #77
- Organize text operators by @PrinsFrank in #78
- Parse textObjects to intermediate PositionedTextElement by @PrinsFrank in #79
- Implement remaining content stream operators by @PrinsFrank in #81
- Fix false positives on content stream commands by @PrinsFrank in #82
- Implement a transformation state stack to fix text lines appearing out of order by @PrinsFrank in #83
- Fix a typo in CONTRIBUTING by @szepeviktor in #84
- Fix content stream unit test after the matrix multiplication fix, which resulted in correct x and y offsets by @PrinsFrank in #85
- Add test for compressed byte offsets by @PrinsFrank in #87
- Parse CIDFontWidths in dictionary values by @PrinsFrank in #89
- Implement space insertion based on text width by @PrinsFrank in #90
- Update samples dependency and increase space insertion threshold by @PrinsFrank in #91
- Don't try to parse EMCs and other content stream content as dictionaries by @PrinsFrank in #92
- Don't continue matching operators when in dictionary key or escaped string by @PrinsFrank in #94
- Implement encryption detection by @PrinsFrank in #97
- Use object length instead of searching for end marker when set in dictionary by @PrinsFrank in #95
- Add missing tests for inMemoryStream by @PrinsFrank in #98
- Implement missing features for literal string escape sequences by @PrinsFrank in #104
- Implement toUnicodeCMap unicode mappings for string literals by @PrinsFrank in #106
- Fix issue where string literals ended up as part of multibyte character groups by @PrinsFrank in #110
- Fix issue when text state is set outside of text object by @PrinsFrank in #112
- Don't keep track of other nesting levels when in a string literal, as nesting is not possible there; fixes several array delimiter issues in string literals by @PrinsFrank in #113
- Add missing resource TypeNameValue by @PrinsFrank in #114
- Fix: mb_convert_encoding can output false by @PrinsFrank in #115
- Implement image extraction by @PrinsFrank in #118
New Contributors
- @szepeviktor made their first contribution in #84
Full Changelog: v1.1.0...v2.0.0