fix: accept binary bytes on the PDF header line in non strict mode by rth · Pull Request #481 · J-F-Liu/lopdf

rth · 2026-03-19T07:13:45Z

Closes #480

Only capture version-like characters (digits and '.') in the header, then skip any remaining bytes on the line. This matches the approach used by pdf.js (read until whitespace or 7 chars max) and qpdf (regex for digits only).

Also added more unit tests for parsing various PDF headers I saw in the dataset I was working on.

In lenient mode (default), only capture version digits from the header line, skipping any trailing binary marker bytes that some generators (e.g. ImageMill) place before the newline. In strict mode, reject headers with trailing bytes after the version string.
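As a rough illustration of the behavior described above, here is a minimal sketch, not the code from this PR. The function name `parse_header_version` and the `strict` flag are hypothetical and do not correspond to lopdf's actual API. The sketch also assumes that any byte other than `\r` or `\n` after the version counts as a trailing byte.

```rust
/// Hypothetical helper (not lopdf's real parser): extract the version from a
/// PDF header line such as `%PDF-1.4`.
///
/// Lenient mode (`strict == false`) ignores any bytes between the version and
/// the end of the line. Strict mode rejects a header that has such bytes.
fn parse_header_version(input: &[u8], strict: bool) -> Option<String> {
    const PREFIX: &[u8] = b"%PDF-";
    let rest = input.strip_prefix(PREFIX)?;

    // Capture only version-like bytes: ASCII digits and '.'.
    let version_len = rest
        .iter()
        .take_while(|b| b.is_ascii_digit() || **b == b'.')
        .count();
    if version_len == 0 {
        return None;
    }
    let version = std::str::from_utf8(&rest[..version_len]).ok()?.to_string();

    // Count whatever is left on the header line before the line ending.
    let trailing_len = rest[version_len..]
        .iter()
        .take_while(|b| **b != b'\n' && **b != b'\r')
        .count();

    // Assumption: strict mode treats any trailing byte as an error.
    if strict && trailing_len > 0 {
        return None;
    }
    Some(version)
}
```

With this sketch, a header like `%PDF-1.4` followed by binary marker bytes yields `Some("1.4")` in lenient mode and `None` in strict mode, while a clean `%PDF-1.4` line is accepted in both.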

rth · 2026-03-21T13:52:42Z

@J-F-Liu this should be ready for a review.

This was referenced Mar 19, 2026

Accept binary bytes on the PDF header line #480

Closed

add LoadOptions and Document::load_with_options #482

Merged

rth force-pushed the fix-header-binary-bytes branch from 1f104cd to d56765e on March 21, 2026 13:27

rth force-pushed the fix-header-binary-bytes branch from d56765e to a52c7f9 on March 21, 2026 13:31

rth mentioned this pull request Mar 21, 2026

Better support for the non strict mode #484

Open

rth changed the title ~~fix(parser): accept binary bytes on the PDF header line~~ fix: accept binary bytes on the PDF header line in non strict mode Mar 21, 2026

J-F-Liu merged commit 0526740 into J-F-Liu:main Mar 22, 2026
10 checks passed

J-F-Liu merged 1 commit into J-F-Liu:main from rth:fix-header-binary-bytes
