fix(stream): return raw content for uncompressed streams in decompressed_content #483

Merged
J-F-Liu merged 1 commit into J-F-Liu:main from abimaelmartell:firecrawl/fix-uncompressed-stream on Mar 21, 2026
@abimaelmartell
Contributor

Summary

When a stream has no /Filter entry, decompressed_content() would fail because filters() propagates the dictionary lookup error. Unfiltered streams are already uncompressed, so the correct behavior is to return the raw content as-is.

This fixes text extraction from Form XObjects generated by pdfrw and similar tools that use uncompressed streams (no /Filter, just raw content like "/FullPage Do").
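The behavior described above can be sketched as follows. This is a self-contained illustration, not the library's actual code: the `Stream` struct, the `StreamError` enum, the simplified dictionary type, and the `decode_ascii_hex` helper are all hypothetical stand-ins, and `ASCIIHexDecode` is used only because it is small enough to decode inline. The point is the early return when `/Filter` is absent.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for a PDF stream object.
/// The dictionary is reduced to "key -> list of filter names".
#[derive(Debug, Clone)]
pub struct Stream {
    pub dict: HashMap<String, Vec<String>>,
    pub content: Vec<u8>,
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
pub enum StreamError {
    MissingKey(String),
    UnsupportedFilter(String),
    InvalidData,
}

impl Stream {
    /// Looks up /Filter and reports a missing key as an error,
    /// mirroring the lookup failure described in the summary.
    pub fn filters(&self) -> Result<&[String], StreamError> {
        self.dict
            .get("Filter")
            .map(Vec::as_slice)
            .ok_or_else(|| StreamError::MissingKey("Filter".to_string()))
    }

    /// Returns the decoded bytes. A stream without /Filter is already
    /// uncompressed, so its raw content is returned unchanged.
    pub fn decompressed_content(&self) -> Result<Vec<u8>, StreamError> {
        let filters = match self.filters() {
            Ok(f) => f,
            // No /Filter entry: nothing to decode.
            Err(StreamError::MissingKey(_)) => return Ok(self.content.clone()),
            Err(e) => return Err(e),
        };
        let mut data = self.content.clone();
        // Apply each filter in the order it is listed.
        for name in filters {
            data = match name.as_str() {
                "ASCIIHexDecode" => decode_ascii_hex(&data)?,
                other => return Err(StreamError::UnsupportedFilter(other.to_string())),
            };
        }
        Ok(data)
    }
}

/// Minimal ASCIIHexDecode: skips whitespace, stops at '>',
/// and pads an odd trailing digit with 0.
fn decode_ascii_hex(input: &[u8]) -> Result<Vec<u8>, StreamError> {
    let mut digits = Vec::new();
    for &b in input {
        if b == b'>' {
            break;
        }
        if b.is_ascii_whitespace() {
            continue;
        }
        let d = (b as char).to_digit(16).ok_or(StreamError::InvalidData)? as u8;
        digits.push(d);
    }
    if digits.len() % 2 == 1 {
        digits.push(0);
    }
    Ok(digits.chunks(2).map(|p| (p[0] << 4) | p[1]).collect())
}
```

In this sketch, a stream with an empty dictionary and content `/FullPage Do` still makes `filters()` return an error, but `decompressed_content()` hands back the same bytes instead of propagating it.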

Thanks!

@J-F-Liu merged commit e7eb667 into J-F-Liu:main on Mar 21, 2026
10 checks passed