Commit ae73cf8

fix: Fix pdfminer error when using process_data_with_model (#178)
When a PDF page doesn't contain much data, the write to the tempfile may sit in a buffer instead of reaching disk. If that happens, reading the file back by name fails with an error. This change fixes the error by flushing the temp file's buffer before the file is read.
1 parent 4b7276a commit ae73cf8
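
The buffering behaviour described in the commit message can be observed with a small standalone script. This is only an illustrative sketch, not code from the repository: the `payload` bytes and variable names are made up, and it assumes CPython's default buffered file objects, where a write much smaller than the buffer stays in memory until `flush()` is called.

```python
import os
import tempfile

# Hypothetical tiny payload, far smaller than the default write buffer.
payload = b"%PDF-1.4 tiny page"

with tempfile.NamedTemporaryFile() as tmp_file:
    tmp_file.write(payload)
    # The bytes are typically still in Python's userspace buffer here,
    # so anything that inspects the file by name sees an incomplete file.
    size_before_flush = os.path.getsize(tmp_file.name)

    tmp_file.flush()  # hand the buffered bytes to the operating system
    size_after_flush = os.path.getsize(tmp_file.name)

print(f"before flush: {size_before_flush} bytes, after flush: {size_after_flush} bytes")
```

A parser that opens `tmp_file.name` before the flush would see a truncated or empty file, which is consistent with the pdfminer error this commit addresses.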

3 files changed: +6 -1 lines changed

CHANGELOG.md

Lines changed: 4 additions & 0 deletions
@@ -1,3 +1,7 @@
+## 0.5.12
+
+* Fix a pdfminer error when using `process_data_with_model`
+
 ## 0.5.11
 
 * Add warning when chipper is used with < 300 DPI

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.5.11"  # pragma: no cover
+__version__ = "0.5.12"  # pragma: no cover

unstructured_inference/inference/layout.py

Lines changed: 1 addition & 0 deletions
@@ -373,6 +373,7 @@ def process_data_with_model(
     DocumentLayout by using a model identified by model_name."""
     with tempfile.NamedTemporaryFile() as tmp_file:
         tmp_file.write(data.read())
+        tmp_file.flush()  # Make sure the file is written out
         layout = process_file_with_model(
             tmp_file.name,
             model_name,
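
The write, flush, then read-by-name pattern from this hunk can be sketched in isolation as follows. The helper `process_stream_via_tempfile` and the stand-in consumer `read_back` are hypothetical names introduced for illustration; they are not part of `unstructured_inference`. The sketch also assumes a POSIX-like platform, since reopening a still-open `NamedTemporaryFile` by name may not work on Windows.

```python
import io
import tempfile
from typing import BinaryIO, Callable, TypeVar

T = TypeVar("T")


def process_stream_via_tempfile(data: BinaryIO, process_path: Callable[[str], T]) -> T:
    """Hypothetical helper: spill a stream to a temp file and process it by path."""
    with tempfile.NamedTemporaryFile() as tmp_file:
        tmp_file.write(data.read())
        # Without this flush, a small payload may still be buffered in memory
        # when process_path opens the file by name.
        tmp_file.flush()
        return process_path(tmp_file.name)


def read_back(path: str) -> bytes:
    """Stand-in for a path-based consumer such as a PDF parser."""
    with open(path, "rb") as f:
        return f.read()


result = process_stream_via_tempfile(io.BytesIO(b"small page"), read_back)
```

Flushing keeps the temp file open, so it is still cleaned up automatically when the `with` block exits; closing the file first would delete it under the default settings.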
