Commit 555b95b

Fixing test for unstructured-api (#425)
Ran into an error in the tests for unstructured-api (see below for output). Somewhere along the line, a txt file was being read as bytes, and PARAGRAPH_PATTERN (a str pattern) could not be matched against the bytes content.
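As an illustration of the failure mode described above (not the original test output), here is a minimal sketch. The pattern below is a hypothetical stand-in, since the real `PARAGRAPH_PATTERN` is not shown in this commit; the point is only that Python's `re` module refuses to apply a `str` pattern to a `bytes` object.

```python
import re

# Hypothetical stand-in for PARAGRAPH_PATTERN; the real pattern is not shown here.
PARAGRAPH_PATTERN = r"\n\s*\n"

# Content as it would look after reading a file opened in "rb" mode.
raw = b"First paragraph.\n\nSecond paragraph."

# Applying a str pattern to bytes raises TypeError.
try:
    re.split(PARAGRAPH_PATTERN, raw)
    error = None
except TypeError as exc:
    error = exc

# Decoding first gives the regex a str to work on.
decoded_parts = re.split(PARAGRAPH_PATTERN, raw.decode("utf-8"))
```

Decoding before matching keeps the pattern itself unchanged, which appears to be the approach this commit takes.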
1 parent 533241c commit 555b95b

4 files changed: +21 -1 lines changed

CHANGELOG.md

Lines changed: 10 additions & 0 deletions
@@ -1,3 +1,13 @@
+## 0.5.9
+
+### Enhancements
+
+### Features
+
+### Fixes
+
+* Convert file to str in helper `split_by_paragraph` for `partition_text`
+
 ## 0.5.8
 
 ### Enhancements

test_unstructured/partition/test_text.py

Lines changed: 8 additions & 0 deletions
@@ -50,6 +50,14 @@ def test_partition_text_from_file():
     assert elements == EXPECTED_OUTPUT
 
 
+def test_partition_text_from_bytes_file():
+    filename = os.path.join(DIRECTORY, "..", "..", "example-docs", "fake-text.txt")
+    with open(filename, "rb") as f:
+        elements = partition_text(file=f)
+    assert len(elements) > 0
+    assert elements == EXPECTED_OUTPUT
+
+
 def test_partition_text_from_text():
     filename = os.path.join(DIRECTORY, "..", "..", "example-docs", "fake-text.txt")
     with open(filename) as f:

unstructured/__version__.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.5.8"  # pragma: no cover
+__version__ = "0.5.9"  # pragma: no cover

unstructured/partition/text.py

Lines changed: 2 additions & 0 deletions
@@ -58,6 +58,8 @@ def partition_text(
 
     elif file is not None:
         file_text = file.read()
+        if isinstance(file_text, bytes):
+            file_text = file_text.decode(encoding)
 
     elif text is not None:
         file_text = str(text)
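
The two added lines can be exercised in isolation with a small sketch. The helper name `read_file_text` and the `"utf-8"` default are assumptions made for illustration; in the commit the check lives inline in `partition_text` and uses that function's own `encoding` value.

```python
import io


def read_file_text(file, encoding="utf-8"):
    """Hypothetical helper mirroring the check added in this commit.

    Reads a file-like object and returns str whether it was opened in
    text or binary mode. The utf-8 default is an assumption.
    """
    file_text = file.read()
    if isinstance(file_text, bytes):
        # Binary-mode handles return bytes; decode so str regexes can be used.
        file_text = file_text.decode(encoding)
    return file_text


# A binary stream and a text stream should yield the same str.
text_from_bytes = read_file_text(io.BytesIO("café\n\nmenu".encode("utf-8")))
text_from_str = read_file_text(io.StringIO("café\n\nmenu"))
```

Checking the type of what `read()` returns, rather than inspecting the handle's mode, also covers file-like objects that have no `mode` attribute.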
