Commit 6866fda

fix: use encoding in context class (#4030)
Because `_CsvPartitioningContext` already stores an `_encoding` attribute, that value was clearly meant to be read through the context. Behaviorally this change should be a no-op, but it supports future efforts in which the partitioning context applies its own internal logic to the encoding.
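
The intent described above can be illustrated with a minimal sketch. It is not the library's code: `CsvContextSketch` is a hypothetical stand-in for `_CsvPartitioningContext`, `functools.cached_property` stands in for the library's `lazyproperty`, the standard-library `csv` module stands in for `pd.read_csv`, and the UTF-8 fallback is an assumed example of the kind of "internal logic" the context might later apply.

```python
from __future__ import annotations

import csv
import io
from functools import cached_property


class CsvContextSketch:
    """Hypothetical stand-in for `_CsvPartitioningContext` (illustration only)."""

    def __init__(self, data: bytes, encoding: str | None = None):
        self._data = data
        self._encoding = encoding

    @cached_property  # stand-in for the library's `lazyproperty`
    def encoding(self) -> str:
        # Assumed example of internal logic: fall back to UTF-8 when no
        # encoding was supplied. The commit itself returns `_encoding` as-is.
        return self._encoding or "utf-8"

    def rows(self) -> list[list[str]]:
        # The reader asks the context for the encoding instead of using the
        # caller's raw argument, so any logic in `encoding` always applies.
        text = io.TextIOWrapper(
            io.BytesIO(self._data), encoding=self.encoding, newline=""
        )
        return list(csv.reader(text))


if __name__ == "__main__":
    ctx = CsvContextSketch("a,b\né,2\n".encode("latin-1"), encoding="latin-1")
    print(ctx.rows())
```

Routing every read through the property means a later change to how the encoding is resolved needs to touch only the context class, not each call site.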
1 parent 2aca876 commit 6866fda

3 files changed: +11 -5 lines changed

CHANGELOG.md

Lines changed: 4 additions & 3 deletions
@@ -1,17 +1,18 @@
-## 0.18.0
+## 0.18.1-dev0
 
 ### Enhancements
 
 ### Features
-- Upgraded Python version to 3.12
 
 ### Fixes
+- The `encoding` property of the `_CsvPartitioningContext` is now properly used.
 
-## 0.17.11-dev3
+## 0.18.0
 
 ### Enhancements
 
 ### Features
+- Upgraded Python version to 3.12
 
 ### Fixes
 - Fix type error when `result_file_type` is expected to be a `FileType` but is `None`

unstructured/__version__.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.18.0"  # pragma: no cover
+__version__ = "0.18.1-dev0"  # pragma: no cover

unstructured/partition/csv.py

Lines changed: 6 additions & 1 deletion
@@ -55,7 +55,7 @@ def partition_csv(
     )
 
     with ctx.open() as file:
-        dataframe = pd.read_csv(file, header=ctx.header, sep=ctx.delimiter, encoding=encoding)
+        dataframe = pd.read_csv(file, header=ctx.header, sep=ctx.delimiter, encoding=ctx.encoding)
 
     html_table = HtmlTable.from_html_text(
         dataframe.to_html(index=False, header=include_header, na_rep="")
@@ -135,6 +135,11 @@ def header(self) -> int | None:
         """Identifies the header row, if any, to Pandas, by idx."""
         return 0 if self._include_header else None
 
+    @lazyproperty
+    def encoding(self) -> str | None:
+        """The encoding to use for reading the file."""
+        return self._encoding
+
     @lazyproperty
     def last_modified(self) -> str | None:
         """The best last-modified date available, None if no sources are available."""
