Commit 4b827f0

fix: local connector output filename when a single file is being processed (#879)
* fix string processing error for _output_filename
* Add docstring and type hint, update CHANGELOG, update version
* update test fixture
* simple code change commit to retrigger ci checks
* update test fixture - after brew install tesseract-lang
* Update ingest test fixtures (#882)

  Co-authored-by: ahmetmeleq <[email protected]>
* correct CHANGELOG
* correct CHANGELOG

---------

Co-authored-by: Unstructured-DevOps <[email protected]>
Co-authored-by: ahmetmeleq <[email protected]>
1 parent 24dad24 commit 4b827f0

5 files changed: 15 additions and 6 deletions

.pre-commit-config.yaml

Lines changed: 1 addition & 0 deletions
@@ -8,6 +8,7 @@ repos:
       - id: check-json
       - id: check-xml
       - id: end-of-file-fixer
+        exclude: \.json$
         include: \.py$
       - id: trailing-whitespace
       - id: mixed-line-ending

CHANGELOG.md

Lines changed: 2 additions & 1 deletion
@@ -1,4 +1,4 @@
-## 0.8.0-dev0
+## 0.8.0-dev1
 
 ### Enhancements
 
@@ -8,6 +8,7 @@
 
 ### Fixes
 * Fix KeyError when `isd_to_elements` doesn't find a type
+* Fix _output_filename for local connector, allowing single files to be written correctly to the disk
 
 ### BREAKING CHANGES
 

test_unstructured_ingest/expected-structured-output/local-single-file/example-docs/english-and-korean.png.json renamed to test_unstructured_ingest/expected-structured-output/local-single-file/english-and-korean.png.json

File renamed without changes.

unstructured/__version__.py

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-__version__ = "0.8.0-dev0" # pragma: no cover
+__version__ = "0.8.0-dev1" # pragma: no cover

unstructured/ingest/connector/local.py

Lines changed: 11 additions & 4 deletions
@@ -51,11 +51,18 @@ def get_file(self):
         pass
 
     @property
-    def _output_filename(self):
-        return (
-            Path(self.standard_config.output_dir)
-            / f"{self.path.replace(f'{self.config.input_path}/', '')}.json"
+    def _output_filename(self) -> Path:
+        """Returns output filename for the doc
+        If input path argument is a file itself, it returns the filename of the doc.
+        If input path argument is a folder, it returns the relative path of the doc.
+        """
+        input_path = Path(self.config.input_path)
+        basename = (
+            f"{Path(self.path).name}.json"
+            if input_path.is_file()
+            else f"{Path(self.path).relative_to(input_path)}.json"
         )
+        return Path(self.standard_config.output_dir) / basename
 
 
 class LocalConnector(BaseConnector):
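
To make the behavior change in `_output_filename` easier to see, here is a minimal sketch that lifts the old and new property bodies into standalone functions and runs them on a single-file input and a directory input. The helper names `old_output_filename` and `new_output_filename`, the `out` output directory, and the `nested/report.pdf` document are hypothetical and not part of the commit; the sketch also assumes a POSIX filesystem with `/` as the separator.

```python
import tempfile
from pathlib import Path


def old_output_filename(output_dir: str, input_path: str, doc_path: str) -> Path:
    """Pre-fix logic: strip '<input_path>/' from the doc path as a plain string."""
    return Path(output_dir) / f"{doc_path.replace(f'{input_path}/', '')}.json"


def new_output_filename(output_dir: str, input_path: str, doc_path: str) -> Path:
    """Post-fix logic: branch on whether the input path is a file or a folder."""
    input_path_obj = Path(input_path)
    basename = (
        f"{Path(doc_path).name}.json"
        if input_path_obj.is_file()
        else f"{Path(doc_path).relative_to(input_path_obj)}.json"
    )
    return Path(output_dir) / basename


# Build a throwaway input tree so that is_file() sees real files.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "example-docs"
    (docs / "nested").mkdir(parents=True)
    single = docs / "english-and-korean.png"
    single.touch()
    nested = docs / "nested" / "report.pdf"  # hypothetical second document
    nested.touch()

    # Case 1: the input path is the document itself.
    # The old code finds no '<input_path>/' prefix to strip, so the whole
    # input path leaks into the output name.
    old_single = old_output_filename("out", str(single), str(single))
    new_single = new_output_filename("out", str(single), str(single))

    # Case 2: the input path is a folder containing the document.
    old_nested = old_output_filename("out", str(docs), str(nested))
    new_nested = new_output_filename("out", str(docs), str(nested))

print("single file, old:", old_single)
print("single file, new:", new_single)
print("directory,   old:", old_nested)
print("directory,   new:", new_nested)
```

With a relative single-file input such as `example-docs/english-and-korean.png`, the old string replacement leaves the `example-docs/` prefix in place, which is consistent with the fixture rename in this commit. With an absolute input path, as in the sketch, `pathlib` discards the left operand when it is joined with an absolute path, so the old code would not place the result under the output directory at all.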
