Commit 8317021

fix: remove parent_id until backwards compat bug is addressed (#252)
Original issue: #237
Core library fix: Unstructured-IO/unstructured#1526

Anyone who calls `partition_via_api` will hit this bug until they upgrade `unstructured`. This includes any Langchain users of `UnstructuredAPIFileLoader`.

The immediate fix is to remove `parent_id` from the hosted API's responses. Next, we can ensure that [langchain users](langchain-ai/langchain#11025) are up to date. Finally, the core library fix above will address any new fields going forward. It will be safe to re-add `parent_id` once users are generally on `unstructured>=0.10.15`.
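For readers unfamiliar with the failure mode, the following is a minimal sketch of how an unknown field such as `parent_id` can break an older client. It assumes the client builds its metadata object by passing the response dict straight through as keyword arguments. `OldClientMetadata`, `strict_from_dict`, and `tolerant_from_dict` are hypothetical names used for illustration only; they are not the `unstructured` API.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class OldClientMetadata:
    """Hypothetical stand-in for a metadata class in an older client install."""

    filename: Optional[str] = None
    page_number: Optional[int] = None


def strict_from_dict(data: Dict[str, Any]) -> OldClientMetadata:
    # Passes every key through as a keyword argument, so any field the
    # class does not know about raises TypeError.
    return OldClientMetadata(**data)


def tolerant_from_dict(data: Dict[str, Any]) -> OldClientMetadata:
    # Keeps only the keys the class declares and silently drops the rest.
    known = {f.name for f in fields(OldClientMetadata)}
    return OldClientMetadata(**{k: v for k, v in data.items() if k in known})


# A response dict that carries a field the old client has never seen.
api_payload = {"filename": "a.pdf", "page_number": 1, "parent_id": "abc123"}

strict_error = None
try:
    strict_from_dict(api_payload)
except TypeError as exc:
    strict_error = str(exc)

tolerant = tolerant_from_dict(api_payload)
```

In this sketch the strict loader fails on the new key while the tolerant loader ignores it, which is roughly the difference between a client that breaks on new fields and one that does not.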
1 parent b1a6cae commit 8317021

2 files changed: +7 -1 lines changed

CHANGELOG.md

Lines changed: 2 additions & 1 deletion

@@ -1,6 +1,7 @@
-## 0.0.47-dev0
+## 0.0.47-dev1
 
 * **Adds `chunking_strategy` kwarg and associated params** These params allow users to "chunk" elements into larger or smaller `CompositeElement`s
+* **Remove `parent_id` from the element metadata**. New metadata fields are causing errors with existing installs. We'll readd this once a fix is widely available.
 
 ## 0.0.46
 
prepline_general/api/general.py

Lines changed: 5 additions & 0 deletions

@@ -436,6 +436,11 @@ def pipeline_api(
         if element.metadata.detection_class_prob:
             elements[i].metadata.detection_class_prob = None
 
+        # Remove this until new md fields aren't breaking users
+        # See https://github.com/Unstructured-IO/unstructured/pull/1526
+        if element.metadata.parent_id:
+            elements[i].metadata.parent_id = None
+
     if response_type == "text/csv":
         df = convert_to_dataframe(elements)
         return df.to_csv(index=False)
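The change above follows the same pattern as the existing `detection_class_prob` handling: set the field to `None` on each element before the response is serialized. A self-contained sketch of that pattern is shown here for illustration. `scrub_metadata` and `FIELDS_TO_CLEAR` are hypothetical names, and `SimpleNamespace` objects stand in for real elements; this is not the code in `general.py`.

```python
from types import SimpleNamespace
from typing import Iterable, List, Sequence

# Hypothetical list of metadata fields to blank out before responding.
FIELDS_TO_CLEAR: Sequence[str] = ("detection_class_prob", "parent_id")


def scrub_metadata(elements: List, field_names: Iterable[str] = FIELDS_TO_CLEAR) -> List:
    """Set each named metadata field to None wherever it holds a truthy value."""
    names = tuple(field_names)
    for element in elements:
        for name in names:
            # getattr with a default tolerates metadata objects that
            # lack the field entirely.
            if getattr(element.metadata, name, None):
                setattr(element.metadata, name, None)
    return elements


# Stand-in elements: only the attributes this sketch touches are modeled.
elements = [
    SimpleNamespace(
        text="Title",
        metadata=SimpleNamespace(parent_id=None, detection_class_prob=None),
    ),
    SimpleNamespace(
        text="Body",
        metadata=SimpleNamespace(parent_id="abc123", detection_class_prob=0.9),
    ),
]

scrubbed = scrub_metadata(elements)
```

Keeping the field names in one tuple would make it a one-line change to re-add `parent_id` later, though the commit itself uses an explicit `if` per field.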
