Commit 387e71d

Open source: remove attachment_partitioner from partition_msg and partition_email (#261)
1 parent a473e2e commit 387e71d

1 file changed: +4 -12 lines changed

open-source/core-functionality/partitioning.mdx

Lines changed: 4 additions & 12 deletions
@@ -229,17 +229,13 @@ elements = partition_email(text=text, include_headers=True)
 
 `partition_email` includes a `max_partition` parameter that indicates the maximum character length for a document element. This parameter only applies if `"text/plain"` is selected as the `content_source`. The default value is `1500`, which roughly corresponds to the average character length for a paragraph. You can disable `max_partition` by setting it to `None`.
 
-You can optionally partition e-mail attachments by setting `process_attachments=True`. If you set `process_attachments=True`, you’ll also need to pass in a partitioning function to `attachment_partitioner`. The following is an example of what the workflow looks like:
+You can optionally partition e-mail attachments by setting `process_attachments=True`. The following is an example of what the workflow looks like:
 
 ```python
-from unstructured.partition.auto import partition
 from unstructured.partition.email import partition_email
 
 filename = "example-docs/eml/fake-email-attachment.eml"
-elements = partition_email(
-    filename=filename, process_attachments=True, attachment_partitioner=partition
-)
-
+elements = partition_email(filename=filename, process_attachments=True)
 ```
 
 
@@ -377,17 +373,13 @@ elements = partition_msg(filename="example-docs/fake-email.msg")
 
 `partition_msg` includes a `max_partition` parameter that indicates the maximum character length for a document element. This parameter only applies if `"text/plain"` is selected as the `content_source`. The default value is `1500`, which roughly corresponds to the average character length for a paragraph. You can disable `max_partition` by setting it to `None`.
 
-You can optionally partition e-mail attachments by setting `process_attachments=True`. If you set `process_attachments=True`, you’ll also need to pass in a partitioning function to `attachment_partitioner`. The following is an example of what the workflow looks like:
+You can optionally partition e-mail attachments by setting `process_attachments=True`. The following is an example of what the workflow looks like:
 
 ```python
-from unstructured.partition.auto import partition
 from unstructured.partition.msg import partition_msg
 
 filename = "example-docs/fake-email-attachment.msg"
-elements = partition_msg(
-    filename=filename, process_attachments=True, attachment_partitioner=partition
-)
-
+elements = partition_msg(filename=filename, process_attachments=True)
 ```
 
 
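Note on the unchanged context lines: both hunks describe `max_partition` as a cap on the character length of a document element, with `None` disabling it. The sketch below is only an illustration of what such a cap could mean in practice. It is not the library's implementation; `split_by_max_chars` is a hypothetical helper, and breaking at whitespace is an assumption made here for readability.

```python
from typing import List, Optional


def split_by_max_chars(text: str, max_partition: Optional[int] = 1500) -> List[str]:
    """Hypothetical helper (not part of unstructured): split text into
    chunks of at most max_partition characters, preferring to break at a
    space. Passing None disables splitting, mirroring the documented option."""
    if max_partition is None:
        return [text]
    if max_partition <= 0:
        raise ValueError("max_partition must be a positive integer or None")

    chunks: List[str] = []
    remaining = text.strip()
    while len(remaining) > max_partition:
        # Find the last space that keeps the chunk within the limit.
        cut = remaining.rfind(" ", 0, max_partition + 1)
        if cut <= 0:
            # No usable space: fall back to a hard cut at the limit.
            cut = max_partition
        chunks.append(remaining[:cut].rstrip())
        remaining = remaining[cut:].lstrip()
    if remaining:
        chunks.append(remaining)
    return chunks
```

For example, `split_by_max_chars("aaa bbb ccc", 7)` yields two chunks, each within the 7-character limit, while `max_partition=None` returns the text as a single chunk.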