Commit bdc27e6

Add missing Fast and High Res partition settings options (#781)
1 parent a74ab7a commit bdc27e6

1 file changed, +14 -0 lines changed

ui/workflows.mdx

Lines changed: 14 additions & 0 deletions
@@ -257,6 +257,20 @@ import DeprecatedModelsUI from '/snippets/general-shared-text/deprecated-models-
 these files are processed. These errors typically occur when these larger PDF files have lots of tables and high-resolution images.
 </Note>
 
+If you choose the **Fast** strategy, you can also choose from among the following additional settings:
+
+- **Include Page breaks**: Check this box to include distinct `PageBreak` document elements in the output, if the file type supports it.
+- **Infer Table Structure**: Check this box to add, for each table in a PDF file, a metadata field named `text_as_html` to the output for that table's document element. This field will contain an HTML representation of the table.
+- **Elements to Exclude**: Select the name of each available type of [document element](/ui/document-elements) to exclude from the output.
+
+If you choose the **High Res** strategy, you can also choose from among the following additional settings:
+
+- **Include Page breaks**: Check this box to include distinct `PageBreak` document elements in the output, if the file type supports it.
+- **Infer Table Structure**: Check this box to add, for each table in a PDF file, a metadata field named `text_as_html` to the output for that table's document element. This field will contain an HTML representation of the table.
+- **Include Coordinates**: Check this box to add, for each [document element](/ui/document-elements) in the output, a metadata field named `coordinates` to the output for that document element. This field will contain the bounding box coordinates of the document element's content on the page, as well as the bounding box's width and height in pixels.
+- **Extract Image Block Types**: Select the name of each available type of document element to add a metadata field named `image_base64` to the output for that document element. This field will contain a Base64-encoded representation of the document element's content. A Base64-to-image decoding of this field's value will return an image representing the document element's original content.
+- **Elements to Exclude**: Select the name of each available type of document element to exclude from the output.
+
 [Learn more](/ui/partitioning).
 </Accordion>
 <Accordion title="Chunker node">
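
For readers of this change, here is a minimal sketch of how the `text_as_html` and `image_base64` metadata fields described in the added text might be read from the output. It assumes each document element is a JSON-style dictionary with a nested `metadata` dictionary; that shape, the helper names `decode_image_field` and `table_html`, and the sample data are illustrative assumptions, not something this commit defines.

```python
import base64
from typing import Optional


def decode_image_field(element: dict) -> Optional[bytes]:
    """Return the decoded bytes of metadata.image_base64, or None if absent.

    Assumes (hypothetically) that the element is a dict with a "metadata" dict.
    """
    encoded = element.get("metadata", {}).get("image_base64")
    if encoded is None:
        return None
    return base64.b64decode(encoded)


def table_html(element: dict) -> Optional[str]:
    """Return metadata.text_as_html for a table element, or None if absent."""
    return element.get("metadata", {}).get("text_as_html")


# Hypothetical sample output; the payload is placeholder bytes, not a real image.
placeholder_bytes = b"placeholder image bytes"
sample_elements = [
    {
        "type": "Image",
        "text": "",
        "metadata": {
            "image_base64": base64.b64encode(placeholder_bytes).decode("ascii")
        },
    },
    {
        "type": "Table",
        "text": "a b",
        "metadata": {
            "text_as_html": "<table><tr><td>a</td><td>b</td></tr></table>"
        },
    },
]

decoded = decode_image_field(sample_elements[0])
html = table_html(sample_elements[1])
```

With real output, the decoded bytes could be written to a file opened in binary mode to recover the image. Elements that lack a field simply yield `None`, which matches the text's point that these fields appear only when the corresponding setting is selected.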
