Commit ec735cd

Platform: Preconfigured workflow settings (#361)
1 parent 36899f5 commit ec735cd

5 files changed: 97 additions and 16 deletions

platform/chunking.mdx

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ Here are a few examples:

 The following sections provide information about the available chunking strategies and their settings.

-<Note>You can change a workflow's predefined strategy only through [Custom](/platform/workflows#create-a-custom-workflow) workflow settings.</Note>
+<Note>You can change a workflow's preconfigured strategy only through [Custom](/platform/workflows#create-a-custom-workflow) workflow settings.</Note>

 ## Basic chunking strategy

platform/embedding.mdx

Lines changed: 1 addition & 1 deletion
@@ -61,7 +61,7 @@ on Hugging Face:

 To generate embeddings, choose one of the following embedding providers and models in the **Providers** section of an **Embedder** node in a workflow:

-<Note>You can change a workflow's predefined provider only through [Custom](/platform/workflows#create-a-custom-workflow) workflow settings.</Note>
+<Note>You can change a workflow's preconfigured provider only through [Custom](/platform/workflows#create-a-custom-workflow) workflow settings.</Note>

 - **OpenAI**: Use [OpenAI](https://openai.com) to generate embeddings. Also, choose the model to use:

platform/partitioning.mdx

Lines changed: 1 addition & 1 deletion
@@ -17,7 +17,7 @@ For example, the **Fast** strategy can be about 100 times faster than leading im

 To choose one of these strategies, select one of the **Partition Strategy** options in the **Partitioner** node of a workflow:

-<Note>You can change a workflow's predefined strategy only through [Custom](/platform/workflows#create-a-custom-workflow) workflow settings.</Note>
+<Note>You can change a workflow's preconfigured strategy only through [Custom](/platform/workflows#create-a-custom-workflow) workflow settings.</Note>

 - **Fast**: This strategy is ideal for simple, text-based documents.
 - **High Res**: This strategy is best for PDFs, images, and complex file types.

platform/workflows.mdx

Lines changed: 93 additions & 12 deletions
@@ -50,21 +50,102 @@ To create an automatic workflow:
 <Note>You can select multiple source and destination locations. Files will be ingested from all of the selected source locations, and the processed data will be delivered to all of the selected destination locations.</Note>

 7. Click **Continue**.
-8. In the **Optimize for** section, select the option to choose one of these predefined workflow settings groups:
+8. In the **Optimize for** section, select the option to choose one of these preconfigured workflow settings groups. Expand any or all
+of the following options to learn more about these preconfigured settings:

-- **Basic** Ideal for simple, text-only documents.
-- **Advanced** Best for PDFs, images, and complex file types.
+<AccordionGroup>
+<Accordion title="Basic">
+This option is ideal for simple, text-only documents.

-<Note>
-During **Advanced** processing, any detected text-based files are processed and billed at the **Basic** rate instead.
-</Note>
-
-- **Platinum** For your most challenging documents, including scanned and handwritten content.
+The **Basic** option uses the following preconfigured workflow settings:

-<Note>
-During **Platinum** processing, any detected files that are not PDFs or images are processed and billed at either the **Advanced** or **Basic** rate instead.
-Of those non-PDF and non-image files, all text-based files are processed and billed at the **Basic** rate instead. The other files are processed and billed at the **Advanced** rate instead.
-</Note>
+- **Strategy**: Fast
+- **Image Summarizer**: None
+- **Table Summarizer**: None
+- **Include Page Breaks**: No
+- **Infer Table Structure**: No
+- **Elements to Exclude**: None
+- **Chunk**:
+
+- **Chunker Type**: Chunk By Character
+- **Chunk Options**:
+
+- **Include Original Elements**: No
+- **Max Characters**: 2048
+- **New After N Characters**: 1500
+- **Overlap**: 160
+- **Overlap All**: No
+
+- **Embed**:
+
+- **Provider**: Azure OpenAI
+- **Model**: text-embedding-3-large (3072 dimensions)
+
+</Accordion>
+<Accordion title="Advanced">
+This option is best for PDFs, images, and complex file types.
+
+<Note>
+During **Advanced** processing, any detected text-based files are processed and billed at the **Basic** rate instead.
+</Note>
+
+The **Advanced** option uses the following preconfigured workflow settings:
+
+- **Strategy**: High-Res
+- **Image Summarizer**: GPT-4o
+- **Table Summarizer**: Claude 3.5 Sonnet
+- **Include Page Breaks**: No
+- **Infer Table Structure**: No
+- **Elements to Exclude**: None
+- **Chunk**:
+
+- **Chunker Type**: Chunk By Title
+- **Chunk Options**:
+
+- **Combine Text Under N Characters**: 0
+- **Include Original Elements**: No
+- **Max Characters**: 2048
+- **New After N Characters**: 1500
+- **Overlap**: 160
+- **Overlap All**: No
+
+- **Embed**:
+
+- **Provider**: Azure OpenAI
+- **Model**: text-embedding-3-large (3072 dimensions)
+
+</Accordion>
+<Accordion title="Platinum">
+This option is for your most challenging documents, including scanned and handwritten content.
+
+<Note>
+During **Platinum** processing, any detected files that are not PDFs or images are processed and billed at either the **Advanced** or **Basic** rate instead.
+Of those non-PDF and non-image files, all text-based files are processed and billed at the **Basic** rate instead. The other files are processed and billed at the **Advanced** rate instead.
+</Note>
+
+The **Platinum** option uses the following preconfigured workflow settings:
+
+- **Strategy**: VLM
+- **Chunk**:
+
+- **Chunker Type**: Chunk By Title
+- **Chunk Options**:
+
+- **Combine Text Under N Characters**: 0
+- **Include Original Elements**: No
+- **Max Characters**: 2048
+- **Multipage Sections**: No
+- **New After N Characters**: 1500
+- **Overlap**: 160
+- **Overlap All**: No
+
+- **Embed**:
+
+- **Provider**: Azure OpenAI
+- **Model**: text-embedding-3-large (3072 dimensions)
+
+</Accordion>
+</AccordionGroup>

 9. The **Reprocess all** box applies only to the Amazon S3 and Azure Blob Storage source connectors:

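Reviewer's sketch: the three preconfigured settings groups added to platform/workflows.mdx, together with the billing notes, can be read as a small lookup table plus a fallback rule. The Python below is only an illustration of that reading. The dictionary layout, the key names, the file-kind labels (`text`, `pdf_or_image`, `other`), and the `billed_rate` helper are all hypothetical and are not part of any platform API. The values are transcribed from the added lines in this diff.

```python
# Illustrative only: key names and structure are assumptions, not a platform API.
# Values are transcribed from the accordion lists added in platform/workflows.mdx.
SHARED_CHUNK_OPTIONS = {
    "include_original_elements": False,
    "max_characters": 2048,
    "new_after_n_characters": 1500,
    "overlap": 160,
    "overlap_all": False,
}

EMBED = {
    "provider": "Azure OpenAI",
    "model": "text-embedding-3-large",
    "dimensions": 3072,
}

PRESETS = {
    "Basic": {
        "strategy": "Fast",
        "image_summarizer": None,
        "table_summarizer": None,
        "include_page_breaks": False,
        "infer_table_structure": False,
        "elements_to_exclude": [],
        "chunker_type": "Chunk By Character",
        "chunk_options": dict(SHARED_CHUNK_OPTIONS),
        "embed": EMBED,
    },
    "Advanced": {
        "strategy": "High-Res",
        "image_summarizer": "GPT-4o",
        "table_summarizer": "Claude 3.5 Sonnet",
        "include_page_breaks": False,
        "infer_table_structure": False,
        "elements_to_exclude": [],
        "chunker_type": "Chunk By Title",
        "chunk_options": {
            **SHARED_CHUNK_OPTIONS,
            "combine_text_under_n_characters": 0,
        },
        "embed": EMBED,
    },
    "Platinum": {
        "strategy": "VLM",
        "chunker_type": "Chunk By Title",
        "chunk_options": {
            **SHARED_CHUNK_OPTIONS,
            "combine_text_under_n_characters": 0,
            "multipage_sections": False,
        },
        "embed": EMBED,
    },
}


def billed_rate(selected: str, file_kind: str) -> str:
    """Return the rate tier a file would be billed at, following the two notes.

    file_kind is a hypothetical label: "text", "pdf_or_image", or "other".
    The notes state the downgrade cases explicitly. That the remaining files
    are billed at the selected tier is implied by the notes, not stated.
    """
    if selected not in PRESETS:
        raise ValueError(f"unknown option: {selected}")
    if selected == "Basic":
        return "Basic"
    if selected == "Advanced":
        # Text-based files fall back to the Basic rate.
        return "Basic" if file_kind == "text" else "Advanced"
    # Platinum: only PDFs and images stay at the Platinum rate.
    if file_kind == "pdf_or_image":
        return "Platinum"
    return "Basic" if file_kind == "text" else "Advanced"
```

For example, under this reading `billed_rate("Platinum", "other")` returns `"Advanced"`, and `billed_rate("Advanced", "text")` returns `"Basic"`.
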
snippets/quickstarts/platform.mdx

Lines changed: 1 addition & 1 deletion
@@ -92,7 +92,7 @@ allowfullscreen
 <Note>You can select multiple source and destination locations. Files will be ingested from all of the selected source locations, and the processed data will be delivered to all of the selected destination locations.</Note>

 7. Click **Continue**.
-8. In the **Optimize for** section, select the option to choose one of these predefined workflow settings groups:
+8. In the **Optimize for** section, select the option to choose one of these preconfigured workflow settings groups:

 - **Basic**: Ideal for simple, text-only documents.
 - **Advanced**: Best for PDFs, images, and complex file types.
