platform/workflows.mdx (17 additions, 3 deletions)
@@ -55,7 +55,11 @@ To create an automatic workflow:
 -**Basic** is a good choice if you have text-only documents that have no images or tables in them.
 -**Advanced** is a good choice if you have complex documents that have images or tables or both in them.
 
-9. If you want to overwrite any files in the destination location that might have been previously processed, check the **Reprocess all** box.
+9. The **Reprocess all** box applies only to the Amazon S3 and Azure Blob Storage source connectors:
+
+- Checking this box reprocesses all documents in the source location on every workflow run.
+- Unchecking this box causes only new documents that are added to the source location since the last workflow run to be processed on future runs. Previously processed documents are not processed again, even if those documents' contents change.
+
 10. If you want to retry processing any documents that failed to process, check the **Retry Failed Documents** box.
 11. Click **Continue**.
 12. If you want this workflow to run on a schedule, in the **Repeat Run** dropdown list, select one of the scheduling options, and fill in the scheduling settings. Otherwise, select **Don't repeat**.
@@ -186,8 +190,12 @@ There are two ways to create a custom workflow:
 -[Embedding overview](/platform/embedding)
 -[Understanding embedding models: make an informed choice for your RAG](https://unstructured.io/blog/understanding-embedding-models-make-an-informed-choice-for-your-rag).
 
-17. Check the **Reprocess all** box if you want to overwrite any files in the destination location that might have been previously processed,
-18. Check the **Retry Failed Documents** box if you want to retry processing any documents that failed to process,
+17. The **Reprocess all** box applies only to the Amazon S3 and Azure Blob Storage source connectors:
+
+- Checking this box reprocesses all documents in the source location on every workflow run.
+- Unchecking this box causes only new documents that are added to the source location since the last workflow run to be processed on future runs. Previously processed documents are not processed again, even if those documents' contents change.
+
+18. Check the **Retry Failed Documents** box if you want to retry processing any documents that failed to process.
 19. Click **Continue**.
 20. If you want this workflow to run on a schedule, in the **Repeat Run** dropdown list, select one of the scheduling options, and fill in the scheduling settings. Otherwise, select **Don't repeat**.
 21. Click **Complete**.
@@ -212,6 +220,12 @@ There are two ways to create a custom workflow:
 5. Next to **Name**, click the pencil icon, enter some unique name for this workflow, and then click the check mark icon.
 6. If you want this workflow to run on a schedule, click the **Schedule** button. In the **Repeat Run** dropdown list, select one of the scheduling options, and fill in the scheduling settings.
 7. To overwrite any previously processed files, or to retry any documents that fail to process, click the **Settings** button, and check either or both of the boxes.
+
+The **Reprocess all** box applies only to the Amazon S3 and Azure Blob Storage source connectors:
+
+- Checking this box reprocesses all documents in the source location on every workflow run.
+- Unchecking this box causes only new documents that are added to the source location since the last workflow run to be processed on future runs. Previously processed documents are not processed again, even if those documents' contents change.
+
 8. In the pipeline designer, click the **Source** node. In the **Source** pane, select the source location. Then click **Save**.
snippets/quickstarts/platform.mdx (5 additions, 1 deletion)
@@ -77,7 +77,11 @@ allowfullscreen
 -**Basic** is a good choice if you have text-only documents that have no images or tables in them.
 -**Advanced** is a good choice if you have complex documents that have images or tables or both in them.
 
-9. If you want to overwrite any files in the destination location that might have been previously processed, check the **Reprocess all** box.
+9. The **Reprocess all** box applies only to the Amazon S3 and Azure Blob Storage source connectors:
+
+- Checking this box reprocesses all documents in the source location on every workflow run.
+- Unchecking this box causes only new documents that are added to the source location since the last workflow run to be processed on future runs. Previously processed documents are not processed again, even if those documents' contents change.
+
 10. If you want to retry processing any documents that failed to process, check the **Retry Failed Documents** box.
 11. Click **Continue**.
 12. If you want this workflow to run on a schedule, in the **Repeat Run** dropdown list, select one of the scheduling options, and fill in the scheduling settings. Otherwise, select **Don't repeat**.
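
The **Reprocess all** behavior described in the added lines can be illustrated with a minimal Python sketch. This is not Unstructured's implementation: `SourceDoc`, `select_documents`, and the idea of tracking previously processed documents by key are hypothetical, introduced only to show the documented semantics.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceDoc:
    """Hypothetical record for one document in the source location."""
    key: str           # assumed identifier, e.g. an object key in a bucket
    content_hash: str  # deliberately never consulted by the selection below


def select_documents(source_docs, processed_keys, reprocess_all):
    """Return the documents a workflow run would process (illustrative only)."""
    if reprocess_all:
        # Box checked: every document is processed on every run.
        return list(source_docs)
    # Box unchecked: only documents not seen on an earlier run are processed.
    # Content changes to already-processed documents are ignored.
    return [doc for doc in source_docs if doc.key not in processed_keys]


docs = [
    SourceDoc("a.pdf", "v2"),  # processed earlier, contents changed since
    SourceDoc("b.pdf", "v1"),  # processed earlier, unchanged
    SourceDoc("c.pdf", "v1"),  # new since the last run
]
already_processed = {"a.pdf", "b.pdf"}

checked = [d.key for d in select_documents(docs, already_processed, True)]
unchecked = [d.key for d in select_documents(docs, already_processed, False)]
print(checked)    # ['a.pdf', 'b.pdf', 'c.pdf']
print(unchecked)  # ['c.pdf']
```

In this sketch `a.pdf` is skipped when the box is unchecked even though its contents changed, which mirrors the documented caveat that previously processed documents are not processed again.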