Commit ac2381f

Platinum/VLM strategy: note about PDF 200+ page files (#359)
1 parent ec735cd commit ac2381f

3 files changed: +17 -0 lines changed

platform/partitioning.mdx

Lines changed: 3 additions & 0 deletions
@@ -31,5 +31,8 @@ To choose one of these strategies, select one of the **Partition Strategy** opti
 <Note>
 During **VLM** processing, any detected files that are not PDFs or images are processed and billed at either the **High Res** or **Fast** rate instead.
 Of those non-PDF and non-image files, all text-based files are processed and billed at the **Fast** rate instead. The other files are processed and billed at the **High Res** rate instead.
+
+When you use the **VLM** strategy with embeddings for PDF files of 200 or more pages, you might notice some errors when
+these files are processed. These errors typically occur when these larger PDF files have lots of tables and high-resolution images.
 </Note>

platform/workflows.mdx

Lines changed: 9 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -121,6 +121,9 @@ To create an automatic workflow:
 <Note>
 During **Platinum** processing, any detected files that are not PDFs or images are processed and billed at either the **Advanced** or **Basic** rate instead.
 Of those non-PDF and non-image files, all text-based files are processed and billed at the **Basic** rate instead. The other files are processed and billed at the **Advanced** rate instead.
+
+When you use the **Platinum** strategy for PDF files of 200 or more pages, you might notice some errors when
+these files are processed. These errors typically occur when these larger PDF files have lots of tables and high-resolution images.
 </Note>

 The **Platinum** option uses the following preconfigured workflow settings:
@@ -210,6 +213,9 @@ There are two ways to create a custom workflow:
 <Note>
 During **VLM** processing, any detected files that are not PDFs or images are processed and billed at either the **High Res** or **Fast** rate instead.
 Of those non-PDF and non-image files, all text-based files are processed and billed at the **Fast** rate instead. The other files are processed and billed at the **High Res** rate instead.
+
+When you use the **VLM** strategy with embeddings for PDF files of 200 or more pages, you might notice some errors when
+these files are processed. These errors typically occur when these larger PDF files have lots of tables and high-resolution images.
 </Note>

 [Learn more](/platform/partitioning).
@@ -398,6 +404,9 @@ There are two ways to create a custom workflow:
 <Note>
 During **VLM** processing, any detected files that are not PDFs or images are processed and billed at either the **High Res** or **Fast** rate instead.
 Of those non-PDF and non-image files, all text-based files are processed and billed at the **Fast** rate instead. The other files are processed and billed at the **High Res** rate instead.
+
+When you use the **VLM** strategy with embeddings for PDF files of 200 or more pages, you might notice some errors when
+these files are processed. These errors typically occur when these larger PDF files have lots of tables and high-resolution images.
 </Note>

 [Learn more](/platform/partitioning).

snippets/quickstarts/platform.mdx

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -99,6 +99,11 @@ allowfullscreen
 - **Platinum**: For your most challenging documents, including scanned and handwritten content. It uses vision language models (VLMs).
 During processing, files that are not PDFs or images are processed by using the **Advanced** strategy and are charged at the **Advanced** rate instead.

+<Note>
+When you use the **Platinum** strategy for PDF files of 200 or more pages, you might notice some errors when
+these files are processed. These errors typically occur when these larger PDF files have lots of tables and high-resolution images.
+</Note>
+
 9. The **Reprocess all** box applies only to the Amazon S3 and Azure Blob Storage source connectors:

 - Checking this box reprocesses all documents in the source location on every workflow run.
