Commit 09a8cdf

Merge pull request #283931 from jaep3347/patch-43
Update concept-add-on-capabilities.md
2 parents dae4aa0 + 6e307fd commit 09a8cdf

1 file changed

+44
-1
lines changed

articles/ai-services/document-intelligence/concept-add-on-capabilities.md

Lines changed: 44 additions & 1 deletion
@@ -48,6 +48,11 @@ Document Intelligence supports more sophisticated and modular analysis capabilit
4848

4949
* [`languages`](#language-detection)
5050

51+
Starting with the `2024-07-31-preview` release, the Read model supports searchable PDF output:
52+
53+
* [`Searchable PDF`](#searchable-pdf)
54+
55+
5156
:::moniker-end
5257

5358
:::moniker range="doc-intel-4.0.0"
@@ -58,7 +63,7 @@ Document Intelligence supports more sophisticated and modular analysis capabilit
5863
>
5964
> * Add-on capabilities are currently not supported for Microsoft Office file types.
6065
61-
The following add-on capabilities are available for`2024-02-29-preview`, `2024-02-29-preview`, and later releases:
66+
Document Intelligence supports optional features that can be enabled and disabled depending on the document extraction scenario. The following add-on capabilities are available for `2023-10-31-preview` and later releases:
6267

6368
* [`keyValuePairs`](#key-value-pairs)
6469

@@ -927,6 +932,44 @@ for lang_idx, lang in enumerate(result.languages):
927932

928933
::: moniker range="doc-intel-4.0.0"
929934

935+
## Searchable PDF
936+
937+
The searchable PDF capability converts an analog PDF, such as a scanned-image PDF file, into a PDF with embedded text. The detected text entities are overlaid on top of the page images, which enables deep text search within the PDF's extracted content.
938+
939+
> [!IMPORTANT]
940+
>
941+
> * Currently, the searchable PDF capability is only supported by the Read OCR model `prebuilt-read`. When using this feature, specify the `modelId` as `prebuilt-read`; other model types return an error in this preview version.
942+
> * Searchable PDF is included with the 2024-07-31-preview `prebuilt-read` model with no usage cost for general PDF consumption.
943+
944+
### Use searchable PDF
945+
946+
To use searchable PDF, make a `POST` request using the `Analyze` operation and specify the output format as `pdf`:
947+
948+
```bash
949+
950+
POST /documentModels/prebuilt-read:analyze?output=pdf
951+
{...}
952+
202
953+
```
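
As a rough illustration of the request above, the following Python sketch builds (but does not send) the `POST` request with `output=pdf`. The base URL, request body, and headers are hypothetical placeholders supplied by the caller; the sketch doesn't assume anything about authentication or the body schema.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_analyze_request(base_url: str, body: bytes, headers: dict) -> Request:
    """Build (without sending) the POST request that asks for searchable PDF output."""
    # `output=pdf` is the query parameter named in the request shown above.
    query = urlencode({"output": "pdf"})
    url = f"{base_url.rstrip('/')}/documentModels/prebuilt-read:analyze?{query}"
    return Request(url, data=body, headers=headers, method="POST")


# Hypothetical values, for illustration only: replace with your own
# resource URL, request body, and headers.
request = build_analyze_request(
    "https://example.invalid/documentintelligence",
    body=b"{}",
    headers={"Content-Type": "application/json"},
)
print(request.get_method(), request.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it; a `202` response indicates that the analysis was accepted.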
954+
955+
Once the `Analyze` operation is complete, make a `GET` request to retrieve the `Analyze` operation results.
956+
957+
Upon successful completion, the PDF can be retrieved and downloaded as `application/pdf`. This operation returns the PDF with embedded text directly, rather than as Base64-encoded JSON.
958+
959+
```bash
960+
961+
// Monitor the operation until completion.
962+
GET /documentModels/prebuilt-read/analyzeResults/{resultId}
963+
200
964+
{...}
965+
966+
// Upon successful completion, retrieve the PDF as application/pdf.
967+
GET /documentModels/prebuilt-read/analyzeResults/{resultId}/pdf
968+
200 OK
969+
Content-Type: application/pdf
970+
```
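
The monitor-then-download flow above could be sketched in Python as follows. This is a minimal sketch under stated assumptions, not a definitive client: the HTTP calls are injected as callables so the polling logic stays independent of any particular HTTP library, and the status strings `succeeded` and `failed` are assumptions about the shape of the `Analyze` result rather than values taken from this article.

```python
import time
from typing import Callable


def result_url(base_url: str, result_id: str) -> str:
    """URL used to monitor the Analyze operation."""
    return f"{base_url.rstrip('/')}/documentModels/prebuilt-read/analyzeResults/{result_id}"


def pdf_url(base_url: str, result_id: str) -> str:
    """URL that returns the searchable PDF as application/pdf."""
    return result_url(base_url, result_id) + "/pdf"


def wait_for_pdf(
    fetch_status: Callable[[], str],
    fetch_pdf: Callable[[], bytes],
    interval: float = 1.0,
    max_polls: int = 60,
) -> bytes:
    """Poll until the operation finishes, then download the PDF bytes."""
    for _ in range(max_polls):
        status = fetch_status()  # for example, a GET on result_url(...)
        if status == "succeeded":  # assumed status value
            return fetch_pdf()  # for example, a GET on pdf_url(...)
        if status == "failed":  # assumed status value
            raise RuntimeError("Analyze operation failed")
        time.sleep(interval)
    raise TimeoutError("Analyze operation did not complete in time")
```

In practice, `fetch_status` would issue the first `GET` and return the operation status, and `fetch_pdf` would issue the second `GET` and return the response body, which can then be written to a `.pdf` file.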
971+
972+
930973
## Key-value Pairs
931974
932975
In earlier API versions, the prebuilt-document model extracted key-value pairs from forms and documents. With the addition of the `keyValuePairs` feature to prebuilt-layout, the layout model now produces the same results.
