Commit 9dcd4c7

Merge pull request #2824 from laujan/patch-1
Update read.md
2 parents 83c3ae2 + e9928e6 commit 9dcd4c7

File tree

1 file changed: +162 -17 lines changed
  • articles/ai-services/document-intelligence/prebuilt

articles/ai-services/document-intelligence/prebuilt/read.md

Lines changed: 162 additions & 17 deletions
@@ -26,7 +26,7 @@ ms.author: lajanuar
 
 > [!NOTE]
 >
-> For extracting text from external images like labels, street signs, and posters, use the [Azure AI Image Analysis v4.0 Read](../../Computer-vision/concept-ocr.md) feature optimized for general, non-document images with a performance-enhanced synchronous API that makes it easier to embed OCR in real-time user experience scenarios.
+> To extract text from external images like labels, street signs, and posters, use the [Azure AI Image Analysis v4.0 Read](../../Computer-vision/concept-ocr.md) feature optimized for general (not document) images with a performance-enhanced synchronous API. This capability makes it easier to embed OCR in real-time user experience scenarios.
 >
 
 Document Intelligence Read Optical Character Recognition (OCR) model runs at a higher resolution than Azure AI Vision Read and extracts print and handwritten text from PDF documents and scanned images. It also includes support for extracting text from Microsoft Word, Excel, PowerPoint, and HTML documents. It detects paragraphs, text lines, words, locations, and languages. The Read model is the underlying OCR engine for other Document Intelligence prebuilt models like Layout, General Document, Invoice, Receipt, Identity (ID) document, Health insurance card, W2 in addition to custom models.
@@ -83,11 +83,11 @@ See our [Language Support—document analysis models](../language-support/ocr.md
 ## Data extraction (v4)
 
 > [!NOTE]
-> Microsoft Word and HTML file are supported in v4.0. Compared with PDF and images, below features are not supported:
+> Microsoft Word and HTML file are supported in v4.0. The following capabilities are currently not supported:
 >
-> * There are no angle, width/height and unit with each page object.
-> * For each object detected, there is no bounding polygon or bounding region.
-> * Page range (`pages`) is not supported as a parameter.
+> * No angle, width/height, and unit returned with each page object.
+> * No bounding polygon or bounding region for each object detected.
+> * No page range (`pages`) as a parameter returned.
 > * No `lines` object.
 
 ## Searchable PDFs
@@ -96,16 +96,16 @@ The searchable PDF capability enables you to convert an analog PDF, such as scan
 
 > [!IMPORTANT]
 >
-> * Currently, the searchable PDF capability is only supported by Read OCR model `prebuilt-read`. When using this feature, please specify the `modelId` as `prebuilt-read`, as other model types will return error for this preview version.
-> * Searchable PDF is included with the 2024-11-30 GA `prebuilt-read` model with no additional cost for generating a searchable PDF output.
+> * Currently, only the Read OCR model `prebuilt-read` supports the searchable PDF capability. When using this feature, specify the `modelId` as `prebuilt-read`. Other model types return an error for this preview version.
+> * Searchable PDF is included with the `2024-11-30` GA `prebuilt-read` model with no added cost for generating a searchable PDF output.
 
 ### Use searchable PDFs
 
 To use searchable PDF, make a `POST` request using the `Analyze` operation and specify the output format as `pdf`:
 
 ```bash
 
-POST /documentModels/prebuilt-read:analyze?output=pdf
+POST {endpoint}/documentintelligence/documentModels/prebuilt-read:analyze?_overload=analyzeDocument&api-version=2024-11-30&output=pdf
 {...}
 202
 ```
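
For reviewers who want to sanity-check the new `POST` URL shape introduced in this hunk, here is a minimal Python sketch that assembles it from its parts. It assumes a hypothetical resource endpoint; `build_analyze_url` is an illustrative helper name, not part of any Azure SDK, and the sketch only builds the URL (it does not send the request or add authentication headers).

```python
from urllib.parse import urlencode


def build_analyze_url(
    endpoint: str,
    model_id: str = "prebuilt-read",
    api_version: str = "2024-11-30",
) -> str:
    """Build the analyze URL that asks for searchable PDF output.

    `endpoint` stands in for the {endpoint} placeholder in the diff;
    the value passed by callers is an assumption, not a real resource.
    """
    base = endpoint.rstrip("/")  # avoid a double slash before the path
    # Query parameters in the same order as the URL shown in the diff.
    query = urlencode(
        {
            "_overload": "analyzeDocument",
            "api-version": api_version,
            "output": "pdf",
        }
    )
    return f"{base}/documentintelligence/documentModels/{model_id}:analyze?{query}"
```

Calling `build_analyze_url("https://myendpoint.cognitiveservices.azure.com")` yields a URL with the same path and query string as the added `POST` line, with the placeholder replaced by the given endpoint.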
@@ -122,7 +122,152 @@ Upon successful completion, the PDF can be retrieved and downloaded as `applicat
 {...}
 
 // Upon successful completion, retrieve the PDF as application/pdf.
-GET /documentModels/prebuilt-read/analyzeResults/{resultId}/pdf
+GET {endpoint}/documentintelligence/documentModels/prebuilt-read/analyzeResults/{resultId}/pdf?api-version=2024-11-30
+URI Parameters
+Name In Required Type Description
+endpoint path True
+string
+
+uri
+The Document Intelligence service endpoint.
+
+modelId path True
+string
+
+Unique document model name.
+
+Regex pattern: ^[a-zA-Z0-9][a-zA-Z0-9._~-]{1,63}$
+
+resultId path True
+string
+
+uuid
+Analyze operation result ID.
+
+api-version query True
+string
+
+The API version to use for this operation.
+
+Responses
+Name Type Description
+200 OK
+file
+
+The request has succeeded.
+
+Media Types: "application/pdf", "application/json"
+
+Other Status Codes
+DocumentIntelligenceErrorResponse
+
+An unexpected error response.
+
+Media Types: "application/pdf", "application/json"
+
+Security
+Ocp-Apim-Subscription-Key
+Type: apiKey
+In: header
+
+OAuth2Auth
+Type: oauth2
+Flow: accessCode
+Authorization URL: https://login.microsoftonline.com/common/oauth2/authorize
+Token URL: https://login.microsoftonline.com/common/oauth2/token
+
+Scopes
+Name Description
+https://cognitiveservices.azure.com/.default
+Examples
+Get Analyze Document Result PDF
+Sample request
+HTTP
+HTTP
+
+Copy
+GET https://myendpoint.cognitiveservices.azure.com/documentintelligence/documentModels/prebuilt-invoice/analyzeResults/3b31320d-8bab-4f88-b19c-2322a7f11034/pdf?api-version=2024-11-30
+Sample response
+Status code:
+200
+JSON
+
+Copy
+"{pdfBinary}"
+Definitions
+Name Description
+DocumentIntelligenceError
+The error object.
+
+DocumentIntelligenceErrorResponse
+Error response object.
+
+DocumentIntelligenceInnerError
+An object containing more specific information about the error.
+
+DocumentIntelligenceError
+The error object.
+
+Name Type Description
+code
+string
+
+One of a server-defined set of error codes.
+
+details
+DocumentIntelligenceError[]
+
+An array of details about specific errors that led to this reported error.
+
+innererror
+DocumentIntelligenceInnerError
+
+An object containing more specific information than the current object about the error.
+
+message
+string
+
+A human-readable representation of the error.
+
+target
+string
+
+The target of the error.
+
+DocumentIntelligenceErrorResponse
+Error response object.
+
+Name Type Description
+error
+DocumentIntelligenceError
+
+Error info.
+
+DocumentIntelligenceInnerError
+An object containing more specific information about the error.
+
+Name Type Description
+code
+string
+
+One of a server-defined set of error codes.
+
+innererror
+DocumentIntelligenceInnerError
+
+Inner error.
+
+message
+string
+
+A human-readable representation of the error.
+
+In this article
+URI Parameters
+Responses
+Security
+Examples
+
 200 OK
 Content-Type: application/pdf
 ```
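
The added lines in this hunk state that `modelId` must match a regex and that `resultId` is a UUID. A short Python sketch of how a client might check both before building the new `GET` URL follows. The helper name `build_result_pdf_url` and the endpoint value used by callers are assumptions made for illustration; only the regex, the path layout, and the `api-version` value come from the diff.

```python
import re
import uuid

# Regex copied from the "modelId" row added in this hunk.
MODEL_ID_PATTERN = re.compile(r"^[a-zA-Z0-9][a-zA-Z0-9._~-]{1,63}$")


def build_result_pdf_url(
    endpoint: str,
    model_id: str,
    result_id: str,
    api_version: str = "2024-11-30",
) -> str:
    """Validate the path parameters, then build the result-PDF URL."""
    if MODEL_ID_PATTERN.fullmatch(model_id) is None:
        raise ValueError(f"modelId does not match the documented pattern: {model_id!r}")
    # uuid.UUID raises ValueError on malformed input; str() gives the
    # canonical lowercase, hyphenated form.
    canonical_result_id = str(uuid.UUID(result_id))
    base = endpoint.rstrip("/")
    return (
        f"{base}/documentintelligence/documentModels/{model_id}"
        f"/analyzeResults/{canonical_result_id}/pdf?api-version={api_version}"
    )
```

Note that the pattern requires at least two characters, so a one-character model ID is rejected before any request would be made.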
@@ -294,7 +439,7 @@ Find more samples on GitHub:
 
 > [!NOTE]
 >
-> For extracting text from external images like labels, street signs, and posters, use the [Azure AI Image Analysis v4.0 Read](../..//Computer-vision/concept-ocr.md) feature optimized for general, non-document images with a performance-enhanced synchronous API that makes it easier to embed OCR in your user experience scenarios.
+> To extract text from external images like labels, street signs, and posters, use the [Azure AI Image Analysis v4.0 Read](../../Computer-vision/concept-ocr.md) feature optimized for general (not document) images with a performance-enhanced synchronous API. This capability makes it easier to embed OCR in real-time user experience scenarios.
 >
 
 Document Intelligence Read Optical Character Recognition (OCR) model runs at a higher resolution than Azure AI Vision Read and extracts print and handwritten text from PDF documents and scanned images. It also includes support for extracting text from Microsoft Word, Excel, PowerPoint, and HTML documents. It detects paragraphs, text lines, words, locations, and languages. The Read model is the underlying OCR engine for other Document Intelligence prebuilt models like Layout, General Document, Invoice, Receipt, Identity (ID) document, Health insurance card, W2 in addition to custom models.
@@ -368,11 +513,11 @@ See our [Language Support—document analysis models](../language-support/ocr.md
 ## Data extraction
 
 > [!NOTE]
-> Microsoft Word and HTML file are supported in v3.1 and later versions. Compared with PDF and images, below features are not supported:
+> Microsoft Word and HTML file are supported in v4.0. The following capabilities are currently not supported:
 >
-> * There are no angle, width/height and unit with each page object.
-> * For each object detected, there is no bounding polygon or bounding region.
-> * Page range (`pages`) is not supported as a parameter.
+> * No angle, width/height, and unit returned with each page object.
+> * No bounding polygon or bounding region for each object detected.
+> * No page range (`pages`) as a parameter returned.
 > * No `lines` object.
 
 ## Searchable PDF
@@ -381,9 +526,9 @@ The searchable PDF capability enables you to convert an analog PDF, such as scan
 
 > [!IMPORTANT]
 >
-> * Currently, the searchable PDF capability is only supported by Read OCR model `prebuilt-read`. When using this feature, please specify the `modelId` as `prebuilt-read`, as other model types will return an error.
-> * Searchable PDF is included with the 2024-11-30 `prebuilt-read` model with no additional cost for generating a searchable PDF output.
-> * Searchable PDF currently only supports PDF files as input. Support for other file types, such as image files, will be available later.
+> * Currently, only Read OCR model `prebuilt-read` supports the searchable PDF capability. When using this feature, specify the `modelId` as `prebuilt-read`. Other model types return an error.
+> * Searchable PDF is included with the `2024-11-30` `prebuilt-read` model with no added cost for generating a searchable PDF output.
+> * Searchable PDF currently only supports PDF files as input.
 
 ### Use searchable PDF
 
