Commit a491c56

[DocumentIntelligence] 1.0.0: finishing docs and samples updates (Azure#47585)
1 parent 54ccc80 commit a491c56

21 files changed: +142 -145 lines changed

sdk/documentintelligence/Azure.AI.DocumentIntelligence/CHANGELOG.md

Lines changed: 1 addition & 1 deletion
Original file line number | Diff line number | Diff line change
@@ -1,6 +1,6 @@
11
# Release History
22

3-
## 1.0.0 (2024-12-17)
3+
## 1.0.0 (2024-12-16)
44

55
### Features Added
66
- Added methods `GetAnalyzeBatchResult`, `GetAnalyzeBatchResults`, `DeleteAnalyzeBatchResult`, and `DeleteAnalyzeResult` to `DocumentIntelligenceClient`.

sdk/documentintelligence/Azure.AI.DocumentIntelligence/MigrationGuide.md

Lines changed: 19 additions & 25 deletions
Original file line number | Diff line number | Diff line change
@@ -2,7 +2,7 @@
22

33
This guide is intended to assist in the migration to `Azure.AI.DocumentIntelligence (1.0.0)` from `Azure.AI.FormRecognizer (4.1.0 or 4.0.0)`. It will focus on side-by-side comparisons for similar operations between libraries. Please note that version `1.0.0` will be used for comparison with `4.1.0`.
44

5-
Familiarity with the `Azure.AI.FormRecognizer` package is assumed. For those new to the Document Intelligence and the Form Recognizer client libraries for .NET, please refer to the [README][readme] rather than this guide. For an exhaustive list of breaking changes between the packages, see the [CHANGELOG][changelog].
5+
Familiarity with the `Azure.AI.FormRecognizer` package is assumed. For those new to the Document Intelligence and the Form Recognizer client libraries for .NET, please refer to the [README][readme] rather than this guide.
66

77
## Table of Contents
88
- [Migration benefits](#migration-benefits)
@@ -26,11 +26,12 @@ There are many benefits to using the new `Azure.AI.DocumentIntelligence` library
2626

2727
New features provided by the `Azure.AI.DocumentIntelligence` library include:
2828
- **Markdown content format:** output can now be returned in Markdown content format in addition to the default plain text. This is only supported for the "prebuilt-layout" model. Markdown content format is deemed a more friendly format for LLM consumption in a chat or automation use scenario.
29-
- **Query fields:** query fields are reintroduced as a premium add-on feature. When the `DocumentAnalysisFeature.QueryFields` argument is passed to a document analysis request, the service will further extract the values of the fields specified via the parameter `queryFields` to supplement any existing fields defined by the model as fallback.
30-
- **Split options:** in previous API versions, the document splitting and classification operation always tried to split the input file into multiple documents. To enable a wider set of scenarios, `ClassifyDocument` now supports a `split` parameter. The following values are supported:
29+
- **Query fields:** query fields are reintroduced as a premium add-on feature. When the `DocumentAnalysisFeature.QueryFields` argument is passed to a document analysis request, the service will further extract the values of the fields specified via the option `QueryFields` to supplement any existing fields defined by the model as fallback.
30+
- **Split options:** in previous API versions, the document splitting and classification operation always tried to split the input file into multiple documents. To enable a wider set of scenarios, `ClassifyDocument` now supports a `Split` option. The following values are supported:
3131
- `Auto`: let the service determine where to split.
3232
- `None`: the entire file is treated as a single document. No splitting is performed.
3333
- `PerPage`: each page is treated as a separate document. Each empty page is kept as its own document.
34+
- **Batch analysis:** allows you to bulk process multiple documents using a single request. Rather than having to submit documents individually, you can analyze a collection of documents like invoices, a series of loan documents, or a group of custom documents simultaneously.
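
As an illustration of the query fields feature described in the list above, the following is a minimal sketch of how the options object might be prepared. It assumes that `AnalyzeDocumentOptions` exposes `Features` and `QueryFields` as collection properties; the document URI and the field names `FullName` and `CompanyName` are hypothetical placeholders. Check the package API reference before relying on these names.

```C#
using System;
using Azure.AI.DocumentIntelligence;

// Hypothetical input document; replace with a URI the service can reach.
Uri uriSource = new Uri("https://example.com/sample-document.pdf");

// Assumption: Features and QueryFields are collection properties on the options object.
var options = new AnalyzeDocumentOptions("prebuilt-layout", uriSource);

// Enable the premium query fields add-on, then list the fields to extract.
options.Features.Add(DocumentAnalysisFeature.QueryFields);
options.QueryFields.Add("FullName");     // hypothetical field name
options.QueryFields.Add("CompanyName");  // hypothetical field name

Console.WriteLine($"Requesting {options.QueryFields.Count} query fields.");

// The options would then be passed to the client, for example:
// await client.AnalyzeDocumentAsync(WaitUntil.Completed, options);
```

No request is sent by this sketch; it only builds the options, so it can be adapted without an endpoint or credentials.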
3435

3536
The table below describes the relationship of each client and its supported API version(s):
3637

@@ -57,10 +58,7 @@ Some terminology has changed to reflect the enhanced capabilities of the latest
5758

5859
### Client usage
5960

60-
We continue to support API key and AAD authentication methods when creating the clients. Below are the differences between the two versions:
61-
62-
- In `Azure.AI.DocumentIntelligence`, we have `DocumentIntelligenceClient` and `DocumentIntelligenceAdministrationClient` which support API version `2024-11-30` and higher.
63-
- Some client methods have been renamed. See the [CHANGELOG][changelog] for an exhaustive list of changes.
61+
In `Azure.AI.DocumentIntelligence`, we have `DocumentIntelligenceClient` and `DocumentIntelligenceAdministrationClient` which can only be used with API version `2024-11-30` and higher. We continue to support Microsoft Entra ID and API key authentication methods when creating the clients:
6462

6563
Creating new clients in `Azure.AI.FormRecognizer`:
6664
```C#
@@ -83,10 +81,10 @@ var documentIntelligenceAdministrationClient = new DocumentIntelligenceAdministr
8381
### Analyzing documents
8482

8583
Differences between the versions:
86-
- The former `AnalyzeDocument` method taking a `Stream` as the input document is still not supported in `Azure.AI.DocumentIntelligence` 1.0.0. As a workaround you will need to use a URI input or the new Base64 input option, which is described later in this guide ([Analyzing and classifying documents from a stream](#analyzing-and-classifying-documents-from-a-stream)).
87-
- `AnalyzeDocumentFromUri` has been renamed to `AnalyzeDocument` and its input arguments have been reorganized:
88-
- The `documentUri` parameter has been removed. Instead, an `AnalyzeDocumentContent` object must be passed to the method to select the desired input type: URI or Base64 binary data.
89-
- The `options` parameter has been removed. Instead, `pages`, `locale`, and `features` options can be passed directly as method parameters.
84+
- The former `AnalyzeDocument` method taking a `Stream` as the input document is still not supported in `Azure.AI.DocumentIntelligence` 1.0.0. As a workaround you will need to use a URI input or the new binary data input option, which is described later in this guide ([Analyzing and classifying documents from a stream](#analyzing-and-classifying-documents-from-a-stream)).
85+
- `AnalyzeDocumentFromUri` has been renamed to `AnalyzeDocument`.
86+
- The `modelId` and the `documentUri` parameters have been moved into `AnalyzeDocumentOptions`, which is now required. The desired input type must be selected when creating the options object: URI or binary data.
87+
- Overloads of `AnalyzeDocument` have been added to support simpler scenarios without creating an `AnalyzeDocumentOptions` object.
9088
- The property `DocumentField.Value` has been removed. A field's value can now be extracted from one of its new value properties, depending on the type of the field: `ValueAddress` for type `Address`, `ValueBoolean` for type `Boolean`, and so on.
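
To make the last point concrete, here is a rough sketch of reading a field through its typed value properties. The helper `DescribeField` is hypothetical and not part of the library. It also assumes that `DocumentField` exposes `FieldType`, `ValueString`, and `Content`, and that `AddressValue` has a `StreetAddress` property; only `ValueAddress` and `ValueBoolean` are named in this guide.

```C#
using System;
using Azure.AI.DocumentIntelligence;

// Hypothetical helper: returns a printable value for a few common field types.
static string DescribeField(DocumentField field)
{
    if (field.FieldType == DocumentFieldType.Address)
    {
        // ValueAddress carries the structured address for fields of type Address.
        return field.ValueAddress?.StreetAddress ?? field.Content;
    }

    if (field.FieldType == DocumentFieldType.Boolean)
    {
        // ValueBoolean is nullable, so fall back to the raw content if it is unset.
        return field.ValueBoolean?.ToString() ?? field.Content;
    }

    if (field.FieldType == DocumentFieldType.String)
    {
        return field.ValueString ?? field.Content;
    }

    // Types not handled above fall back to the text as it appears in the document.
    return field.Content;
}

// Possible usage with a field taken from an analyzed document:
// if (document.Fields.TryGetValue("VendorAddress", out DocumentField field))
//     Console.WriteLine(DescribeField(field));
```

Comparing `FieldType` with `==` rather than a `switch` on constants is deliberate here, on the assumption that `DocumentFieldType` is an extensible struct rather than a plain enum.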
9189

9290
Analyzing documents with `Azure.AI.FormRecognizer`:
@@ -214,10 +212,7 @@ if (invoice.Fields.TryGetValue("InvoiceTotal", out FormField invoiceTotalField))
214212
Analyzing documents with `Azure.AI.DocumentIntelligence`:
215213
```C# Snippet:DocumentIntelligenceAnalyzeWithPrebuiltModelFromUriAsync
216214
Uri uriSource = new Uri("<uriSource>");
217-
218-
var options = new AnalyzeDocumentOptions("prebuilt-invoice", uriSource);
219-
220-
Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, options);
215+
Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-invoice", uriSource);
221216
AnalyzeResult result = operation.Value;
222217

223218
// To see the list of all the supported fields returned by service and its corresponding types for the
@@ -298,8 +293,9 @@ for (int i = 0; i < result.Documents.Count; i++)
298293
### Classifying documents
299294

300295
Differences between the versions:
301-
- The former `ClassifyDocument` method taking a `Stream` as the input document is still not supported in `Azure.AI.DocumentIntelligence` 1.0.0. As a workaround you will need to use a URI input or the new Base64 input option, which is described later in this guide ([Analyzing and classifying documents from a stream](#analyzing-and-classifying-documents-from-a-stream)).
302-
- `ClassifyDocumentFromUri` has been renamed to `ClassifyDocument` and its input arguments have been reorganized. The `documentUri` parameter has been removed. Instead, a `ClassifyDocumentContent` object must be passed to the method to select the desired input type: URI or Base64 binary data.
296+
- The former `ClassifyDocument` method taking a `Stream` as the input document is still not supported in `Azure.AI.DocumentIntelligence` 1.0.0. As a workaround you will need to use a URI input or the new binary data input option, which is described later in this guide ([Analyzing and classifying documents from a stream](#analyzing-and-classifying-documents-from-a-stream)).
297+
- `ClassifyDocumentFromUri` has been renamed to `ClassifyDocument`:
298+
- The `classifierId` and the `documentUri` parameters have been moved into a new `ClassifyDocumentOptions` property bag. The desired input type must be selected when creating the options object: URI or binary data.
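
The following sketch shows how the new property bag might be filled in for a URI input, including the `Split` option introduced earlier in this guide. It assumes the split values `Auto`, `None`, and `PerPage` are exposed through a type named `SplitMode`; the document URI is a hypothetical placeholder.

```C#
using System;
using Azure.AI.DocumentIntelligence;

// Hypothetical input document; replace with a URI the service can reach.
Uri uriSource = new Uri("https://example.com/sample-bundle.pdf");

// The classifier ID and the input source now travel together in the options object.
var options = new ClassifyDocumentOptions("<classifierId>", uriSource)
{
    // Assumption: SplitMode is the type holding Auto, None, and PerPage.
    Split = SplitMode.PerPage
};

Console.WriteLine($"Split mode: {options.Split}");

// The options would then be passed to the client, for example:
// await client.ClassifyDocumentAsync(WaitUntil.Completed, options);
```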
303299

304300
Classifying documents with `Azure.AI.FormRecognizer`:
305301
```C#
@@ -338,8 +334,8 @@ foreach (AnalyzedDocument document in result.Documents)
338334
### Building a document model
339335

340336
Differences between the versions:
341-
- Parameters `trainingDataSource`, `buildMode`, `modelId`, and `options` have been removed. The method now takes a `buildRequest` parameter of type `BuildDocumentModelContent` containing all the removed options.
342-
- After creating a `BuildDocumentModelContent` instance, either property `AzureBlobSource` or `AzureBlobFileListSource` must be set depending on your data source.
337+
- Parameters `trainingDataSource`, `buildMode`, and `modelId` have moved into `BuildDocumentModelOptions`, which is now required.
338+
- When creating a `BuildDocumentModelOptions` instance, either property `BlobSource` or `BlobFileListSource` must be set depending on your data source.
343339

344340
Building a document model with `Azure.AI.FormRecognizer`:
345341
```C#
@@ -404,16 +400,14 @@ foreach (KeyValuePair<string, DocumentTypeDetails> docType in model.DocumentType
404400

405401
### Analyzing and classifying documents from a stream
406402

407-
Currently neither `AnalyzeDocument` nor `ClassifyDocument` supports submitting a document from a `Stream` input. As a temporary workaround, you can make use of the new Base64 input option. The following example illustrates how to submit a local file for analysis:
403+
Currently neither `AnalyzeDocument` nor `ClassifyDocument` supports submitting a document from a `Stream` input. As a temporary workaround, you can make use of the new binary data input option. The following example illustrates how to submit a local file for analysis:
408404

409405
```C# Snippet:DocumentIntelligenceAnalyzeWithPrebuiltModelFromBytesAsync
410406
string filePath = "<filePath>";
411407
byte[] fileBytes = File.ReadAllBytes(filePath);
412408

413-
var bytesSource = BinaryData.FromBytes(fileBytes);
414-
var options = new AnalyzeDocumentOptions("prebuilt-invoice", bytesSource);
415-
416-
Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, options);
409+
BinaryData bytesSource = BinaryData.FromBytes(fileBytes);
410+
Operation<AnalyzeResult> operation = await client.AnalyzeDocumentAsync(WaitUntil.Completed, "prebuilt-invoice", bytesSource);
417411
AnalyzeResult result = operation.Value;
418412

419413
// To see the list of all the supported fields returned by service and its corresponding types for the
@@ -546,7 +540,7 @@ foreach (DocumentLine line in firstPage.Lines)
546540

547541
### Accessing an existing long-running operation
548542

549-
Storing the ID of a long-running operation to retrieve its status at a later point in time is still not supported in `Azure.AI.DocumentIntelligence` 1.0.0. There are no straightforward workarounds to support this scenario.
543+
With the exception of the new batch analysis API, storing the ID of a long-running operation to retrieve its status at a later point in time is still not supported in `Azure.AI.DocumentIntelligence` 1.0.0. There are no straightforward workarounds to support this scenario.
550544

551545
## Additional samples
552546
