
Commit 5ad600d

Commit message: restructure

1 parent 393aa96 commit 5ad600d

11 files changed: +243 -367 lines

articles/ai-services/document-intelligence/concept-accuracy-confidence.md

Lines changed: 16 additions & 4 deletions
@@ -21,13 +21,11 @@ ms.author: lajanuar
 > * **Custom neural models** do not provide accuracy scores during training.
 > * Confidence scores for tables, table rows, and table cells are available starting with the **2024-02-29-preview** API version for **custom models**.
-
-
-Custom template models generate an estimated accuracy score when trained. Documents analyzed with a custom model produce a confidence score for extracted fields. In this article, learn to interpret accuracy and confidence scores and best practices for using those scores to improve accuracy and confidence results.
+
+Custom template models generate an estimated accuracy score when trained. Documents analyzed with a custom model produce a confidence score for extracted fields. A confidence score indicates the probability, measured as the degree of statistical certainty, that the extracted result was detected correctly. The estimated accuracy is calculated by running a few different combinations of the training data to predict the labeled values. In this article, learn how to interpret accuracy and confidence scores and how to use best practices to improve them.

 ## Accuracy scores

-The output of a `build` (v3.0) or `train` (v2.1) custom model operation includes the estimated accuracy score. This score represents the model's ability to accurately predict the labeled value on a visually similar document.
-The accuracy value range is a percentage between 0% (low) and 100% (high). The estimated accuracy is calculated by running a few different combinations of the training data to predict the labeled values.
+The output of a `build` (v3.0) or `train` (v2.1) custom model operation includes the estimated accuracy score. This score represents the model's ability to accurately predict the labeled value on a visually similar document. Accuracy is measured on a percentage scale from 0% (low) to 100% (high). It's best to target a score of 80% or higher. For more sensitive cases, like financial or medical records, we recommend a score close to 100%. You can also require human review.

 **Document Intelligence Studio** </br>
 **Trained custom model (invoice)**
@@ -50,6 +48,20 @@ Field confidence indicates an estimated probability between 0 and 1 that the pre

 :::image type="content" source="media/accuracy-confidence/confidence-scores.png" alt-text="Screenshot of confidence scores from Document Intelligence Studio.":::

+## Improve confidence scores
+
+After an analysis operation, review the JSON output. Examine the `confidence` values for each key/value result under the `pageResults` node. You should also look at the confidence score in the `readResults` node, which corresponds to the text-read operation. The confidence of the read results doesn't affect the confidence of the key/value extraction results, so you should check both. Here are some tips:
+
+* If the confidence score for the `readResults` object is low, improve the quality of your input documents.
+
+* If the confidence score for the `pageResults` object is low, ensure that the documents you're analyzing are of the same type.
+
+* Consider incorporating human review into your workflows.
+
+* Use forms that have different values in each field.
+
+* For custom models, use a larger set of training documents. Tagging more documents teaches your model to recognize fields with greater accuracy.
+
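As an illustration, the checks described in the tips above can be sketched in Python. The JSON shape here (an `analyzeResult` with key/value pairs under `pageResults` and words under `readResults`, each carrying a `confidence`) follows the v2.1-style output; the sample field names and values are illustrative:

```python
def low_confidence_fields(analyze_result, threshold=0.8):
    """Collect extraction results whose confidence falls below a review threshold."""
    flagged = []
    # Key/value extraction confidence lives under the pageResults node.
    for page in analyze_result.get("pageResults", []):
        for kv in page.get("keyValuePairs", []):
            if kv.get("confidence", 0.0) < threshold:
                flagged.append(("keyValuePair", kv["key"]["text"], kv["confidence"]))
    # Text-read (OCR) confidence is reported per word under the readResults node;
    # check it separately, because it doesn't affect the key/value confidence.
    for page in analyze_result.get("readResults", []):
        for line in page.get("lines", []):
            for word in line.get("words", []):
                if word.get("confidence", 0.0) < threshold:
                    flagged.append(("word", word["text"], word["confidence"]))
    return flagged

sample = {
    "pageResults": [{"keyValuePairs": [
        {"key": {"text": "Invoice No"}, "value": {"text": "1234"}, "confidence": 0.55},
    ]}],
    "readResults": [{"lines": [{"words": [{"text": "1234", "confidence": 0.99}]}]}],
}
print(low_confidence_fields(sample))
```

Results below the threshold can then be routed to human review, per the tips above.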
## Interpret accuracy and confidence scores for custom models

When interpreting the confidence score from a custom model, you should consider all the confidence scores returned from the model. Let's start with a list of all the confidence scores.

articles/ai-services/document-intelligence/concept-composed-models.md

Lines changed: 10 additions & 0 deletions
@@ -55,6 +55,16 @@ With the introduction of [**custom classification models**](./concept-custom-cla
 > [!NOTE]
 > With the addition of the **_custom neural model_**, there are a few limits to the compatibility of models that can be composed together.

+* With the model compose operation, you can assign up to 200 models to a single model ID. If the number of models that you want to compose exceeds the upper limit of a composed model, you can use one of these alternatives:
+
+  * Classify the documents before calling the custom model. You can use the [read model](concept-read.md) and build a classification based on the extracted text from the documents and certain phrases by using sources like code, regular expressions, or search.
+
+  * If you want to extract the same fields from various structured, semi-structured, and unstructured documents, consider using the deep-learning [custom neural model](concept-custom-neural.md). Learn more about the [differences between the custom template model and the custom neural model](concept-custom.md#compare-model-features).
+
+* Analyzing a document by using composed models is identical to analyzing a document by using a single model. The `Analyze Document` result returns a `docType` property that indicates which of the component models you selected for analyzing the document. There's no change in pricing for analyzing a document by using an individual custom model or a composed custom model.
+
+* Model compose is currently available only for custom models trained with labels.
+
 ### Composed model compatibility

 |Custom model type|Models trained with v2.1 and v2.0 |Custom template models v3.0 |Custom neural models v3.0|Custom neural models v3.1|
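The first alternative above (classify documents before calling a custom model, using text extracted by the read model plus regular expressions) might be sketched like this; the phrases and model IDs are illustrative, not part of the service:

```python
import re

# Route a document to a custom model ID based on phrases found in the text
# extracted by the read model. Patterns and model IDs are illustrative.
ROUTES = [
    (re.compile(r"loan application", re.I), "loan-application-model"),
    (re.compile(r"bank statement", re.I), "bank-statement-model"),
    (re.compile(r"invoice\s+(number|no\.?)", re.I), "invoice-model"),
]

def classify(extracted_text):
    """Return the model ID whose phrase pattern first matches the extracted text."""
    for pattern, model_id in ROUTES:
        if pattern.search(extracted_text):
            return model_id
    return None  # fall back to a default model or manual review

print(classify("Enclosed is my loan application and supporting documents."))
```

Each matched model ID can then be used in a separate analyze call, sidestepping the 200-model compose limit.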

articles/ai-services/document-intelligence/concept-custom-classifier.md

Lines changed: 3 additions & 3 deletions
@@ -6,7 +6,7 @@ author: vkurpad
 manager: nitinme
 ms.service: azure-ai-document-intelligence
 ms.topic: conceptual
-ms.date: 02/29/2024
+ms.date: 06/26/2024
 ms.author: lajanuar
 ms.custom:
 - references_regions
@@ -49,9 +49,9 @@ Custom classification models are deep-learning-model types that combine layout a

 Custom classification models can analyze a single-file or multi-file document to identify whether any of the trained document types are contained within an input file. Here are the currently supported scenarios:

-* A single file containing one document. For instance, a loan application form.
+* A single file containing one document type, such as a loan application form.

-* A single file containing multiple documents. For instance, a loan application package containing a loan application form, payslip, and bank statement.
+* A single file containing multiple document types. For instance, a loan application package that contains a loan application form, payslip, and bank statement.

 * A single file containing multiple instances of the same document. For instance, a collection of scanned invoices.

articles/ai-services/document-intelligence/concept-custom.md

Lines changed: 26 additions & 0 deletions
@@ -103,6 +103,28 @@ If the language of your documents and extraction scenarios supports custom neura

 * For custom classification model training, the total size of training data is `1GB` with a maximum of 10,000 pages.

+### Optimal training data
+
+Training input data is the foundation of any machine learning model. It determines the quality, accuracy, and performance of the model. Therefore, it's crucial to create the best possible training input data for your Document Intelligence project. When you use a Document Intelligence custom model, you provide your own training data. Here are a few tips to help train your models effectively:
+
+* Use text-based instead of image-based PDFs when possible. One way to identify an image-based PDF is to try selecting specific text in the document. If you can select only the entire image of the text, the document is image-based, not text-based.
+
+* Organize your training documents by using a subfolder for each format (JPEG/JPG, PNG, BMP, PDF, or TIFF).
+
+* Use forms that have all of the available fields completed.
+
+* Use forms with differing values in each field.
+
+* If your images are low quality, use a larger dataset (more than five training documents).
+
+* Determine if you need to use a single model or multiple models composed into a single model.
+
+* Model accuracy can decrease when different formats are analyzed with a single model. Plan on segmenting your dataset into folders, where each folder is a unique template. Train one model per folder, and compose the resulting models into a single endpoint.
+
+* Custom forms rely on a consistent visual template. If your form has variations in formats and page breaks, consider segmenting your dataset to train multiple models.
+
+* Ensure that you have a balanced dataset by accounting for formats, document types, and structure.
+
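The subfolder-per-format tip above can be sketched as a small script; the folder layout and file handling here are illustrative, not a Document Intelligence requirement:

```python
import shutil
from pathlib import Path

# Training document formats named in the guidance above.
FORMATS = {"jpeg", "jpg", "png", "bmp", "pdf", "tiff", "tif"}

def organize_by_format(source_dir):
    """Move each training document into a subfolder named after its file format."""
    source = Path(source_dir)
    # Materialize the listing first so moves don't disturb iteration.
    for doc in list(source.iterdir()):
        if not doc.is_file():
            continue
        ext = doc.suffix.lower().lstrip(".")
        if ext in FORMATS:
            target = source / ext
            target.mkdir(exist_ok=True)
            shutil.move(str(doc), str(target / doc.name))
```

Running it once over a flat training folder yields one subfolder per format, ready for per-template model training.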
### Build mode

The build custom model operation adds support for the *template* and *neural* custom models. Previous versions of the REST API and client libraries only supported a single build mode that is now known as the *template* mode.
@@ -149,6 +171,10 @@ Document Intelligence v3.1 and later models support the following tools, applica
 |---|---|:---|
 |Custom model| &bullet; [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio/customform/projects)</br>&bullet; [REST API](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-2023-07-31&preserve-view=true&tabs=HTTP)</br>&bullet; [C# SDK](quickstarts/get-started-sdks-rest-api.md?view=doc-intel-3.0.0&preserve-view=true)</br>&bullet; [Python SDK](quickstarts/get-started-sdks-rest-api.md?view=doc-intel-3.0.0&preserve-view=true)|***custom-model-id***|

+## Custom model life cycle
+
+The life cycle of a custom model depends on the API version used to train it. If the API version is a general availability (GA) version, the custom model has the same life cycle as that version, and it's no longer available for inference when that API version is deprecated. If the API version is a preview version, the custom model has the same life cycle as the preview version of the API.
+
 :::moniker-end

 ::: moniker range="doc-intel-2.1.0"

articles/ai-services/document-intelligence/concept-document-intelligence-studio.md

Lines changed: 1 addition & 1 deletion
@@ -8,7 +8,7 @@ ms.service: azure-ai-document-intelligence
 ms.custom:
 - ignite-2023
 ms.topic: conceptual
-ms.date: 05/10/2024
+ms.date: 06/26/2024
 ms.author: lajanuar
 monikerRange: '>=doc-intel-3.0.0'
 ---

articles/ai-services/document-intelligence/concept-layout.md

Lines changed: 10 additions & 0 deletions
@@ -573,6 +573,16 @@ if page.selection_marks:

 Extracting tables is a key requirement for processing documents that contain large volumes of data typically formatted as tables. The Layout model extracts tables in the `pageResults` section of the JSON output. Extracted table information includes the number of columns and rows, row span, and column span. Each cell with its bounding polygon is output, along with information about whether the area is recognized as a `columnHeader`. The model supports extracting tables that are rotated. Each table cell contains the row and column index and bounding polygon coordinates. For the cell text, the model outputs the `span` information containing the starting index (`offset`). The model also outputs the `length` within the top-level content that contains the full text from the document.

+Here are a few factors to consider when using the Document Intelligence table extraction capability:
+
+* Is the data that you want to extract presented as a table, and is the table structure meaningful?
+
+* If the data isn't in a table format, can the data fit in a two-dimensional grid?
+
+* Do your tables span multiple pages? If so, to avoid having to label all the pages, split the PDF into pages before sending it to Document Intelligence. After the analysis, post-process the pages to a single table.
+
+* If you're creating custom models, refer to [Labeling as tables](quickstarts/try-document-intelligence-studio.md#labeling-as-tables). Dynamic tables have a variable number of rows for each column. Fixed tables have a constant number of rows for each column.
+
 > [!NOTE]
 > Table is not supported if the input file is XLSX.
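As a sketch of post-processing the `pageResults` table output, the following reassembles one extracted table into a two-dimensional grid. The payload shape here (a table with `rows`, `columns`, and cells carrying `rowIndex`/`columnIndex`) mirrors the v2.1-style output, and the sample values are illustrative:

```python
def table_to_grid(table):
    """Place each extracted cell's text into a rows-by-columns grid."""
    rows, cols = table["rows"], table["columns"]
    grid = [["" for _ in range(cols)] for _ in range(rows)]
    for cell in table["cells"]:
        grid[cell["rowIndex"]][cell["columnIndex"]] = cell["text"]
    return grid

sample_table = {
    "rows": 2, "columns": 2,
    "cells": [
        {"rowIndex": 0, "columnIndex": 0, "text": "Item"},
        {"rowIndex": 0, "columnIndex": 1, "text": "Amount"},
        {"rowIndex": 1, "columnIndex": 0, "text": "Widget"},
        {"rowIndex": 1, "columnIndex": 1, "text": "9.99"},
    ],
}
print(table_to_grid(sample_table))
```

The same loop can merge per-page tables into a single grid for the multi-page scenario described above.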

articles/ai-services/document-intelligence/concept-model-overview.md

Lines changed: 34 additions & 2 deletions
@@ -41,6 +41,11 @@ ms.author: lajanuar
 Azure AI Document Intelligence supports a wide variety of models that enable you to add intelligent document processing to your apps and flows. You can use a prebuilt domain-specific model or train a custom model tailored to your specific business need and use cases. Document Intelligence can be used with the REST API or Python, C#, Java, and JavaScript client libraries.
 ::: moniker-end

+> [!NOTE]
+>
+> * Document processing projects that involve financial data, protected health data, personal data, or highly sensitive data require careful attention.
+> * Be sure to comply with all [national/regional and industry-specific requirements](https://azure.microsoft.com/resources/microsoft-azure-compliance-offerings/).
+
 ## Model overview

 The following table shows the available models for each current preview and stable API:
@@ -73,6 +78,10 @@ The following table shows the available models for each current preview and stab

 \* - Contains sub-models. See the model-specific information for supported variations and sub-types.

+### Latency
+
+Latency is the amount of time it takes for an API server to handle and process an incoming request and deliver the outgoing response to the client. The time to analyze a document depends on its size (for example, the number of pages) and the content on each page. Document Intelligence is a multi-tenant service where latency for similar documents is comparable but not always identical. Occasional variability in latency and performance is inherent in any microservice-based, stateless, asynchronous service that processes images and large documents at scale. Although we're continuously scaling up hardware and capacity, you might still see latency issues at runtime.
+
 |**Add-on Capability**| **Add-On/Free**|&bullet; [2024-02-29-preview](/rest/api/aiservices/document-models/build-model?view=rest-aiservices-2024-02-29-preview&preserve-view=true&branch=docintelligence&tabs=HTTP) <br>&bullet; [2023-10-31-preview](/rest/api/aiservices/operation-groups?view=rest-aiservices-2024-02-29-preview&preserve-view=true)|[`2023-07-31` (GA)](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-2023-07-31&preserve-view=true&tabs=HTTP)|[`2022-08-31` (GA)](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-v3.0%20(2022-08-31)&preserve-view=true&tabs=HTTP)|[v2.1 (GA)](/rest/api/aiservices/analyzer?view=rest-aiservices-v2.1&preserve-view=true)|
 |----------------|-----------|---|--|---|---|
 |Font property extraction|Add-On| ✔️| ✔️| n/a| n/a|
@@ -111,6 +120,14 @@ Add-On* - Query fields are priced differently than the other add-on features. Se
 | [Custom classification model](#custom-classifier)| The **custom classification model** can classify each page in an input file to identify the documents within, and it can also identify multiple documents or multiple instances of a single document within an input file.
 | [Composed models](#composed-models) | Combine several custom models into a single model to automate processing of diverse document types with a single composed model.

+### Bounding box and polygon coordinates
+
+A bounding box (`polygon` in v3.0 and later versions) is an abstract rectangle that surrounds text elements in a document, used as a reference point for object detection.
+
+* The bounding box specifies position by using an x and y coordinate plane presented in an array of four numerical pairs. Each pair represents a corner of the box in the following order: upper left, upper right, lower right, lower left.
+
+* For an image, coordinates are presented in pixels. For a PDF, coordinates are presented in inches.
+
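The corner ordering and coordinate conventions above can be illustrated with a short sketch; the sample polygon values are made up:

```python
# Read the four corner pairs of a bounding polygon (order: upper left,
# upper right, lower right, lower left) and derive the axis-aligned extent.
def polygon_corners(polygon):
    # polygon is a flat array of eight numbers: [x0, y0, x1, y1, x2, y2, x3, y3]
    pairs = list(zip(polygon[0::2], polygon[1::2]))
    labels = ["upper left", "upper right", "lower right", "lower left"]
    return dict(zip(labels, pairs))

def extent(polygon):
    """Return (min_x, min_y, max_x, max_y), which also handles rotated boxes."""
    xs, ys = polygon[0::2], polygon[1::2]
    return (min(xs), min(ys), max(xs), max(ys))

sample = [100, 50, 300, 50, 300, 120, 100, 120]  # pixel units for an image
print(polygon_corners(sample)["upper left"])  # (100, 50)
print(extent(sample))  # (100, 50, 300, 120)
```

Remember that the units differ by source: pixels for images, inches for PDFs.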
 For all models except the Business card model, Document Intelligence now supports add-on capabilities that allow for more sophisticated analysis. These optional capabilities can be enabled and disabled depending on the scenario of the document extraction. There are seven add-on capabilities available for the `2023-07-31` (GA) and later API versions:

 * [`ocrHighResolution`](concept-add-on-capabilities.md#high-resolution-extraction)
@@ -121,7 +138,22 @@ For all models, except Business card model, Document Intelligence now supports a
 * [`keyValuePairs`](concept-add-on-capabilities.md#key-value-pairs) (2024-02-29-preview, 2023-10-31-preview)
 * [`queryFields`](concept-add-on-capabilities.md#query-fields) (2024-02-29-preview, 2023-10-31-preview) `Not available with the US.Tax models`

-## Model details
+## Language support
+
+The deep-learning-based universal models in Document Intelligence support many languages and can extract multilingual text from your images and documents, including text lines with mixed languages.
+Language support varies by Document Intelligence service functionality. For a complete list, see the following articles:
+
+* [Language support: document analysis models](language-support-ocr.md)
+* [Language support: prebuilt models](language-support-prebuilt.md)
+* [Language support: custom models](language-support-custom.md)
+
+## Regional availability
+
+Document Intelligence is generally available in many of the [60+ Azure global infrastructure regions](https://azure.microsoft.com/global-infrastructure/services/?products=metrics-advisor&regions=all#select-product).
+
+For more information, see the [Azure geographies](https://azure.microsoft.com/global-infrastructure/geographies/#overview) page to help choose the region that's best for you and your customers.
+
+## Model details

 This section describes the output you can expect from each model. You can extend the output of most models with add-on features.

@@ -295,7 +327,7 @@ Custom models can be broadly classified into two types. Custom classification mo

 Custom document models analyze and extract data from forms and documents specific to your business. They're trained to recognize form fields within your distinct content and extract key-value pairs and table data. You only need one example of the form type to get started.

-Version v3.0 custom model supports signature detection in custom template (form) and cross-page tables in both template and neural models.
+Custom models in v3.0 and later support signature detection in custom template (form) models and cross-page tables in both template and neural models. [Signature detection](quickstarts/try-document-intelligence-studio.md#signature-detection) looks for the presence of a signature, not the identity of the person who signs the document. If the model returns **unsigned** for signature detection, the model didn't find a signature in the defined field.

 ***Sample custom template processed using [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio/customform/projects)***:

articles/ai-services/document-intelligence/containers/install-run.md

Lines changed: 2 additions & 0 deletions
@@ -683,6 +683,8 @@ The Document Intelligence containers send billing information to Azure by using

 Queries to the container are billed at the pricing tier of the Azure resource used for the API `Key`. You're billed for each container instance used to process your documents and images.

+If you receive the error *Container isn't in a valid state. Subscription validation failed with status 'OutOfQuota' API key is out of quota*, it's an indicator that your containers aren't communicating with the billing endpoint.
+
 ### Connect to Azure

 The container needs the billing argument values to run. These values allow the container to connect to the billing endpoint. The container reports usage about every 10 to 15 minutes. If the container doesn't connect to Azure within the allowed time window, the container continues to run, but doesn't serve queries until the billing endpoint is restored. The connection is attempted 10 times at the same time interval of 10 to 15 minutes. If it can't connect to the billing endpoint within the 10 tries, the container stops serving requests. See the [Azure AI container FAQ](../../../ai-services/containers/container-faq.yml#how-does-billing-work) for an example of the information sent to Microsoft for billing.
