Commit 19c575f

Merge pull request #2279 from laujan/aditi-2249-2248-2194
[Azure AI Svcs] address aditi prs
2 parents ef02dbe + 04db9d6 commit 19c575f

3 files changed

+47
-37
lines changed

articles/ai-services/document-intelligence/faq.yml

Lines changed: 16 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -7,7 +7,7 @@ metadata:
77
ms.service: azure-ai-document-intelligence
88
ms.custom: references_regions
99
ms.topic: faq
10-
ms.date: 11/19/2024
10+
ms.date: 01/14/2025
1111
ms.author: lajanuar
1212
title: Frequently asked questions
1313
summary: |
@@ -62,7 +62,7 @@ sections:
6262
Can Document Intelligence help with semantic chunking within documents for retrieval-augmented generation?
6363
answer: |
6464
**Yes.**
65-
65+
6666
Document Intelligence can provide the building blocks to enable semantic chunking. Semantic chunking is a key step in retrieval-augmented generation (RAG) that produces context-dense chunks and improves relevance.
6767
6868
- Document Intelligence offers a layout model that provides a visual decomposition of the document into lines, paragraphs, sections, headers, and footers.
@@ -133,14 +133,14 @@ sections:
133133
answer: |
134134
**Yes.**
135135
136-
If your Document Intelligence resource is configured with a firewall or virtual network, you need to add the dedicated IP address 20.3.165.95 to the firewall allowlist for your Document Intelligence resource. Some functions in custom projects (for example, autolabel, project management and human in the loop) don't work if the public network access is disabled.
136+
For `v4.0 11-30-2024 (GA)`, auto labeling is hosted natively with the rest of the service, so there's no need for IP allowlisting. For any previous version, if your Document Intelligence resource is configured with a firewall or virtual network, you need to add the dedicated IP address 20.3.165.95 to the firewall allowlist for your Document Intelligence resource. Some functions in custom projects (for example, autolabel, project management and human in the loop) don't work if the public network access is disabled.
137137
138138
- question: |
139139
When I upload a file in Document Intelligence Studio by "Fetch from URL" function, can I use a URL from my blob storage?
140140
141141
answer: |
142-
**Yes.**
143-
142+
**Yes.**
143+
144144
You can use the **Fetch** function if your Azure blob storage URL includes a SAS token and is accessible from public networks. You can't use the **Fetch** function for storage accounts where key access is disabled or that are behind a firewall/VNet.
145145
146146
- question: |
@@ -171,7 +171,7 @@ sections:
171171
172172
Document Intelligence offers the latest development options within the following platforms:
173173
174-
- [REST API](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-2023-07-31 &preserve-view=true&tabs=HTTP)
174+
- [REST API](/rest/api/aiservices/document-models/analyze-document?view=rest-aiservices-2023-07-31&preserve-view=true&tabs=HTTP)
175175
176176
- [Document Intelligence Studio](https://formrecognizer.appliedai.azure.com/studio)
177177
@@ -295,14 +295,21 @@ sections:
295295
296296
The copy operation is limited to copying models within the specific cloud environment where you trained the model. For instance, copying models from the public cloud to the Azure Government cloud isn't supported.
297297
298+
- question: |
299+
Am I charged when using auto labeling?
300+
301+
answer: |
302+
**Yes.**
303+
Auto labeling a document incurs a cost equivalent to an analyze request for the corresponding model.
304+
298305
- question: |
299306
Am I charged when training custom models?
300307
answer: |
301308
**Yes.**
302309
303-
For `v4.0 11-30-2024 (GA)` custom neural models can be trained for free for a **maximum of 10 hours**. Whether you're training a single model for the 10 hours, or training multiple models for the total of 10 hours, you aren't charged for the first 10 hours. After using up the free 10 hours, you're **automatically charged by the extra training hour**. For details on the prices, refer to the [pricing page](https://azure.microsoft.com/pricing/details/ai-document-intelligence/). This new paid training feature enables training models for an extended duration to process larger documents. For more information on this paid training feature, check [custom neural model billing section](train/custom-neural.md#billing).
304-
305-
For `v3.0 2022-08-31` or `v3.1 2023-07-31`, custom neural models can be trained for free for a maximum of 20 training sessions, with each session capped at 30 minutes of training duration. Once you use up all of the 20 training sessions, you can submit Azure support ticket to increase the training session limit. To increase the limit, two training sessions are considered as one training hour, and you're charged per two sessions / one training hour. For details on the prices, refer to the [pricing page]. For more information on ways to increase the limit, check [custom neural model billing section](train/custom-neural.md#billing). **For `v3.0` and `v3.1`, paid training feature is unavailable. Paid training feature for custom neural model is only available on `v4.0`.**
310+
For `v4.0 11-30-2024 (GA)`, custom neural models can be trained for free for a **maximum of 10 hours**. Whether you're training a single model for the 10 hours, or training multiple models for a total of 10 hours, you aren't charged for the first 10 hours. After using up the free 10 hours, you're **automatically charged for each extra training hour**. For details on prices, refer to the [pricing page](https://azure.microsoft.com/pricing/details/ai-document-intelligence/). This new paid training feature enables training models for an extended duration to process larger documents. For more information on this paid training feature, see the [custom neural model billing section](train/custom-neural.md#billing).
311+
312+
For `v3.0 2022-08-31` or `v3.1 2023-07-31`, custom neural models can be trained for free for a maximum of 20 training sessions, with each session capped at 30 minutes of training duration. Once you use up all 20 training sessions, you can submit an Azure support ticket to increase the training session limit. When the limit is increased, two training sessions are considered one training hour, and you're charged per two sessions / one training hour. For details on the prices, refer to the [pricing page](https://azure.microsoft.com/pricing/details/ai-document-intelligence/). For more information on ways to increase the limit, see the [custom neural model billing section](train/custom-neural.md#billing). **For `v3.0` and `v3.1`, the paid training feature is unavailable. The paid training feature for custom neural models is only available on `v4.0`.**
306313
307314
- name: Storage account
308315
questions:
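
The training-charge rules in the FAQ answer above can be sketched as simple arithmetic. This is a minimal illustration only, not the service's billing logic: the function names are hypothetical, and rounding a partial hour (or an odd session) up to a whole training hour is an assumption that the FAQ text doesn't confirm.

```python
import math

FREE_HOURS_V4 = 10.0  # v4.0: free training hours, per the FAQ answer


def billable_hours_v4(total_training_hours: float) -> int:
    """Training hours charged under v4.0 once the free 10 hours are used.

    Assumption: a partial extra hour is rounded up to a whole hour.
    """
    extra = max(0.0, total_training_hours - FREE_HOURS_V4)
    return math.ceil(extra)


def billable_hours_v3(sessions_beyond_free: int) -> int:
    """Training hours charged under v3.0/v3.1 after a limit increase.

    Per the FAQ, two 30-minute sessions count as one training hour.
    Assumption: an odd remaining session is rounded up to a whole hour.
    """
    return math.ceil(sessions_beyond_free / 2)


if __name__ == "__main__":
    print(billable_hours_v4(8))     # within the free allowance
    print(billable_hours_v4(12.5))  # 2.5 extra hours
    print(billable_hours_v3(4))     # four extra sessions
```

Multiply the result by the hourly rate on the pricing page to estimate a charge; the rate itself is deliberately left out here.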

articles/ai-services/document-intelligence/train/custom-neural.md

Lines changed: 23 additions & 20 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ author: laujan
66
manager: nitinme
77
ms.service: azure-ai-document-intelligence
88
ms.topic: conceptual
9-
ms.date: 11/19/2024
9+
ms.date: 01/14/2025
1010
ms.author: lajanuar
1111
ms.custom:
1212
- references_regions
@@ -20,7 +20,6 @@ monikerRange: '>=doc-intel-3.0.0'
2020

2121
# Document Intelligence custom neural model
2222

23-
2423
:::moniker range="doc-intel-4.0.0"
2524
**This content applies to:**![checkmark](../media/yes-icon.png) **v4.0 (GA)** | **Previous versions:** ![blue-checkmark](../media/blue-yes-icon.png) [**v3.1 (GA)**](?view=doc-intel-3.1.0&preserve-view=tru) ![blue-checkmark](../media/blue-yes-icon.png) [**v3.0 (GA)**](?view=doc-intel-3.0.0&preserve-view=tru)
2625
::: moniker-end
@@ -45,7 +44,7 @@ Custom neural models share the same labeling format and strategy as [custom temp
4544
## Model capabilities
4645

4746
> [!IMPORTANT]
48-
> Custom neural v4.0 `2024-11-30` (GA) model supports overlapping fields and table cell confidence.
47+
> Custom neural v4.0 `2024-11-30` (GA) model supports signature detection, table cell confidence, and overlapping fields.
4948
5049
Custom neural models currently support key-value pairs, selection marks, and structured fields (tables).
5150

@@ -62,22 +61,9 @@ The `Build` operation supports *template* and *neural* custom models. Previous v
6261

6362
Neural models support documents that have the same information, but different page structures. Examples of these documents include United States W2 forms, which share the same information, but can vary in appearance across companies. For more information, *see* [Custom model build mode](custom-model.md#build-mode).
6463

65-
### Overlapping fields
66-
67-
Custom neural v4.0 `2024-11-30` (GA) model supports overlapping fields:
68-
69-
To use the overlapping fields, your dataset needs to contain at least one sample with the expected overlap. To label an overlap, use **region labeling** to designate each of the spans of content (with the overlap) for each field. Labeling an overlap with field selection (highlighting a value) fails in the Studio as region labeling is the only supported labeling tool for indicating field overlaps. Overlap support includes:
70-
71-
* Complete overlap. The same set of tokens are labeled for two different fields.
72-
* Partial overlap. Some tokens belong to both fields, but there are tokens that are only part of one field or the other.
73-
74-
Overlapping fields have some limits:
75-
76-
* Any token or word can only be labeled as two fields.
77-
* overlapping fields in a table can't span table rows.
78-
* Overlapping fields can only be recognized if at least one sample in the dataset contains overlapping labels for those fields.
64+
### Signature detection
7965

80-
To use overlapping fields, label your dataset with the overlaps and train the model with the API version ``**2024-11-30 (GA)**``.
66+
Custom neural v4.0 2024-11-30 (GA) model supports signature detection. To label a signature, set the field type to Signature and draw the region for the signature. A signature field supports only one drawn region per field.
8167

8268
## Tabular fields
8369

@@ -94,7 +80,7 @@ Tabular fields support **cross page tables** by default:
9480

9581
Tabular fields are also useful when extracting repeating information within a document that isn't recognized as a table. For example, a repeating section of work experiences in a resume can be labeled and extracted as a tabular field.
9682

97-
Tabular fields provide **table, row and cell confidence** with the ``**2024-11-30 (GA)**`` API:
83+
Tabular fields provide **table, row and cell confidence** with the `2024-11-30 (GA)` API:
9884

9985
* Fixed or dynamic tables add confidence support for the following elements:
10086
* Table confidence, a measure of how accurately the entire table is recognized.
@@ -103,6 +89,23 @@ Tabular fields provide **table, row and cell confidence** with the ``**2024-11-3
10389

10490
* The recommended approach is to review the accuracy in a top-down manner starting with the table first, followed by the row and then the cell. See [confidence and accuracy scores](../concept/accuracy-confidence.md) to learn more about table, row, and cell confidence.
10591

92+
### Overlapping fields
93+
94+
Custom neural v4.0 2024-11-30 (GA) model supports overlapping fields:
95+
96+
To use the overlapping fields, your dataset needs to contain at least one sample with the expected overlap. To label an overlap, use **region labeling** to designate each of the spans of content (with the overlap) for each field. Labeling an overlap with field selection (highlighting a value) fails in the Studio as region labeling is the only supported labeling tool for indicating field overlaps. Overlap support includes:
97+
98+
* Complete overlap. The same set of tokens is labeled for two different fields.
99+
* Partial overlap. Some tokens belong to both fields, but there are tokens that are only part of one field or the other.
100+
101+
Overlapping fields have some limits:
102+
103+
* Any token or word can only be labeled as two fields.
104+
* Overlapping fields in a table can't span table rows.
105+
* Overlapping fields can only be recognized if at least one sample in the dataset contains overlapping labels for those fields.
106+
107+
To use overlapping fields, label your dataset with the overlaps and train the model with the API version `2024-11-30 (GA)`.
108+
106109
### Supported languages and locales
107110

108111
*See* our [Language Support—custom models](../language-support/custom.md#custom-neural) for a complete list of supported languages.
@@ -203,7 +206,7 @@ Custom neural models differ from custom template models in a few different ways.
203206

204207
* Custom neural model doesn't recognize values split across page boundaries.
205208
* Custom neural unsupported field types are ignored if a dataset labeled for custom template models is used to train a custom neural model.
206-
* Custom neural models are limited to 20 build operations per month. Open a support request if you need the limit increased. For more information, see [Document Intelligence service quotas and limits](../service-limits.md).
209+
* Custom neural models are limited to 20 build operations per month for versions 3.x. Open a support request if you need the limit increased. For more information, see [Document Intelligence service quotas and limits](../service-limits.md).
207210

208211
## Training a model
209212
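
The overlap rules in the custom-neural.md changes above (complete versus partial overlap, and the limit of two fields per token) can be checked on labeled data with a short sketch. Everything here is an assumption for illustration: fields are modeled as sets of integer token indexes, and the function names are hypothetical rather than part of any Document Intelligence SDK or label-file format.

```python
from collections import Counter
from typing import Dict, Set


def classify_overlap(a: Set[int], b: Set[int]) -> str:
    """Classify how two fields' token sets overlap."""
    if not (a & b):
        return "none"      # no shared tokens
    if a == b:
        return "complete"  # the same set of tokens is labeled for both fields
    return "partial"       # some tokens shared, some belong to only one field


def tokens_over_limit(fields: Dict[str, Set[int]], max_fields: int = 2) -> Set[int]:
    """Return tokens labeled in more fields than the documented limit of two."""
    counts = Counter(tok for tokens in fields.values() for tok in tokens)
    return {tok for tok, n in counts.items() if n > max_fields}


if __name__ == "__main__":
    labels = {"name": {1, 2}, "signer": {2, 3}, "title": {2}}
    print(classify_overlap(labels["name"], labels["signer"]))
    print(tokens_over_limit(labels))
```

A check like this could run before training to catch tokens assigned to three or more fields. It doesn't cover the rule that overlapping fields in a table can't span rows, which would need row information that this sketch doesn't model.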

articles/ai-services/document-intelligence/whats-new.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ author: laujan
66
manager: nitinme
77
ms.service: azure-ai-document-intelligence
88
ms.topic: whats-new
9-
ms.date: 12/17/2024
9+
ms.date: 01/14/2025
1010
ms.author: lajanuar
1111
ms.custom:
1212
- references_regions
@@ -25,11 +25,11 @@ ms.custom:
2525
Document Intelligence service is updated on an ongoing basis. Bookmark this page to stay up to date with release notes, feature enhancements, and our newest documentation.
2626

2727
> [!IMPORTANT]
28-
> Preview API versions are retired once the GA API is released. The 2023-02-28-preview API version is being retired, if you are still using the preview API or the associated SDK versions, please update your code to target the latest API version 2024-11-30 (GA). </br>
28+
> Preview API versions are retired once the GA API is released. The 2023-02-28-preview API version is retiring. If you're still using the preview API or the associated SDK versions, update your code to target the latest API version `2024-11-30 (GA)`. </br>
2929
3030
## December 2024
3131

32-
**Document Intelligence v4.0 programming language SDKs are now generally available (GA)**! <br><br>The latest client SDKs default to the [**2024-11-30 REST API (GA)**](/rest/api/aiservices/operation-groups?view=rest-aiservices-v4.0%20(2024-11-30)&preserve-view=true) version of the service.<br><br>
32+
**Document Intelligence v4.0 programming language SDKs are now generally available (GA)**! <br><br>The latest client libraries default to the [**2024-11-30 REST API (GA)**](/rest/api/aiservices/operation-groups?view=rest-aiservices-v4.0%20(2024-11-30)&preserve-view=true) version of the service.<br><br>
3333
For more information, *see* client libraries for the following supported programming languages:
3434

3535
* [🆕 .NET (C#)](versioning/changelog-release-history.md?view=doc-intel-4.0.0&tabs=csharp&preserve-view=true)
@@ -53,12 +53,13 @@ For more information, *see* client libraries for the following supported program
5353
* 🆕 Searchable PDF. The [prebuilt read](prebuilt/read.md) model now supports images formats (JPEG/JPG, PNG, BMP, TIFF, HEIF) and language expansion to include Chinese, Japanese, and Korean for [PDF output](prebuilt/read.md#searchable-pdf).
5454

5555
* [Custom classification model](train/custom-model.md#custom-classification-model)
56-
* Custom classification model supports incremental training. You can add new samples to exisisting classes or add new classes by referencing an existing classifier.
56+
* Custom classification model supports incremental training. You can add new samples to existing classes or add new classes by referencing an existing classifier.
5757
* With v4.0, the custom classification model doesn't split documents by default during analysis. You need to explicitly set the `splitMode` property to `auto` to preserve the older behavior.
5858
* Custom classification model now supports 25,000 pages as new training page limit.
5959

6060
* [Custom Neural Model](train/custom-neural.md)
6161
* Custom Neural model now supports signature detection.
62+
* Custom neural models support paid training for a longer duration when you need to train a model with a larger labeled dataset. The first 20 training runs in a calendar month continue to be free. Any training operations beyond 20 are on the paid tier. Learn more about [billing](train/custom-neural.md#billing).
6263

6364
* [ US Bank statement model](concept-bank-statement.md)
6465
* US Bank Statement Model now supports check table extraction.
@@ -221,14 +222,13 @@ The Document Intelligence [**2023-10-31-preview**](/rest/api/aiservices/document
221222
* Add-on capabilities are available within all models excluding the [Read model](prebuilt/read.md).
222223

223224
>[!NOTE]
224-
> With the 2022-08-31 API general availability (GA) release, the associated preview APIs are being deprecated. If you are using the 2021-09-30-preview, the 2022-01-30-preview or he 2022-06-30-preview API versions, please update your applications to target the 2022-08-31 API version. There are a few minor changes involved, for more information, _see_ the [migration guide](v3-1-migration-guide.md).
225+
> With the 2022-08-31 API general availability (GA) release, the associated preview APIs are being deprecated. If you're using the 2021-09-30-preview, 2022-01-30-preview, or 2022-06-30-preview API versions, update your applications to target the 2022-08-31 API version. There are a few minor changes involved. For more information, _see_ the [migration guide](v3-1-migration-guide.md).
225226
226227
## July 2023
227228

228229
> [!NOTE]
229230
> Form Recognizer is now **Azure AI Document Intelligence**!
230231
>
231-
> * Document, Azure AI services encompass all of what were previously known as Cognitive Services and Azure Applied AI Services.
232232
> * There are no changes to pricing.
233233
> * The names *Cognitive Services* and *Azure Applied AI* continue to be used in Azure billing, cost analysis, price list, and price APIs.
234234
> * There are no breaking changes to application programming interfaces (APIs) or client libraries.
@@ -265,7 +265,7 @@ The v3.1 API introduces new and updated capabilities:
265265
:::image type="content" source="media/studio/analyze-options.gif" alt-text="Animated screenshot showing use of the analyze-options button to configure options in Studio.":::
266266

267267
> [!NOTE]
268-
> Font extraction is not visualized in Document Intelligence Studio. However, you can check the styles section of the JSON output for the font detection results.
268+
> Font extraction isn't visualized in Document Intelligence Studio. However, you can check the styles section of the JSON output for the font detection results.
269269
270270
✔️ **Auto labeling documents with prebuilt models or one of your own models**
271271

@@ -496,7 +496,7 @@ The v3.1 API introduces new and updated capabilities:
496496
## September 2022
497497

498498
>[!NOTE]
499-
> Starting with version 4.0.0, a new set of clients has been introduced to leverage the newest features of the Document Intelligence service.
499+
> Starting with version 4.0.0, a new set of clients is introduced to apply the newest features of the Document Intelligence service.
500500
501501
**SDK version 4.0.0 GA release includes the following updates:**
502502
