
Commit 8bd1e12

update main
1 parent 1f9e8d4 commit 8bd1e12

File tree

7 files changed: +43 -50 lines changed


articles/applied-ai-services/form-recognizer/concept-query-field-extraction.md

Lines changed: 25 additions & 4 deletions
@@ -18,11 +18,32 @@ recommendations: false

**This article applies to:** ![Form Recognizer v3.0 checkmark](media/yes-icon.png) **Form Recognizer v3.0**.

-With query field extraction, you can easily extract any fields from your documents without the need for training. Simply specify the fields you want to extract and Form Recognizer will analyze the document accordingly. For instance, if you're dealing with a contract, you could pass a list of labels like "Party1, Party2, TermsOfUse, PaymentTerms, PaymentDate, TermEndDate" to Form Recognizer as part of the analyze document request. Form Recognizer will leverage the capabilities of both Azure Open AI and Form Recognizer to extract the information in the document and return the values in a structured JSON output. In addition to the query fields, the response will include text, tables, selection marks, general document key-value pairs, and other relevant data.
+> [!IMPORTANT]
+>
+> * The Form Recognizer Studio query fields extraction feature is currently in gated preview. Features, approaches, and processes may change, prior to General Availability (GA), based on user feedback.
+> * Complete and submit the [**Form Recognizer private preview request form**](https://aka.ms/form-recognizer/preview/survey) to request access.

-To get access to this new capability request access [here](https://customervoice.microsoft.com/Pages/ResponsePage.aspx?id=v4j5cvGGr0GRqy180BHbR7en2Ais5pxKtso_Pz4b1_xUQTRDQUdHMTBWUDRBQ01QUVNWNlNYMVFDViQlQCN0PWcu)
+Form Recognizer now supports query field extraction using Azure OpenAI capabilities. With query field extraction, you can add fields to the extraction process using a query request without the need for added training.

-<<Image 1>>
+> [!NOTE]
+>
+> Form Recognizer Studio query field extraction is currently available with the general document model for the `2023-02-28-preview` release.

-<<Image 2>>
+## Select query fields

+For query field extraction, specify the fields you want to extract and Form Recognizer will analyze the document accordingly. Here's an example:
+
+* If you're processing a contract in the Form Recognizer Studio, you can pass a list of field labels like `Party1`, `Party2`, `TermsOfUse`, `PaymentTerms`, `PaymentDate`, and `TermEndDate` as part of the analyze document request.
+
+:::image type="content" source="media/studio/query-field-select.png" alt-text="Screenshot of query fields selection window in Form Recognizer Studio.":::
+
+* Form Recognizer will utilize the capabilities of both Azure OpenAI and the extraction model to analyze and extract the field data and return the values in a structured JSON output.
+
+* In addition to the query fields, the response will include text, tables, selection marks, general document key-value pairs, and other relevant data.
+
+:::image type="content" source="media/studio/query-field-analyze.png" alt-text="Screenshot of query field analysis in Form Recognizer Studio.":::
+
+## Next steps
+
+> [!div class="nextstepaction"]
+> [Try the Form Recognizer Studio quickstart](./quickstarts/try-form-recognizer-studio.md)
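To make the query-field flow in this new article concrete, here's a minimal sketch (an editor's illustration, not part of the commit) of an analyze request that passes query fields to the general document model. The `features=queryFields.premium` and `queryFields` query parameters are assumptions about the gated `2023-02-28-preview` API; verify them against the published API reference before relying on them.

```python
# Hypothetical sketch: analyze a contract with query fields in the
# 2023-02-28-preview REST API. Parameter names "features" and "queryFields"
# are assumptions; endpoint, key, and file name are placeholders.
import requests

endpoint = "https://<your-resource>.cognitiveservices.azure.com"
key = "<your-key>"

url = (
    f"{endpoint}/formrecognizer/documentModels/prebuilt-document:analyze"
    "?api-version=2023-02-28-preview"
    "&features=queryFields.premium"
    "&queryFields=Party1,Party2,TermsOfUse,PaymentTerms,PaymentDate,TermEndDate"
)

with open("contract.pdf", "rb") as f:
    response = requests.post(
        url,
        headers={
            "Ocp-Apim-Subscription-Key": key,
            "Content-Type": "application/pdf",
        },
        data=f,
    )

# The analyze call is asynchronous: a 202 response carries an Operation-Location
# URL to poll for the structured JSON result, which includes the query fields
# alongside text, tables, selection marks, and key-value pairs.
print(response.status_code, response.headers.get("Operation-Location"))
```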

articles/applied-ai-services/form-recognizer/concept-query-fields.md

Lines changed: 0 additions & 28 deletions
This file was deleted.

articles/applied-ai-services/form-recognizer/faq.yml

Lines changed: 14 additions & 14 deletions
@@ -7,7 +7,7 @@ metadata:
ms.service: applied-ai-services
ms.subservice: forms-recognizer
ms.topic: faq
-ms.date: 02/07/2023
+ms.date: 03/09/2023
ms.author: lajanuar
monikerRange: '>=form-recog-2.1.0'
recommendations: false
@@ -39,7 +39,7 @@ sections:
Learn more about [use case considerations](/legal/cognitive-services/form-recognizer/fr-transparency-note?context=/azure/applied-ai-services/form-recognizer/context/context#considerations-when-choosing-other-use-cases).

- question: |
-What languages are supported by Form Recognizer?
+What languages does Form Recognizer support?
answer: |

Form Recognizer's deep-learning-based universal models support many languages that can extract multi-lingual text from your images and documents, including text lines with mixed languages.
@@ -92,7 +92,7 @@ sections:
- question: |
How can I improve accuracy scores?
answer: |
-The accuracy of a model is influenced by variances in the visual structure of your documents.
+Variances in the visual structure of your documents can influence the accuracy of a model.

- Ensure that all variations of a document are included in the training dataset. Variations include different formats, for example, digital versus scanned PDFs.

@@ -212,14 +212,14 @@ sections:

- The parameter `pages`(supported in both v2.1 and v3.0 REST API) enables you to specify pages for multi-page PDF and TIFF documents. Accepted input includes the following ranges:

-- Single pages (for example,'1, 2' -> pages 1 and 2 will be processed).- Finite (for example '2-5' -> pages 2 to 5 will be processed)
-- Open-ended ranges (for example '5-' -> all the pages from page 5 will be processed & for example, '-10' -> pages 1 to 10 will be processed).
+- Single pages (for example,'1, 2' -> pages 1 and 2 are processed).- Finite (for example '2-5' -> pages 2 to 5 are processed)
+- Open-ended ranges (for example '5-' -> all the pages from page 5 are processed & for example, '-10' -> pages 1 to 10 are processed).

-- These parameters can be mixed together and ranges are allowed to overlap (for example, '-5, 1, 3, 5-10' - pages 1 to 10 will be processed).
+- These parameters can be mixed together and ranges are allowed to overlap (for example, '-5, 1, 3, 5-10' - pages 1 to 10 are processed).

-- The service will accept the request if it can process at least one page of the document. For example, using '5-100' on a five page document is a valid input where page 5 will be processed.
+- The service accepts the request if it can process at least one page of the document. For example, using '5-100' on a five page document is a valid input where page 5 is processed.

-- If no page range is provided, the entire document will be processed.
+- If no page range is provided, the entire document is processed.

- question: |
Both Form Recognizer Studio and the FOTT Sample Labeling tool are available. Which one should I use?
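The `pages` ranges described in this answer map directly onto the analyze request. Here's a minimal sketch, assuming the `azure-ai-formrecognizer` Python SDK, which exposes `pages` as a keyword argument on `begin_analyze_document`; the endpoint, key, and file name are placeholders.

```python
# Sketch: restrict analysis to an overlapping page range, as in the FAQ example.
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = DocumentAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

with open("report.pdf", "rb") as f:
    # Overlapping ranges are allowed; "-5, 1, 3, 5-10" resolves to pages 1-10.
    poller = client.begin_analyze_document(
        "prebuilt-document", document=f, pages="-5, 1, 3, 5-10"
    )

result = poller.result()
print(f"Pages analyzed: {len(result.pages)}")
```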
@@ -243,9 +243,9 @@ sections:

- When analyzing PDF and TIFF files, each page in the PDF file or each image in the TIFF file is counted as one page with no maximum character limits.

-- When analyzing Microsoft Word and HTML files supported by only the Read model, pages are counted in blocks of 3,000 characters each. For example, if your document contains 7,000 characters, the two pages with 3,000 characters each and one page with 1,000 characters will add up to a total of three pages.
+- When analyzing Microsoft Word and HTML files supported by only the Read model, pages are counted in blocks of 3,000 characters each. For example, if your document contains 7,000 characters, the two pages with 3,000 characters each and one page with 1,000 characters adds up to a total of three pages.

-- In addition, when using the Read model, if your Microsoft Word, Excel, and PowerPoint pages have embedded images, each image will be analyzed and counted as a page. Therefore, the total analyzed pages for Microsoft Office documents will be equal to the sum of total text pages and total images analyzed. In the previous example if the document contains two embedded images, the total page count in the service output will be three text pages plus two images equaling a total of five pages.
+- In addition, when using the Read model, if your Microsoft Word, Excel, and PowerPoint pages have embedded images, each image is analyzed and counted as a page. Therefore, the total analyzed pages for Microsoft Office documents are equal to the sum of total text pages and total images analyzed. In the previous example if the document contains two embedded images, the total page count in the service output is three text pages plus two images equaling a total of five pages.

- Training a custom model is always free with Form Recognizer. You’re only charged when a model is used to analyze a document.

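The page-counting rules in this answer reduce to simple arithmetic. The following sketch only restates the FAQ's math (3,000-character blocks plus one page per embedded image for the Read model); it isn't service code.

```python
import math

def read_model_billed_pages(total_characters: int, embedded_images: int = 0) -> int:
    """Billed pages for Word/HTML/Office content under the Read model,
    per the rule above: 3,000 characters per text page, plus one page
    per embedded image."""
    text_pages = math.ceil(total_characters / 3000)
    return text_pages + embedded_images

# 7,000 characters -> 3 text pages; add 2 embedded images -> 5 billed pages.
print(read_model_billed_pages(7000))     # 3
print(read_model_billed_pages(7000, 2))  # 5
```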
@@ -278,7 +278,7 @@ sections:
Learn more about Form Recognizer [service quotas and limits](service-limits.md)

- question: |
-How long will it take to analyze a document?
+How long does it take to analyze a document?
answer: |
Form Recognizer is a multi-tenanted service where latency for similar documents is comparable but not always identical. The time to analyze a document depends on the size (for example, number of pages) and associated content on each page.

@@ -335,7 +335,7 @@ sections:

- Model Compose is currently available only for custom models trained with labels.

-- Analyzing a document with composed models is identical to analyzing a document with a single model, the analyze result returns a ```docType``` property indicating which of the component models was selected for analyzing the document. There is no change in pricing for analyzing a document with an individual custom model or a composed custom model.
+- Analyzing a document with composed models is identical to analyzing a document with a single model, the analyze result returns a ```docType``` property indicating which of the component models was selected for analyzing the document. There's no change in pricing for analyzing a document with an individual custom model or a composed custom model.

Learn more about [composed models](concept-custom.md).

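A short sketch of reading the `docType` this answer mentions, assuming the `azure-ai-formrecognizer` Python SDK (where the REST `docType` surfaces as `doc_type`); the composed model ID and file name are placeholders.

```python
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = DocumentAnalysisClient(
    endpoint="https://<your-resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<your-key>"),
)

with open("form.pdf", "rb") as f:
    # "my-composed-model" is a placeholder for a composed custom model ID.
    poller = client.begin_analyze_document("my-composed-model", document=f)

result = poller.result()
for document in result.documents:
    # doc_type identifies which component model the service selected.
    print(document.doc_type, document.confidence)
```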
@@ -396,12 +396,12 @@ sections:

When you create a shared access signature (SAS), the default duration is 48 hours. After 48 hours, you'll need to create a new token.

-Consider setting a longer duration period for the time you'll be using your storage account with Form Recognizer.
+Consider setting a longer duration period for the time you're using your storage account with Form Recognizer.

- question: |
If my storage account is behind a VNet or firewall, how do I give Form Recognizer access to my storage account data?
answer: |
-If you have an Azure storage account protected by a Virtual Network (VNet) or firewall, Form Recognizer can’t directly access your storage account. However, Private Azure storage account access and authentication are supported by [managed identities for Azure resources](../../active-directory/managed-identities-azure-resources/overview.md). Once a managed identity is enabled, the Form Recognizer service can access your storage account using an assigned managed identity credential.
+If you have an Azure storage account protected by a Virtual Network (VNet) or firewall, Form Recognizer can’t directly access your storage account. However, Private Azure storage account access and authentication support [managed identities for Azure resources](../../active-directory/managed-identities-azure-resources/overview.md). Once a managed identity is enabled, the Form Recognizer service can access your storage account using an assigned managed identity credential.

If you intend to analyze your private storage account data with FOTT, the tool must be deployed behind the VNet or firewall.

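For the SAS guidance in this answer, here's a minimal sketch assuming the `azure-storage-blob` package, which can issue a container SAS with an expiry longer than the 48-hour default; the account, key, and container names are placeholders.

```python
from datetime import datetime, timedelta, timezone
from azure.storage.blob import ContainerSasPermissions, generate_container_sas

sas_token = generate_container_sas(
    account_name="<storage-account>",
    container_name="training-data",
    account_key="<account-key>",
    permission=ContainerSasPermissions(read=True, list=True),
    # Cover the full period you expect to use the container with Form Recognizer.
    expiry=datetime.now(timezone.utc) + timedelta(days=30),
)
print(sas_token)
```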

articles/applied-ai-services/form-recognizer/language-support.md

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ manager: nitinme
ms.service: applied-ai-services
ms.subservice: forms-recognizer
ms.topic: reference
-ms.date: 02/13/2023
+ms.date: 03/09/2023
ms.author: lajanuar
---

@@ -26,7 +26,7 @@ This article covers the supported languages for text and field **extraction (by

## Read, layout, and custom form (template) model

-The following lists include the currently GA languages in the most recent v3.0 version. These languages are supported by Read, Layout, and Custom form (template) model features.
+The following lists include the currently GA languages in the most recent v3.0 version for Read, Layout, and Custom template (form) models.

> [!NOTE]
> **Language code optional**
Two image files changed (binary content, 607 KB and 324 KB); no text diff shown.

articles/applied-ai-services/form-recognizer/toc.yml

Lines changed: 2 additions & 2 deletions
@@ -145,8 +145,8 @@ items:
- name: Layout model
displayName: tables, selection marks, structure, paragraph roles, text, headers, page numbers
href: concept-layout.md
-- name: Query fields
-href: concept-query-fields.md
+- name: 🆕 Query field extraction (preview)
+href: concept-query-field-extraction.md
- name: 🆕 Health insurance card model (preview)
displayName: health, proof, hospital
href: concept-insurance-card.md

0 commit comments
