Commit 15d278d

Merge pull request #191810 from laujan/1925568-single-script-python
update python to single script
2 parents 301492b + ed1b076 commit 15d278d


3 files changed (+76, -66 lines)


articles/applied-ai-services/form-recognizer/concept-custom.md

Lines changed: 11 additions & 9 deletions
@@ -23,7 +23,9 @@ Custom models can be one of two types, [**custom template**](concept-custom-temp

### Custom template model

- The custom template or custom form model relies on a consistent visual template to extract the labeled data. The accuracy of your model is affected by variances in the visual structure of your documents. Structured forms such as questionnaires or applications are examples of consistent visual templates. Your training set will consist of structured documents where the formatting and layout are static and constant from one document instance to the next. Custom template models support key-value pairs, selection marks, tables, signature fields and regions and can be trained on documents in any of the [supported languages](language-support.md). For more information, *see* [custom template models](concept-custom-template.md).
+ The custom template or custom form model relies on a consistent visual template to extract the labeled data. The accuracy of your model is affected by variances in the visual structure of your documents. Structured forms such as questionnaires or applications are examples of consistent visual templates.
+
+ Your training set will consist of structured documents where the formatting and layout are static and constant from one document instance to the next. Custom template models support key-value pairs, selection marks, tables, signature fields, and regions. Template models can be trained on documents in any of the [supported languages](language-support.md). For more information, *see* [custom template models](concept-custom-template.md).

> [!TIP]
>
@@ -33,15 +35,15 @@ Custom models can be one of two types, [**custom template**](concept-custom-temp

### Custom neural model

- The custom neural (custom document) model is a deep learning model type that relies on a base model trained on a large collection of documents. This model is then fine-tuned or adapted to your data when you train the model with a labeled dataset. Custom neural models support structured, semi-structured, and unstructured documents to extract fields. Custom neural models currently support English-language documents. When you're choosing between the two model types, start with a neural model if it meets your functional needs. See [neural models](concept-custom-neural.md) to learn more about custom document models.
+ The custom neural (custom document) model uses deep learning and a base model trained on a large collection of documents. This model is then fine-tuned or adapted to your data when you train the model with a labeled dataset. Custom neural models support structured, semi-structured, and unstructured documents to extract fields. Custom neural models currently support English-language documents. When you're choosing between the two model types, start with a neural model if it meets your functional needs. See [neural models](concept-custom-neural.md) to learn more about custom document models.

## Build mode

The build custom model operation has added support for the *template* and *neural* custom models. Previous versions of the REST API and SDKs only supported a single build mode that is now known as the *template* mode.

* Template models only accept documents that have the same basic page structure—a uniform visual appearance—or the same relative positioning of elements within the document.

- * Neural models support documents that have the same information, but different page structures. Examples of these documents include United States W2 forms, which share the same information, but may vary in appearance by the company that created the document. Neural models currently only support English text.
+ * Neural models support documents that have the same information, but different page structures. Examples of these documents include United States W2 forms, which share the same information, but may vary in appearance across companies. Neural models currently only support English text.

This table provides links to the build mode programming language SDK references and code samples on GitHub:
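The template-versus-neural guidance above can be sketched as a small decision helper. This is a hypothetical function, not part of the Form Recognizer SDK; the `"template"` and `"neural"` strings match the build modes named in this section, and the rule follows the docs' advice to start with a neural model when it meets your needs:

```python
# Hypothetical helper (not part of the SDK): pick a build mode for the
# build custom model operation based on the training set, per the guidance
# above -- "neural" tolerates varying layouts but currently supports
# English text only, while "template" needs a uniform visual structure.
def choose_build_mode(uniform_layout, english_text):
    if english_text:
        # Docs suggest starting with a neural model when it meets your needs.
        return "neural"
    if uniform_layout:
        return "template"
    raise ValueError(
        "Template models need a uniform layout; "
        "neural models currently support English text only."
    )
```

A sketch only: the real build call goes through the SDK's model administration client or the REST API, which accept these mode names.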
@@ -68,15 +70,15 @@ The table below compares custom template and custom neural features:

The following tools are supported by Form Recognizer v2.1:

- | Feature | Resources |
- |----------|-------------------------|
- |Custom model| <ul><li>[Form Recognizer labeling tool](https://fott-2-1.azurewebsites.net)</li><li>[REST API](quickstarts/try-sdk-rest-api.md?pivots=programming-language-rest-api#analyze-forms-with-a-custom-model)</li><li>[Client library SDK](quickstarts/try-sdk-rest-api.md)</li><li>[Form Recognizer Docker container](containers/form-recognizer-container-install-run.md?tabs=custom#run-the-container-with-the-docker-compose-up-command)</li></ul>|
+ | Feature | Resources | Model ID|
+ |---|---|:---|
+ |Custom model| <ul><li>[Form Recognizer labeling tool](https://fott-2-1.azurewebsites.net)</li><li>[REST API](quickstarts/try-sdk-rest-api.md?pivots=programming-language-rest-api#analyze-forms-with-a-custom-model)</li><li>[Client library SDK](quickstarts/try-sdk-rest-api.md)</li><li>[Form Recognizer Docker container](containers/form-recognizer-container-install-run.md?tabs=custom#run-the-container-with-the-docker-compose-up-command)</li></ul>|***custom-model-id***|

The following tools are supported by Form Recognizer v3.0:

- | Feature | Resources |
- |----------|-------------|
- |Custom model| <ul><li>[Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio/customform/projects)</li><li>[REST API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-2/operations/AnalyzeDocument)</li><li>[C# SDK](quickstarts/try-v3-csharp-sdk.md)</li><li>[Python SDK](quickstarts/try-v3-python-sdk.md)</li></ul>|
+ | Feature | Resources | Model ID|
+ |---|---|:---|
+ |Custom model| <ul><li>[Form Recognizer Studio](https://formrecognizer.appliedai.azure.com/studio/customform/projects)</li><li>[REST API](https://westus.dev.cognitive.microsoft.com/docs/services/form-recognizer-api-v3-0-preview-2/operations/AnalyzeDocument)</li><li>[C# SDK](quickstarts/try-v3-csharp-sdk.md)</li><li>[Python SDK](quickstarts/try-v3-python-sdk.md)</li></ul>|***custom-model-id***|

### Try Form Recognizer

articles/applied-ai-services/form-recognizer/quickstarts/try-v3-csharp-sdk.md

Lines changed: 4 additions & 4 deletions
@@ -152,7 +152,7 @@ Analyze and extract text, tables, structure, key-value pairs, and named entities
> * We've added the file URI value to the `Uri fileUri` variable at the top of the script.
> * For simplicity, all the entity fields that the service returns are not shown here. To see the list of all supported fields and corresponding types, see the [General document](../concept-general-document.md#named-entity-recognition-ner-categories) concept page.

- ### Add the following code to the Program.cs file:
+ **Add the following code sample to the Program.cs file:**

```csharp
using Azure;
@@ -268,7 +268,7 @@ for (int i = 0; i < result.Tables.Count; i++)
### General document model output

Visit the Azure samples repository on GitHub to view the [general document model output](https://github.com/Azure-Samples/cognitive-services-quickstart-code/blob/master/dotnet/FormRecognizer/v3-csharp-sdk-general-document-output.md).
-
+ ___

## Layout model

@@ -280,7 +280,7 @@ Extract text, selection marks, text styles, table structures, and bounding regio
> * We've added the file URI value to the `Uri fileUri` variable at the top of the script.
> * To extract the layout from a given file at a URI, use the `StartAnalyzeDocumentFromUri` method and pass `prebuilt-layout` as the model ID. The returned value is an `AnalyzeResult` object containing data from the submitted document.

- #### Add the following code to the Program.cs file:
+ **Add the following code sample to the Program.cs file:**

```csharp
using Azure;
@@ -383,7 +383,7 @@ Analyze and extract common fields from specific document types using a prebuilt
> * To analyze a given file at a URI, use the `StartAnalyzeDocumentFromUri` method and pass `prebuilt-invoice` as the model ID. The returned value is an `AnalyzeResult` object containing data from the submitted document.
> * For simplicity, all the key-value pairs that the service returns are not shown here. To see the list of all supported fields and corresponding types, see our [Invoice](../concept-invoice.md#field-extraction) concept page.

- #### Add the following code to your Program.cs file:
+ **Add the following code sample to your Program.cs file:**

```csharp

articles/applied-ai-services/form-recognizer/quickstarts/try-v3-python-sdk.md

Lines changed: 61 additions & 53 deletions
@@ -7,7 +7,7 @@ manager: nitinme
ms.service: applied-ai-services
ms.subservice: forms-recognizer
ms.topic: quickstart
- ms.date: 03/08/2022
+ ms.date: 03/15/2022
ms.author: lajanuar
recommendations: false
ms.custom: ignite-fall-2021, mode-api
@@ -16,9 +16,9 @@ ms.custom: ignite-fall-2021, mode-api
# Get started: Form Recognizer Python SDK v3.0 | Preview

>[!NOTE]
- > Form Recognizer v3.0 is currently in public preview. Some features may not be supported or have limited capabilities.
+ > Form Recognizer v3.0 is currently in public preview. Some features may not be supported or have limited capabilities.
- [Reference documentation](https://azuresdkdocs.blob.core.windows.net/$web/python/azure-ai-formrecognizer/3.2.0b3/index.html) | [Library source code](https://github.com/Azure/azure-sdk-for-python/tree/azure-ai-formrecognizer_3.2.0b3/sdk/formrecognizer/azure-ai-formrecognizer/) | [Package (PyPi)](https://pypi.org/project/azure-ai-formrecognizer/3.2.0b3/) | [Samples](https://github.com/Azure/azure-sdk-for-python/blob/azure-ai-formrecognizer_3.2.0b3/sdk/formrecognizer/azure-ai-formrecognizer/samples/README.md)
+ [Reference documentation](/python/api/azure-ai-formrecognizer/azure.ai.formrecognizer?view=azure-python-preview&preserve-view=true) | [Library source code](https://github.com/Azure/azure-sdk-for-python/tree/azure-ai-formrecognizer_3.2.0b3/sdk/formrecognizer/azure-ai-formrecognizer/) | [Package (PyPi)](https://pypi.org/project/azure-ai-formrecognizer/3.2.0b3/) | [Samples](https://github.com/Azure/azure-sdk-for-python/blob/azure-ai-formrecognizer_3.2.0b3/sdk/formrecognizer/azure-ai-formrecognizer/samples/README.md)

Get started with Azure Form Recognizer using the Python programming language. Azure Form Recognizer is a cloud-based Azure Applied AI Service that uses machine learning to extract key-value pairs, text, and tables from your documents. You can easily call Form Recognizer models by integrating our client library SDKs into your workflows and applications. We recommend that you use the free service when you're learning the technology. Remember that the number of free pages is limited to 500 per month.

@@ -56,52 +56,17 @@ In this quickstart you'll use following features to analyze and extract data and
Open a terminal window in your local environment and install the Azure Form Recognizer client library for Python with pip:

```console
- pip install azure-ai-formrecognizer==3.2.0b2
+ pip install azure-ai-formrecognizer==3.2.0b3
```

### Create a new Python application

- Create a new Python file called **form_recognizer_quickstart.py** in your preferred editor or IDE. Then import the following libraries:
+ To interact with the Form Recognizer service, you'll need to create an instance of the `DocumentAnalysisClient` class. To do so, you'll create an `AzureKeyCredential` with your key from the Azure portal and a `DocumentAnalysisClient` instance with the `AzureKeyCredential` and your Form Recognizer `endpoint`.

- ```python
- import os
- from azure.core.exceptions import ResourceNotFoundError
- from azure.ai.formrecognizer import DocumentAnalysisClient
- from azure.core.credentials import AzureKeyCredential
- ```
-
- ### Create variables for your Azure resource API endpoint and key
-
- ```python
- endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
- key = "YOUR_FORM_RECOGNIZER_SUBSCRIPTION_KEY"
- ```
-
- At this point, your Python application should contain the following lines of code:
-
- ```python
- import os
- from azure.core.exceptions import ResourceNotFoundError
- from azure.ai.formrecognizer import DocumentAnalysisClient
- from azure.core.credentials import AzureKeyCredential
-
- endpoint = "YOUR_FORM_RECOGNIZER_ENDPOINT"
- key = "YOUR_FORM_RECOGNIZER_SUBSCRIPTION_KEY"
- ```
-
- > [!TIP]
- > If you would like to try more than one code sample:
- >
- > * Select one of the sample code blocks below to copy and paste into your application.
- > * [**Run your application**](#run-your-application).
- > * Comment out that sample code block but keep the set-up code and library directives.
- > * Select another sample code block to copy and paste into your application.
- > * [**Run your application**](#run-your-application).
- > * You can continue to comment out, copy/paste, and run the sample blocks of code.
-
- ### Select a code sample to copy and paste into your application:
+ 1. Create a new Python file called **form_recognizer_quickstart.py** in your preferred editor or IDE.
+
+ 1. Open the **form_recognizer_quickstart.py** file and select one of the following code samples to copy and paste into your application:

* [**General document**](#general-document-model)

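Since each sample hard-codes the key in `form_recognizer_quickstart.py`, you might prefer to read the endpoint and key from the environment instead. A stdlib-only sketch (the `FORM_RECOGNIZER_*` variable names are hypothetical, not part of the quickstart):

```python
import os

# Hypothetical helper: read the Form Recognizer endpoint and key from
# environment variables, falling back to the quickstart's placeholders.
def get_form_recognizer_settings():
    endpoint = os.environ.get("FORM_RECOGNIZER_ENDPOINT", "<your-endpoint>")
    key = os.environ.get("FORM_RECOGNIZER_KEY", "<your-key>")
    return endpoint, key
```

The returned values would then be passed to `DocumentAnalysisClient(endpoint=endpoint, credential=AzureKeyCredential(key))` exactly as the samples below do with their module-level variables.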
@@ -124,10 +89,20 @@ Extract text, tables, structure, key-value pairs, and named entities from docume
> * We've added the file URL value to the `docUrl` variable in the `analyze_general_documents` function.
> * For simplicity, all the entity fields that the service returns are not shown here. To see the list of all supported fields and corresponding types, see our [General document](../concept-general-document.md#named-entity-recognition-ner-categories) concept page.

- ###### Add the following code to your general document application on a line below the `key` variable
+ <!-- markdownlint-disable MD036 -->
+ **Add the following sample code to your form_recognizer_quickstart.py application:**

```python

+ # import libraries
+ import os
+ from azure.ai.formrecognizer import DocumentAnalysisClient
+ from azure.core.credentials import AzureKeyCredential
+
+ # set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
+ endpoint = "<your-endpoint>"
+ key = "<your-key>"
+
def format_bounding_region(bounding_regions):
    if not bounding_regions:
        return "N/A"
@@ -140,10 +115,10 @@ def format_bounding_box(bounding_box):


def analyze_general_documents():
-     # sample document
+     # sample document
    docUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

-     document_analysis_client = DocumentAnalysisClient(
+     # create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )

@@ -249,6 +224,12 @@ if __name__ == "__main__":
    analyze_general_documents()
```

+ ### General document model output
+
+ Visit the Azure samples repository on GitHub to view the [general document model output](https://github.com/Azure-Samples/cognitive-services-quickstart-code/blob/master/python/FormRecognizer/v3-python-sdk-general-document-output.md).
+
+ ___

## Layout model

Extract text, selection marks, text styles, table structures, and bounding region coordinates from documents.
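The `format_bounding_box` helper that the layout and general document samples define can be sketched in plain Python. `Point` here is a simplified stand-in for the SDK's point type, and the exact output format is an assumption based on the snippets in this quickstart:

```python
from collections import namedtuple

# Stand-in for the SDK's point type (x/y coordinates of one box corner).
Point = namedtuple("Point", ["x", "y"])

def format_bounding_box(bounding_box):
    # Mirror the quickstart helper: return "N/A" when the service gives no
    # box, otherwise join each corner point as a readable "[x, y]" pair.
    if not bounding_box:
        return "N/A"
    return ", ".join(f"[{point.x}, {point.y}]" for point in bounding_box)
```

In the samples below this helper is called with `word.bounding_box` and similar values from the analysis `result` to print element locations.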
@@ -259,10 +240,19 @@ Extract text, selection marks, text styles, table structures, and bounding regio
> * We've added the file URL value to the `formUrl` variable in the `analyze_layout` function.
> * To analyze a given file at a URL, you'll use the `begin_analyze_document_from_url` method and pass in `prebuilt-layout` as the model Id. The returned value is a `result` object containing data about the submitted document.

- #### Add the following code to your layout application on the line below the `key` variable
+ **Add the following sample code to your form_recognizer_quickstart.py application:**

```python

+ # import libraries
+ import os
+ from azure.ai.formrecognizer import DocumentAnalysisClient
+ from azure.core.credentials import AzureKeyCredential
+
+ # set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
+ endpoint = "<your-endpoint>"
+ key = "<your-key>"
+
def format_bounding_box(bounding_box):
    if not bounding_box:
        return "N/A"
@@ -272,6 +262,7 @@ def analyze_layout():
    # sample form document
    formUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-layout.pdf"

+     # create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
@@ -360,25 +351,37 @@ if __name__ == "__main__":

```

+ ### Layout model output
+
+ Visit the Azure samples repository on GitHub to view the [layout model output](https://github.com/Azure-Samples/cognitive-services-quickstart-code/blob/master/python/FormRecognizer/v3-python-sdk-layout-output.md).
+
+ ___

## Prebuilt model

- In this example, we'll analyze an invoice using the **prebuilt-invoice** model.
+ Analyze and extract common fields from specific document types using a prebuilt model. In this example, we'll analyze an invoice using the **prebuilt-invoice** model.

> [!TIP]
> You aren't limited to invoices—there are several prebuilt models to choose from, each of which has its own set of supported fields. The model to use for the analyze operation depends on the type of document to be analyzed. See [**model data extraction**](../concept-model-overview.md#model-data-extraction).

- #### Try the prebuilt invoice model
-
> [!div class="checklist"]
>
> * Analyze an invoice using the prebuilt-invoice model. You can use our [sample invoice document](https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf) for this quickstart.
> * We've added the file URL value to the `invoiceUrl` variable at the top of the file.
> * To analyze a given file at a URL, you'll use the `begin_analyze_document_from_url` method and pass in `prebuilt-invoice` as the model Id. The returned value is a `result` object containing data about the submitted document.
> * For simplicity, all the key-value pairs that the service returns are not shown here. To see the list of all supported fields and corresponding types, see our [Invoice](../concept-invoice.md#field-extraction) concept page.

- #### Add the following code to your prebuilt invoice application below the `key` variable
+ **Add the following sample code to your form_recognizer_quickstart.py application:**

```python
+ # import libraries
+ import os
+ from azure.ai.formrecognizer import DocumentAnalysisClient
+ from azure.core.credentials import AzureKeyCredential
+
+ # set `<your-endpoint>` and `<your-key>` variables with the values from the Azure portal
+ endpoint = "<your-endpoint>"
+ key = "<your-key>"

def format_bounding_region(bounding_regions):
    if not bounding_regions:
@@ -394,7 +397,8 @@ def format_bounding_box(bounding_box):
def analyze_invoice():

    invoiceUrl = "https://raw.githubusercontent.com/Azure-Samples/cognitive-services-REST-api-samples/master/curl/form-recognizer/sample-invoice.pdf"

+     # create your `DocumentAnalysisClient` instance and `AzureKeyCredential` variable
    document_analysis_client = DocumentAnalysisClient(
        endpoint=endpoint, credential=AzureKeyCredential(key)
    )
@@ -654,6 +658,10 @@ if __name__ == "__main__":
    analyze_invoice()
```

+ ### Prebuilt model output
+
+ Visit the Azure samples repository on GitHub to view the [prebuilt invoice model output](https://github.com/Azure-Samples/cognitive-services-quickstart-code/blob/master/python/FormRecognizer/v3-python-sdk-prebuilt-invoice-output.md).
+
## Run your application

1. Navigate to the folder where you have your **form_recognizer_quickstart.py** file.
@@ -664,7 +672,7 @@ if __name__ == "__main__":
   python form_recognizer_quickstart.py
```

- That's it, congratulations!
+ That's it, congratulations!

In this quickstart, you used the Form Recognizer Python SDK to analyze various forms in different ways. Next, explore the reference documentation to learn more about the Form Recognizer v3.0 API.
