Commit 72a0880

Merge pull request #196326 from laujan/form-recognizer-update-file-size-directive
update input directives
2 parents cb67937 + 90abe2f commit 72a0880

12 files changed: 21 additions and 58 deletions

articles/applied-ai-services/form-recognizer/concept-business-card.md

Lines changed: 2 additions & 5 deletions
Original file line number | Diff line number | Diff line change
@@ -78,16 +78,13 @@ You'll need a business card document. You can use our [sample business card docu
7878
## Input requirements
7979

8080
* For best results, provide one clear photo or high-quality scan per document.
81-
* Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
81+
* Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
8282
* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
83-
* The file size must be less than 50 MB.
83+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
8484
* Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
8585
* PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
8686
* The total size of the training data is 500 pages or less.
8787
* If your PDFs are password-locked, you must remove the lock before submission.
88-
* For unsupervised learning (without labeled data):
89-
* Data must contain keys and values.
90-
* Keys must appear above or to the left of the values; they can't appear below or to the right.
9188

9289
> [!NOTE]
9390
> The [Sample Labeling tool](https://fott-2-1.azurewebsites.net/) does not support the BMP file format. This is a limitation of the tool not the Form Recognizer Service.
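
The updated input requirements can be checked on the client before a file is submitted. The following is a minimal sketch, not part of the commit and not a Form Recognizer API: the function name `check_input`, its parameters, and the choice of 1 MB = 1,048,576 bytes are assumptions made for illustration. The limits themselves are taken from the list above.

```python
from pathlib import Path

MB = 1024 * 1024  # assumption: the docs do not say whether MB is binary or decimal

# Limits copied from the updated "Input requirements" list.
MAX_BYTES = {"S0": 500 * MB, "F0": 4 * MB}  # paid and free tiers
FORMATS = {".jpeg", ".jpg", ".png", ".bmp", ".tiff", ".pdf"}
MIN_SIDE, MAX_SIDE = 50, 10_000  # image dimensions in pixels
MAX_PAGES = 2000  # PDF and TIFF
FREE_TIER_PAGES = 2  # free tier processes only the first two pages


def check_input(name, size_bytes, tier="S0", width=None, height=None, pages=None):
    """Return a list of problems found; an empty list means none were found."""
    if tier not in MAX_BYTES:
        raise ValueError(f"unknown tier: {tier!r}")

    problems = []
    ext = Path(name).suffix.lower()
    if ext not in FORMATS:
        problems.append(f"unsupported file format: {ext or '(none)'}")

    # The requirement is "less than", so a file exactly at the limit fails.
    if size_bytes >= MAX_BYTES[tier]:
        problems.append(f"file must be smaller than {MAX_BYTES[tier] // MB} MB on {tier}")

    if width is not None and height is not None:
        if not (MIN_SIDE <= width <= MAX_SIDE and MIN_SIDE <= height <= MAX_SIDE):
            problems.append("image dimensions must be between 50 x 50 and 10,000 x 10,000 pixels")

    if pages is not None:
        if pages > MAX_PAGES:
            problems.append(f"more than {MAX_PAGES} pages")
        elif tier == "F0" and pages > FREE_TIER_PAGES:
            problems.append("free tier processes only the first two pages")

    return problems
```

For example, `check_input("card.jpg", 3 * MB, tier="F0")` returns an empty list, while the same file at exactly 4 MB is reported as too large for the free tier.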

articles/applied-ai-services/form-recognizer/concept-custom.md

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -149,16 +149,13 @@ The following table describes the features available with the associated tools a
149149
> Custom template models trained with the 3.0 API will have a few improvements over the 2.1 API stemming from improvements to the OCR engine. Datasets used to train a custom template model using the 2.1 API can still be used to train a new model using the 3.0 API.
150150
151151
* For best results, provide one clear photo or high-quality scan per document.
152-
* Supported file formats are JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
152+
* Supported file formats are JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
153153
* For PDF and TIFF files, up to 2,000 pages can be processed. With a free tier subscription, only the first two pages are processed.
154-
* The file size must be less than 50 MB.
154+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
155155
* Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
156156
* PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
157157
* The total size of the training data is 500 pages or less.
158158
* If your PDFs are password-locked, you must remove the lock before submission.
159-
* For unsupervised learning (without labeled data):
160-
* Data must contain keys and values.
161-
* Keys must appear above or to the left of the values. They can't appear below or to the right.
162159

163160
> [!TIP]
164161
> Training data:

articles/applied-ai-services/form-recognizer/concept-general-document.md

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -101,16 +101,13 @@ The key value pair extraction model and entity identification model are run in p
101101
## Input requirements
102102

103103
* For best results, provide one clear photo or high-quality scan per document.
104-
* Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
104+
* Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
105105
* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
106-
* The file size must be less than 50 MB.
106+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
107107
* Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
108108
* PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
109109
* The total size of the training data is 500 pages or less.
110110
* If your PDFs are password-locked, you must remove the lock before submission.
111-
* For unsupervised learning (without labeled data):
112-
* Data must contain keys and values.
113-
* Keys must appear above or to the left of the values; they can't appear below or to the right.
114111

115112
## Supported languages and locales
116113

articles/applied-ai-services/form-recognizer/concept-id-document.md

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -78,16 +78,13 @@ You'll need an ID document. You can use our [sample ID document](https://raw.git
7878
## Input requirements
7979

8080
* For best results, provide one clear photo or high-quality scan per document.
81-
* Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
81+
* Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
8282
* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
83-
* The file size must be less than 50 MB.
83+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
8484
* Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
8585
* PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
8686
* The total size of the training data is 500 pages or less.
8787
* If your PDFs are password-locked, you must remove the lock before submission.
88-
* For unsupervised learning (without labeled data):
89-
* Data must contain keys and values.
90-
* Keys must appear above or to the left of the values; they can't appear below or to the right.
9188

9289
> [!NOTE]
9390
> The [Sample Labeling tool](https://fott-2-1.azurewebsites.net/) does not support the BMP file format. This is a limitation of the tool not the Form Recognizer Service.

articles/applied-ai-services/form-recognizer/concept-invoice.md

Lines changed: 3 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -76,16 +76,13 @@ You'll need an invoice document. You can use our [sample invoice document](https
7676
## Input requirements
7777

7878
* For best results, provide one clear photo or high-quality scan per document.
79-
* Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
79+
* Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
8080
* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
81-
* The file size must be less than 50 MB.
81+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
8282
* Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
8383
* PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
8484
* The total size of the training data is 500 pages or less.
8585
* If your PDFs are password-locked, you must remove the lock before submission.
86-
* For unsupervised learning (without labeled data):
87-
* Data must contain keys and values.
88-
* Keys must appear above or to the left of the values; they can't appear below or to the right.
8986

9087
> [!NOTE]
9188
> The [Sample Labeling tool](https://fott-2-1.azurewebsites.net/) does not support the BMP file format. This is a limitation of the tool not the Form Recognizer Service.
@@ -147,7 +144,7 @@ Following are the line items extracted from an invoice in the JSON output respon
147144
| Unit | String| The unit of the line item, e.g., kg, lb, etc. | Hours | |
148145
| Date | Date| Date corresponding to each line item. Often it's a date the line item was shipped | 3/4/2021| 2021-03-04 |
149146
| Tax | Number | Tax associated with each line item. Possible values include tax amount, tax %, and tax Y/N | 10% | |
150-
| VAT | Number | Stands for Value added tax. This is a flat tax levied on an item. Common in european countries | €20.00 | |
147+
| VAT | Number | Stands for Value added tax. This is a flat tax levied on an item. Common in European countries | €20.00 | |
151148

152149
The invoice key-value pairs and line items extracted are in the `documentResults` section of the JSON output.
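
As an illustration of reading line items out of `documentResults`, here is a hedged sketch. It is not part of the commit. The sample payload is hand-written, and its nesting (`analyzeResult` → `documentResults` → `fields` → `Items` → `valueArray` → `valueObject`) is an assumption about the v2.1 response shape; the helper name `line_items` is hypothetical. Check a real response before relying on this layout.

```python
import json

# Hand-written sample, trimmed to the parts used below (assumed v2.1 layout).
SAMPLE = json.loads("""
{
  "analyzeResult": {
    "documentResults": [
      {
        "fields": {
          "Items": {
            "type": "array",
            "valueArray": [
              {
                "type": "object",
                "valueObject": {
                  "Unit": {"type": "string", "text": "Hours"},
                  "Date": {"type": "date", "text": "3/4/2021", "valueDate": "2021-03-04"},
                  "Tax": {"type": "number", "text": "10%"}
                }
              }
            ]
          }
        }
      }
    ]
  }
}
""")


def line_items(result):
    """Collect the raw text of every line-item field, one dict per line item."""
    rows = []
    for doc in result.get("analyzeResult", {}).get("documentResults", []):
        # "Items" may be missing or null when no line items were found.
        items = doc.get("fields", {}).get("Items") or {}
        for item in items.get("valueArray", []):
            cells = item.get("valueObject", {})
            rows.append({name: cell.get("text") for name, cell in cells.items() if cell})
    return rows


print(line_items(SAMPLE))
```

The sketch keeps only the `text` of each field, which corresponds to the "Text" column of the table above; normalized values such as `valueDate` would need to be read separately.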
153150

articles/applied-ai-services/form-recognizer/concept-layout.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -89,9 +89,9 @@ You'll need a form document. You can use our [sample form document](https://raw.
8989
## Input requirements
9090

9191
* For best results, provide one clear photo or high-quality scan per document.
92-
* Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
92+
* Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
9393
* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
94-
* The file size must be less than 50 MB (4 MB for the free tier).
94+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
9595
* Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
9696

9797
> [!NOTE]

articles/applied-ai-services/form-recognizer/concept-model-overview.md

Lines changed: 0 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -196,9 +196,6 @@ A composed model is created by taking a collection of custom models and assignin
196196
* PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
197197
* The total size of the training data is 500 pages or less.
198198
* If your PDFs are password-locked, you must remove the lock before submission.
199-
* For unsupervised learning (without labeled data):
200-
* Data must contain keys and values.
201-
* Keys must appear above or to the left of the values; they can't appear below or to the right.
202199

203200
> [!NOTE]
204201
> The [Sample Labeling tool](https://fott-2-1.azurewebsites.net/) does not support the BMP file format. This is a limitation of the tool not the Form Recognizer Service.

articles/applied-ai-services/form-recognizer/concept-read.md

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -64,9 +64,9 @@ See how text is extracted from forms and documents using the Form Recognizer Stu
6464
## Input requirements
6565

6666
* For best results, provide one clear photo or high-quality scan per document.
67-
* Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
67+
* Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
6868
* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
69-
* The file size must be less than 50 MB (4 MB for the free tier)
69+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
7070
* Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
7171

7272
## Supported languages and locales

articles/applied-ai-services/form-recognizer/concept-receipt.md

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -78,16 +78,13 @@ You will need a receipt document. You can use our [sample receipt document](http
7878
## Input requirements
7979

8080
* For best results, provide one clear photo or high-quality scan per document.
81-
* Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
81+
* Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
8282
* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
83-
* The file size must be less than 50 MB.
83+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
8484
* Image dimensions must be between 50 x 50 pixels and 10000 x 10000 pixels.
8585
* PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
8686
* The total size of the training data is 500 pages or less.
8787
* If your PDFs are password-locked, you must remove the lock before submission.
88-
* For unsupervised learning (without labeled data):
89-
* Data must contain keys and values.
90-
* Keys must appear above or to the left of the values; they can't appear below or to the right.
9188

9289
## Supported languages and locales v2.1
9390

articles/applied-ai-services/form-recognizer/concept-w2.md

Lines changed: 2 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -59,16 +59,13 @@ See how data is extracted from W-2 forms using the Form Recognizer Studio. You'l
5959
## Input requirements
6060

6161
* For best results, provide one clear photo or high-quality scan per document.
62-
* Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
62+
* Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
6363
* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
64-
* The file size must be less than 50 MB.
64+
* The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
6565
* Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
6666
* PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
6767
* The total size of the training data is 500 pages or less.
6868
* If your PDFs are password-locked, you must remove the lock before submission.
69-
* For unsupervised learning (without labeled data):
70-
* Data must contain keys and values.
71-
* Keys must appear above or to the left of the values; they can't appear below or to the right.
7269

7370
## Supported languages and locales
7471
