File: articles/applied-ai-services/form-recognizer/concept-model-overview.md (2 additions, 2 deletions)
@@ -189,9 +189,9 @@ A composed model is created by taking a collection of custom models and assignin
  ## Input requirements

  * For best results, provide one clear photo or high-quality scan per document.
- * Supported file formats: JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
+ * Supported file formats: JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned). Text-embedded PDFs are best to eliminate the possibility of error in character extraction and location.
  * For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
- * The file size must be less than 50 MB.
+ * The file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
  * Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
  * PDF dimensions are up to 17 x 17 inches, corresponding to Legal or A3 paper size, or smaller.
  * The total size of the training data is 500 pages or less.
      Which file formats does Form Recognizer support? Are there size limitations for input documents?
    answer: |
-     - Form Recognizer extracts data from document images in JPEG, PNG, BMP, TIFF, and PDF (text-embedded or scanned) formats and returns a structured output.
+     - Form Recognizer extracts data from document images in JPEG/JPG, PNG, BMP, TIFF, and PDF (text-embedded or scanned) formats and returns a structured output.
      - For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
-     - Your file size must be less than 50 MB.
+     - Your file size must be less than 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
      - Image dimensions must be between 50 x 50 pixels and 10,000 x 10,000 pixels.
      - PDF dimensions can be a maximum of 17 x 17 inches (corresponding to Legal or A3 paper size) or smaller.
      - The total allowable size of training data is 500 pages or less.

      To ensure the best results, see [input requirements](concept-model-overview.md#input-requirements).

+ - question: |
+     How can I specify a specific range of pages to be analyzed in a document?
+   answer: |
+     There's a `pages` parameter, supported in both the v2.1 and v3.0 REST APIs, that you can specify for multi-page PDF and TIFF documents. Accepted inputs include single pages (for example, '1, 2' processes pages 1 and 2), finite ranges ('2-5' processes pages 2 through 5), and open-ended ranges ('5-' processes every page from page 5 onward; '-10' processes pages 1 through 10). These can be mixed together, and ranges are allowed to overlap (for example, '-5, 1, 3, 5-10' processes pages 1 through 10). The service accepts the request if it can process at least one page of the document (for example, '5-100' on a five-page document is valid, and page 5 is processed). If no page range is provided, the entire document is processed.
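The `pages` syntax described above can be illustrated with a small parser. This is a hypothetical client-side helper written from the documented rules, not part of any Form Recognizer SDK:

```python
def expand_pages(spec: str, page_count: int) -> list[int]:
    """Expand a Form Recognizer-style page-range string into sorted page numbers.

    Supports single pages ('1, 2'), finite ranges ('2-5'), and open-ended
    ranges ('5-' and '-10'). Ranges may overlap, and any part of a range
    beyond the document's last page is ignored; if nothing is processable,
    the request is rejected.
    """
    pages: set[int] = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            start_text, end_text = part.split("-", 1)
            start = int(start_text) if start_text else 1          # '-10' starts at page 1
            end = int(end_text) if end_text else page_count       # '5-' runs to the last page
        else:
            start = end = int(part)                               # single page
        pages.update(p for p in range(start, end + 1) if 1 <= p <= page_count)
    if not pages:
        raise ValueError("request rejected: no processable pages")
    return sorted(pages)
```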
  - question: |
      Both Form Recognizer Studio and the FOTT sample labeling tool are available. Which one should I use?
    answer: |
@@ -271,7 +277,7 @@ sections:
      - Use text-based instead of image-based PDFs when possible. One way to identify an image-based PDF is to try selecting specific text in the document. If you can only select the entire image of the text, the document is image-based, not text-based.

-     - Organize your training documents by using a subfolder for each format (JPG, PNG, BMP, PDF, or TIFF).
+     - Organize your training documents by using a subfolder for each format (JPEG/JPG, PNG, BMP, PDF, or TIFF).

      - Use forms that have all of the available fields completed.
@@ -315,6 +321,15 @@ sections:
      Learn more about [composed models](concept-custom.md).

+ - question: |
+     If the number of models I want to compose exceeds the upper limit for a composed model, what are the alternatives?
+   answer: |
+     You can classify the documents before calling the custom model, or consider the [custom neural model](concept-custom-neural.md):
+
+     - Use the [Read model](concept-read.md) and build a classification based on the text extracted from the documents and certain phrases, using code, regular expressions, search, and so on.
+
+     - If you want to extract the same fields from various structured, semi-structured, and unstructured documents, consider using the deep learning [custom neural model](concept-custom-neural.md). Learn more about the [differences between the custom template model and the custom neural model](concept-custom.md#compare-model-features).
  - question: |
      How do I refine a model beyond the initial training?
    answer: |
@@ -349,6 +364,11 @@ sections:
      - Do your tables span across multiple pages? If so, to avoid having to label all of the pages, split the PDF into pages prior to sending it to Form Recognizer. Following the analysis, post-process the pages to a single table.

      - If you're creating custom models, refer to [Labeling as tables](quickstarts/try-v3-form-recognizer-studio.md#labeling-as-tables). Dynamic tables have a variable number of rows for each given column. Fixed tables have a constant number of rows for each given column.
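The split-then-merge step for multi-page tables can be post-processed roughly as follows. This sketch assumes each page's analyzed table arrives as a list of rows with a repeated header row; the row-list representation is an assumption for illustration, not the service's actual output shape:

```python
# Illustrative post-processing: reassemble a table that was split across
# pages before analysis. Each page's table is a list of rows (lists of
# cell strings) whose first row is assumed to be a repeated header.

def merge_page_tables(page_tables: list[list[list[str]]]) -> list[list[str]]:
    """Merge per-page tables into one table, keeping a single header row."""
    if not page_tables:
        return []
    merged = [page_tables[0][0]]          # keep the header from the first page
    for table in page_tables:
        merged.extend(table[1:])          # append data rows, skipping headers
    return merged
```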
+ - question: |
+     How can I move my trained models from one environment (like beta) to another (like production)?
+   answer: |
+     The Copy API enables this scenario by allowing you to copy custom models from one Form Recognizer account into another; the accounts can exist in any supported geographical region. Follow [this document](disaster-recovery.md) for detailed instructions.
- name: Storage account
  questions:
@@ -369,6 +389,22 @@ sections:
      Learn how to [create and use a managed identity for your Form Recognizer resource](managed-identities.md).

+ - name: Form Recognizer Studio
+   questions:
+   - question: |
+       I have multiple pages in a document. Why are only two pages analyzed in Form Recognizer Studio?
+     answer: |
+       For free (F0) tier resources, only the first two pages are analyzed, whether you use Form Recognizer Studio, the REST API, or the SDKs. In Form Recognizer Studio, select the gear button (Settings) in the upper right, switch to the Resources tab, and check the price tier you're using to analyze the documents. Change to a paid (S0) resource if you want to analyze all pages in a document.
+
+   - question: |
+       How can I change directories or subscriptions to use in Form Recognizer Studio?
+     answer: |
+       - In Form Recognizer Studio, select the gear button (Settings) in the upper right. Under Directory, search for and select the directory from the list, and then select Switch Directory. You'll be prompted to sign in again after switching directories.
+
+       - You can switch subscriptions or resources on the Settings > Resource tab.
- name: Containers
  questions:
  - question: |
@@ -420,6 +456,11 @@ sections:
      Learn more about [Data, privacy, and security for Form Recognizer](/legal/cognitive-services/form-recognizer/fr-data-privacy-security?context=/azure/applied-ai-services/form-recognizer/context/context).

+ - question: |
+     How are my trained custom models stored and used in Form Recognizer?
+   answer: |
+     The custom model feature allows customers to build custom models from training data stored in the customer's Azure blob storage locations. The interim outputs after analysis and labeling are stored in the same location. The trained custom models are stored in Azure storage in the same region and logically isolated with the customer's Azure subscription and API credentials.