Commit b2b92c3

Merge pull request #284549 from laujan/296926-update-input-include
update include file
2 parents 9296d2c + 3faf59d commit b2b92c3

4 files changed: +24 -25 lines changed
articles/ai-services/document-intelligence/concept-accuracy-confidence.md

Lines changed: 2 additions & 1 deletion
@@ -16,9 +16,10 @@ ms.author: lajanuar
 
 A confidence score indicates probability by measuring the degree of statistical certainty that the extracted result is detected correctly. The estimated accuracy is calculated by running a few different combinations of the training data to predict the labeled values. In this article, learn to interpret accuracy and confidence scores and best practices for using those scores to improve accuracy and confidence results.
 
-
 ## Confidence scores
+
 > [!NOTE]
+>
 > * Field level confidence is getting update to take into account word confidence score starting with **2024-07-31-preview** API version for **custom models**.
 > * Confidence scores for tables, table rows and table cells are available starting with the **2024-07-31-preview** API version for **custom models**.
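
The note in this hunk concerns field-level confidence scores. One common way to act on such scores is to accept high-confidence fields automatically and route the rest to manual review. The following is only a minimal sketch of that idea: the `split_by_confidence` helper, the `0.8` threshold, and the sample field names are hypothetical and are not taken from the article or from any SDK.

```python
from typing import Dict, List, Tuple

# Hypothetical threshold (an assumption); tune it against your own labeled data.
REVIEW_THRESHOLD = 0.8


def split_by_confidence(
    fields: Dict[str, Tuple[str, float]],
    threshold: float = REVIEW_THRESHOLD,
) -> Tuple[Dict[str, str], List[str]]:
    """Split extracted fields into auto-accepted values and names needing review.

    `fields` maps a field name to a (value, confidence) pair.
    """
    accepted: Dict[str, str] = {}
    needs_review: List[str] = []
    for name, (value, confidence) in fields.items():
        if confidence >= threshold:
            accepted[name] = value
        else:
            needs_review.append(name)
    return accepted, needs_review


# Illustrative data only; real results come from the service response.
sample_fields = {
    "InvoiceId": ("INV-100", 0.97),
    "Total": ("110.00", 0.62),
}
accepted, needs_review = split_by_confidence(sample_fields)
```

A single global threshold is the simplest choice. A per-field threshold may fit better when some fields matter more than others.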

articles/ai-services/document-intelligence/concept-retrieval-augmented-generation.md

Lines changed: 6 additions & 6 deletions
@@ -6,8 +6,8 @@ author: laujan
 manager: nitinme
 ms.service: azure-ai-document-intelligence
 ms.topic: conceptual
-ms.date: 02/29/2024
-ms.author: luzhan
+ms.date: 08/13/2024
+ms.author: lajanuar
 monikerRange: '>=doc-intel-3.1.0'
 ---
 
@@ -130,27 +130,27 @@ key = "<api_key>"
 
 from langchain_community.document_loaders import AzureAIDocumentIntelligenceLoader
 from langchain.text_splitter import MarkdownHeaderTextSplitter
-
+
 # Initiate Azure AI Document Intelligence to load the document. You can either specify file_path or url_path to load the document.
 loader = AzureAIDocumentIntelligenceLoader(file_path="<path to your file>", api_key = key, api_endpoint = endpoint, api_model="prebuilt-layout")
 docs = loader.load()
-
+
 # Split the document into chunks base on markdown headers.
 headers_to_split_on = [
     ("#", "Header 1"),
     ("##", "Header 2"),
     ("###", "Header 3"),
 ]
 text_splitter = MarkdownHeaderTextSplitter(headers_to_split_on=headers_to_split_on)
-
+
 docs_string = docs[0].page_content
 splits = text_splitter.split_text(docs_string)
 splits
 ```
+
 > [!div class="nextstepaction"]
 > [View samples on GitHub.](https://github.com/Azure-Samples/document-intelligence-code-samples/blob/main/Python(v4.0)/Retrieval_Augmented_Generation_(RAG)_samples/sample_rag_langchain.ipynb)
 
-
 ## Next steps
 
 * Learn more about [Azure AI Document Intelligence](overview.md).
Lines changed: 13 additions & 15 deletions
@@ -1,39 +1,37 @@
 ---
 author: laujan
 ms.service: azure-ai-document-intelligence
-ms.custom:
-  - ignite-2023
 ms.topic: include
-ms.date: 11/15/2023
+ms.date: 08/13/2024
 ms.author: lajanuar
 ---
 <!-- markdownlint-disable MD041 -->
 
-* For best results, provide one clear photo or high-quality scan per document.
-
 * Supported file formats:
 
-|Model | PDF |Image: </br>JPEG/JPG, PNG, BMP, TIFF, HEIF | Microsoft Office: </br> Word (DOCX), Excel (XLSX), PowerPoint (PPTX), and HTML|
-|--------|:----:|:-----:|:---------------:
+|Model | PDF |Image: </br>`JPEG/JPG`, `PNG`, `BMP`, `TIFF`, `HEIF` | Microsoft Office: </br> Word (`DOCX`), Excel (`XLSX`), PowerPoint (`PPTX`), HTML|
+|--------|:----:|:-----:|:---------------:|
 |Read ||||
-|Layout ||| ✔ (2024-02-29-preview, 2023-10-31-preview) |
+|Layout ||| ✔ (2024-07-31-preview, 2024-02-29-preview, 2023-10-31-preview) |
 |General&nbsp;Document||| |
 |Prebuilt ||| |
 |Custom extraction ||| |
-|Custom classification ||| ✔ (2024-02-29-preview) |
+|Custom classification ||| ✔ (2024-07-31-preview, 2024-02-29-preview) |
+
+* For best results, provide one clear photo or high-quality scan per document.
 
-* For PDF and TIFF, up to 2000 pages can be processed (with a free tier subscription, only the first two pages are processed).
+* For PDF and TIFF, up to 2,000 pages can be processed (with a free tier subscription, only the first two pages are processed).
 
-* The file size for analyzing documents is 500 MB for paid (S0) tier and 4 MB for free (F0) tier.
+* The file size for analyzing documents is 500 MB for paid (S0) tier and `4` MB for free (F0) tier.
 
-* Image dimensions must be between 50 x 50 pixels and 10,000 px x 10,000 pixels.
+* Image dimensions must be between 50 pixels x 50 pixels and 10,000 pixels x 10,000 pixels.
 
 * If your PDFs are password-locked, you must remove the lock before submission.
 
-* The minimum height of the text to be extracted is 12 pixels for a 1024 x 768 pixel image. This dimension corresponds to about `8`-point text at 150 dots per inch (DPI).
+* The minimum height of the text to be extracted is 12 pixels for a 1024 x 768 pixel image. This dimension corresponds to about `8` point text at 150 dots per inch (DPI).
 
 * For custom model training, the maximum number of pages for training data is 500 for the custom template model and 50,000 for the custom neural model.
 
-* For custom extraction model training, the total size of training data is 50 MB for template model and 1G-MB for the neural model.
+* For custom extraction model training, the total size of training data is 50 MB for template model and `1` GB for the neural model.
 
-* For custom classification model training, the total size of training data is `1GB` with a maximum of 10,000 pages.
+* For custom classification model training, the total size of training data is `1` GB with a maximum of 10,000 pages. For 2024-07-31-preview and later, the total size of training data is `2` GB with a maximum of 10,000 pages.
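
The input requirements in this include file lend themselves to a pre-submission check. The sketch below encodes only the file-size and image-dimension limits quoted above. The `check_input` helper and its messages are hypothetical, and reading "MB" as 2**20 bytes is an assumption rather than something the include file states.

```python
from typing import List, Optional

MB = 1024 * 1024  # assumption: "MB" read as 2**20 bytes

# Limits quoted in the include file: 500 MB for paid (S0), 4 MB for free (F0).
MAX_FILE_BYTES = {"S0": 500 * MB, "F0": 4 * MB}
MIN_IMAGE_SIDE_PX = 50
MAX_IMAGE_SIDE_PX = 10_000


def check_input(
    size_bytes: int,
    tier: str,
    width_px: Optional[int] = None,
    height_px: Optional[int] = None,
) -> List[str]:
    """Return a list of problems found; an empty list means no limit was hit."""
    problems: List[str] = []
    limit = MAX_FILE_BYTES[tier]
    if size_bytes > limit:
        problems.append(f"file exceeds the {limit // MB} MB limit for tier {tier}")
    # Dimension checks apply only when the caller supplies image dimensions.
    if width_px is not None and height_px is not None:
        sides_ok = all(
            MIN_IMAGE_SIDE_PX <= side <= MAX_IMAGE_SIDE_PX
            for side in (width_px, height_px)
        )
        if not sides_ok:
            problems.append(
                f"image dimensions must be between {MIN_IMAGE_SIDE_PX} "
                f"and {MAX_IMAGE_SIDE_PX} pixels per side"
            )
    return problems


# A 5 MB file passes on the paid tier but not on the free tier.
paid_result = check_input(5 * MB, "S0")
free_result = check_input(5 * MB, "F0")
```

Returning a list of problems rather than raising on the first one lets a caller report every violated limit at once.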

articles/ai-services/translator/document-translation/how-to-guides/create-sas-tokens.md

Lines changed: 3 additions & 3 deletions
@@ -6,7 +6,7 @@ ms.topic: how-to
 manager: nitinme
 ms.author: lajanuar
 author: laujan
-ms.date: 02/12/2024
+ms.date: 08/13/2024
 ---
 
 # Create SAS tokens for your storage containers
@@ -82,9 +82,9 @@ Go to the [Azure portal](https://portal.azure.com/#home) and navigate to your co
 * Consider setting a longer duration period for the time you're using your storage account for Translator Service operations.
 * The value of the expiry time is determined by whether you're using an **Account key** or **User delegation key** **Signing method**:
 * **Account key**: While a maximum time limit isn't imposed, best practice recommends that you configure an expiration policy to limit the interval and minimize compromise. [Configure an expiration policy for shared access signatures](/azure/storage/common/sas-expiration-policy).
-* **User delegation key**: The value for the expiry time is a maximum of seven days from the creation of the SAS token. The SAS is invalid after the user delegation key expires, so a SAS with an expiry time of greater than seven days will still only be valid for seven days. For more information,*see* [Use Microsoft Entra credentials to secure a SAS](/azure/storage/blobs/storage-blob-user-delegation-sas-create-cli#use-azure-ad-credentials-to-secure-a-sas).
+* **User delegation key**: The value for the expiry time is a maximum of seven days from the creation of the SAS token. The SAS is invalid after the user delegation key expires, so a SAS with an expiry time of greater than seven days will still only be valid for seven days. For more information, *see* [Use Microsoft Entra credentials to secure a SAS](/azure/storage/blobs/storage-blob-user-delegation-sas-create-cli#use-azure-ad-credentials-to-secure-a-sas).
 
-1. The **Allowed IP addresses** field is optional and specifies an IP address or a range of IP addresses from which to accept requests. If the request IP address doesn't match the IP address or address range specified on the SAS token, authorization fails. The IP address or a range of IP addresses must be public IPs, not private. For more information,*see*, [**Specify an IP address or IP range**](/rest/api/storageservices/create-account-sas#specify-an-ip-address-or-ip-range).
+1. The **Allowed IP addresses** field is optional and specifies an IP address or a range of IP addresses from which to accept requests. If the request IP address doesn't match the IP address or address range specified on the SAS token, authorization fails. The IP address or a range of IP addresses must be public IPs, not private. For more information, *see*, [**Specify an IP address or IP range**](/rest/api/storageservices/create-account-sas#specify-an-ip-address-or-ip-range).
 
 1. The **Allowed protocols** field is optional and specifies the protocol permitted for a request made with the SAS. The default value is HTTPS.
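
The user delegation key rule in this hunk can be read as a clamp: a SAS signed with a user delegation key is valid until its requested expiry or until seven days after creation, whichever comes first. A minimal sketch of that arithmetic follows. The `effective_expiry` helper is hypothetical, is not part of any Azure SDK, and only illustrates the rule as stated in the article.

```python
from datetime import datetime, timedelta, timezone

# Seven-day ceiling described in the article for user delegation keys.
USER_DELEGATION_MAX = timedelta(days=7)


def effective_expiry(created: datetime, requested_expiry: datetime) -> datetime:
    """Return the moment the SAS actually stops being valid.

    A requested expiry beyond the seven-day ceiling is cut back to the ceiling.
    """
    latest_allowed = created + USER_DELEGATION_MAX
    return min(requested_expiry, latest_allowed)


# Illustrative timestamps only.
created = datetime(2024, 8, 13, tzinfo=timezone.utc)
requested = created + timedelta(days=30)
effective = effective_expiry(created, requested)
```

Using timezone-aware UTC datetimes avoids comparing naive and aware values, which raises a `TypeError` in Python.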
