Commit ec691e6

Merge pull request #1275 from HeidiSteen/heidist-hnsw
[azure search] Preview SLA stamp on preview feature docs
2 parents 4c87852 + b120980 commit ec691e6

5 files changed, +39 -22 lines changed

articles/search/cognitive-search-skill-document-intelligence-layout.md

Lines changed: 16 additions & 10 deletions
Original file line number | Diff line number | Diff line change
@@ -1,5 +1,5 @@
11
---
2-
title: Document Intelligence Layout skill
2+
title: Document Layout skill
33
titleSuffix: Azure AI Search
44
description: Analyze a document to extract regions of interest and their inter-relationships to produce a syntactical representation (markdown format) in an enrichment pipeline in Azure AI Search.
55

@@ -10,17 +10,23 @@ ms.service: azure-ai-search
1010
ms.custom:
1111
- references_regions
1212
ms.topic: reference
13-
ms.date: 10/10/2024
13+
ms.date: 11/19/2024
1414
---
15-
# Document Intelligence Layout skill
1615

17-
The **Document Intelligence Layout** skill analyzes a document to extract regions of interest and their inter-relationships to produce a syntactical representation (markdown format). This skill uses the [Document Intelligence layout model](/azure/ai-services/document-intelligence/concept-layout) provided in [Azure AI Document Intelligence](/azure/ai-services/document-intelligence/overview). This article is the reference documentation for the Document Intelligence Layout skill.
16+
# Document Layout skill
1817

19-
+ The **Document Intelligence Layout** skill uses [Document Intelligence Public preview version 2024-07-31-preview](/rest/api/aiservices/operation-groups?view=rest-aiservices-v4.0%20(2024-07-31-preview)&preserve-view=true). It's currently only available in the following Azure regions:
20-
+ East US
21-
+ West US2
22-
+ West Europe
23-
+ North Central US
18+
[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
19+
20+
The **Document Layout** skill analyzes a document to extract regions of interest and their inter-relationships to produce a syntactical representation (markdown format). This skill uses the [Document Intelligence layout model](/azure/ai-services/document-intelligence/concept-layout) provided in [Azure AI Document Intelligence](/azure/ai-services/document-intelligence/overview).
21+
22+
This article is the reference documentation for the Document Layout skill.
23+
24+
The **Document Layout** skill calls the [Document Intelligence Public preview version 2024-07-31-preview](/rest/api/aiservices/operation-groups?view=rest-aiservices-v4.0%20(2024-07-31-preview)&preserve-view=true). It's currently only available in the following Azure regions:
25+
26+
+ East US
27+
+ West US2
28+
+ West Europe
29+
+ North Central US
2430

2531
Supported file formats include:
2632

@@ -44,12 +50,12 @@ Supported file formats include:
4450
Microsoft.Skills.Util.DocumentIntelligenceLayoutSkill
4551

4652
## Data limits
53+
4754
+ For PDF and TIFF, up to 2,000 pages can be processed (with a free tier subscription, only the first two pages are processed).
4855
+ The file size for analyzing documents is 500 MB for [Azure AI Document Intelligence paid (S0) tier](https://azure.microsoft.com/pricing/details/cognitive-services/) and 4 MB for [Azure AI Document Intelligence free (F0) tier](https://azure.microsoft.com/pricing/details/cognitive-services/).
4956
+ Image dimensions must be between 50 pixels x 50 pixels and 10,000 pixels x 10,000 pixels.
5057
+ If your PDFs are password-locked, you must remove the lock before submission.
5158

52-
5359
## Skill parameters
5460

5561
Parameters are case-sensitive.

articles/search/search-how-to-index-markdown-blobs.md

Lines changed: 5 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -10,14 +10,17 @@ ms.service: azure-ai-search
1010
ms.custom:
1111
- ignite-2024
1212
ms.topic: how-to
13-
ms.date: 10/22/2024
13+
ms.date: 11/19/2024
1414
---
1515

1616
# Index Markdown blobs and files in Azure AI Search
1717

18+
[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
19+
1820
**Applies to**: [Blob indexers](search-howto-indexing-azure-blob-storage.md), [OneLake indexers](search-how-to-index-onelake-files.md), [File indexers](search-file-storage-integration.md)
1921

20-
In Azure AI Search, indexers for Azure Blob Storage and Azure Files support a `markdown` parsing mode for Markdown files. Markdown files can be indexed in two ways:
22+
In Azure AI Search, indexers for Azure Blob Storage and Azure Files support a `markdown` parsing mode for Markdown files. Markdown files can be indexed in two ways:
23+
2124
+ One-to-many parsing mode
2225
+ One-to-one parsing mode
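
As a minimal sketch of how one of these modes might be selected, the indexer definition can set `parsingMode` to `markdown` under `parameters.configuration`. The `markdownParsingSubmode` and `markdownHeaderDepth` property names are assumptions based on the 2024-11-01-preview REST API and should be checked against the current reference. The indexer, data source, and index names are hypothetical placeholders.

```json
{
  "name": "example-markdown-indexer",
  "dataSourceName": "example-markdown-datasource",
  "targetIndexName": "example-markdown-index",
  "parameters": {
    "configuration": {
      "parsingMode": "markdown",
      "markdownParsingSubmode": "oneToMany",
      "markdownHeaderDepth": "h6"
    }
  }
}
```

Under the same assumptions, changing `markdownParsingSubmode` to `oneToOne` would produce a single search document per Markdown file instead of one per section.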
2326

articles/search/search-how-to-semantic-chunking.md

Lines changed: 13 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -11,28 +11,32 @@ ms.custom:
1111
- references_regions
1212
---
1313

14-
# Semantic chunking and vectorization using the Document Intelligence Layout skill and index projections
14+
# Semantic chunking and vectorization using the Document Layout skill and index projections
15+
16+
[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
17+
1518
Text data chunking strategies play a key role in optimizing RAG (Retrieval-Augmented Generation) responses and performance. Semantic chunking finds semantically coherent fragments of a sentence representation. These fragments can then be processed independently and recombined as semantic representations without loss of information, interpretation, or semantic relevance. The inherent meaning of the text guides the chunking process. Markdown is a structured and formatted markup language and a popular input for enabling semantic chunking in RAG.
1619

17-
The Document Intelligence Layout skill offers a comprehensive solution for advanced content extraction and chunk functionality. With the Layout skill, you can easily extract document layout and content as markdown format and utilize markdown parsing mode to produce a set of document chunks
20+
The Document Layout skill offers a comprehensive solution for advanced content extraction and chunking. With the Layout skill, you can extract document layout and content in markdown format and use markdown parsing mode to produce a set of document chunks.
1821

1922
This article shows:
20-
+ How to use the document intelligence layout skill to extract markdown sections
23+
+ How to use the Document Layout skill to extract markdown sections
2124
+ How to apply split skill to constrain chunk size within each markdown section
2225
+ How to generate embeddings for the content within those sections
2326
+ How to use index projections to compile and write them into a search index.
2427

2528
## Prerequisites
29+
2630
+ An [indexer-based indexing pipeline](search-indexer-overview.md).
2731
+ An index that accepts the output of the indexer pipeline.
2832
+ A [supported data source](search-indexer-overview.md#supported-data-sources) having content that you want to chunk.
29-
+ A [Document Intelligence Layout skill](cognitive-search-skill-document-intelligence-layout.md) that splits documents based on paragraph boundaries.
33+
+ A [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) that splits documents based on paragraph boundaries.
3034
+ An [Azure OpenAI Embedding skill](cognitive-search-skill-azure-openai-embedding.md) that generates vector embeddings
3135
+ An [index projection](search-how-to-define-index-projections.md) for one-to-many indexing
3236

3337
## Prepare data files
3438

35-
The raw inputs must be in a [supported data source](search-indexer-overview.md#supported-data-sources) and the file needs to be a format which [Document Intelligence Layout skill](cognitive-search-skill-document-intelligence-layout.md) supports.
39+
The raw inputs must be in a [supported data source](search-indexer-overview.md#supported-data-sources), and each file must be in a format that the [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) supports.
3640

3741
+ Supported file formats: PDF, JPEG, JPG, PNG, BMP, TIFF, DOCX, XLSX, PPTX, HTML
3842

@@ -164,7 +168,7 @@ An index must exist on the search service before you create the skill set or run
164168

165169
You can use the REST APIs to [create or update a skill set](cognitive-search-defining-skillset.md).
166170

167-
Here's an example skill set definition payload to project individual markdown sections chunks and their vector outputs as documents in the search index using the [Document Intelligence Layout skill](cognitive-search-skill-document-intelligence-layout.md) and [Azure OpenAI Embedding skill](cognitive-search-skill-azure-openai-embedding.md)
171+
Here's an example skill set definition payload that projects individual markdown section chunks and their vector outputs as documents in the search index, using the [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md) and the [Azure OpenAI Embedding skill](cognitive-search-skill-azure-openai-embedding.md).
168172

169173
```json
170174
{
@@ -286,7 +290,7 @@ Here's an example skill set definition payload to project individual markdown se
286290
## Run the indexer
287291
Once you create a data source, indexes, and skill set, you're ready to [create and run the indexer](search-howto-create-indexers.md#run-the-indexer). This step puts the pipeline into execution.
288292

289-
When using the [Document Intelligence Layout skill](cognitive-search-skill-document-intelligence-layout.md), make sure to set the following parameters on the indexer definition:
293+
When using the [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md), make sure to set the following parameters on the indexer definition:
290294
+ The `allowSkillsetToReadFileData` parameter should be set to "true."
291295
+ The `parsingMode` parameter should be set to "default."
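
The two settings above might look like the following in an indexer definition. This is a sketch only: placing both settings under `parameters.configuration` and using a JSON boolean for `allowSkillsetToReadFileData` are assumptions about the indexer REST payload, and the indexer, data source, index, and skillset names are hypothetical placeholders.

```json
{
  "name": "example-layout-indexer",
  "dataSourceName": "example-datasource",
  "targetIndexName": "example-index",
  "skillsetName": "example-layout-skillset",
  "parameters": {
    "configuration": {
      "allowSkillsetToReadFileData": true,
      "parsingMode": "default"
    }
  }
}
```

With `allowSkillsetToReadFileData` enabled, the skillset can read the original file content, which the Document Layout skill analyzes.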
292296

@@ -330,10 +334,11 @@ POST /indexes/[index name]/docs/search?api-version=[api-version]
330334
```
331335

332336
## See also
337+
333338
+ [Create a data source](search-howto-indexing-azure-blob-storage.md)
334339
+ [Define an index projection](search-how-to-define-index-projections.md)
335340
+ [How to define a skill set](cognitive-search-defining-skillset.md)
336-
+ [Document Intelligence Layout skill](cognitive-search-skill-document-intelligence-layout.md)
341+
+ [Document Layout skill](cognitive-search-skill-document-intelligence-layout.md)
337342
+ [Azure OpenAI Embedding skill](cognitive-search-skill-azure-openai-embedding.md)
338343
+ [Create indexer (REST)](/rest/api/searchservice/indexers/create)
339344
+ [Search Explorer](search-explorer.md)

articles/search/search-markdown-data-tutorial.md

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -7,14 +7,16 @@ author: mdonovan
77
ms.author: mdonovan
88
ms.service: azure-ai-search
99
ms.custom:
10-
- ignite-2023
10+
- ignite-2024
1111
ms.topic: tutorial
12-
ms.date: 10/24/2024
12+
ms.date: 11/19/2024
1313

1414
---
1515

1616
# Tutorial: Index nested Markdown blobs from Azure Storage using REST
1717

18+
[!INCLUDE [Feature preview](./includes/previews/preview-generic.md)]
19+
1820
Azure AI Search can index Markdown documents and arrays in Azure Blob Storage using an [indexer](search-indexer-overview.md) that knows how to read Markdown data.
1921

2022
This tutorial shows you how to index Markdown files using the `oneToMany` Markdown parsing mode. It uses a REST client and the [Search REST APIs](/rest/api/searchservice/) to perform the following tasks:

articles/search/whats-new.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,7 @@ ms.custom:
2525
|-----------------------------|------|--------------|
2626
| [**Add Azure AI Search to a network security perimeter**](search-security-network-security-perimiter.md) | Security | Join a search service to a [network security perimeter](/azure/private-link/network-security-perimeter-concepts) to control network access to your search service. The Azure portal and the Management REST APIs in the [2024-06-01-preview](/rest/api/searchmanagement/network-security-perimeter-configurations?view=rest-searchmanagement-2024-06-01-preview&preserve-view=true) can be used to view and reconcile network security perimeter configurations. |
2727
| [**Query rewrite in the semantic reranker**](semantic-how-to-query-rewrite.md) | Relevance (scoring) | You can set options on a semantic query to rewrite the query input into a revised or expanded query that generates more relevant results from the L2 ranker. Available in the [Search Documents (2024-11-01-preview)](/rest/api/searchservice/documents/search-post?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature.|
28+
| [**New semantic ranker models**](semantic-search-overview.md) | Relevance (scoring) | Semantic ranker runs with improved models in all supported regions. There is no change to APIs or the portal experience. |
2829
| [**Document Layout skill**](cognitive-search-skill-document-intelligence-layout.md) | Applied AI (skills) | A new skill used to analyze a document for structure and provide [structure-aware chunking](search-how-to-semantic-chunking.md). This skill calls Document Intelligence and uses the Document Intelligence layout model. Available in selected regions through the [Create or Update Skillset (2024-11-01-preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature.|
2930
| [**Managed identity for keyless billing to an Azure AI multiservice subdomain**](cognitive-search-attach-cognitive-services.md). | Applied AI (skills) | You can now use a managed identity and roles for a keyless connection to Azure AI services for built-in skills processing. This capability removes restrictions for having both search and AI services in the same region. Available in the [Create or Update Skillset (2024-11-01-preview)](/rest/api/searchservice/skillsets/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
3031
| [**Markdown parsing mode**](search-how-to-index-markdown-blobs.md) | Indexer data source | With this parsing mode, indexers can generate one-to-one or one-to-many search documents from Markdown files in Azure Storage. Available in the [Create or Update Indexer (2024-11-01-preview)](/rest/api/searchservice/indexers/create-or-update?view=rest-searchservice-2024-11-01-preview&preserve-view=true), the Azure portal, and in the Azure SDK beta packages that provide this feature. |
