Commit 27ceb8b

committed
Revised AI enrichment intro
1 parent 1cd2691 commit 27ceb8b

1 file changed

+18
-8
lines changed

articles/search/cognitive-search-concept-intro.md

Lines changed: 18 additions & 8 deletions
@@ -8,18 +8,26 @@ author: HeidiSteen
88
ms.author: heidist
99
ms.service: cognitive-search
1010
ms.topic: conceptual
11-
ms.date: 01/14/2022
11+
ms.date: 02/01/2022
1212
ms.custom: references_regions
1313
---
1414
# AI enrichment in Azure Cognitive Search
1515

16-
In Azure Cognitive Search, AI enrichment refers to a pipeline process that adds machine learning to [indexer-based indexing](search-indexer-overview.md). Steps in the pipeline create information where none previously existed. For example, steps in the pipeline can extract information from images, detect sentiment or key phrases from chunks of text, and recognize entities. These processes transform unsearchable content into searchable text, for full text search and knowledge mining scenarios.
16+
*AI enrichment* is the application of machine learning models over raw content, where analysis and inference are used to create searchable content and structure where none previously existed. Because Azure Cognitive Search is a full text search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios:
1717

18-
[**Azure Blob Storage**](../storage/blobs/storage-blobs-overview.md) is a frequently used input, but any supported data source can provide the initial content. A [**skillset**](cognitive-search-working-with-skillsets.md), attached to an indexer, adds the AI processing. The indexer extracts content and sets up the pipeline. The skillset performs enrichment steps over blob, image, and raw text inputs. Output is always a [**search index**](search-what-is-an-index.md), and optionally a [**knowledge store**](knowledge-store-concept-intro.md).
18+
+ Machine translation and language detection support multi-lingual search
19+
+ Entity recognition finds people, places, and other entities in large chunks of text
20+
+ Key phrase extraction identifies and then aggregates important terms
21+
+ Optical Character Recognition (OCR) extracts text from binary files
22+
+ Image analysis tags and describes images in searchable text fields
23+
24+
AI enrichment is an extension of [**indexers**](search-indexer-overview.md). The types of AI enrichment and order of operations are defined by a [**skillset**](cognitive-search-working-with-skillsets.md).
25+
26+
[**Blobs in Azure Storage**](../storage/blobs/storage-blobs-overview.md) are the most common input, but any supported data source can provide the initial content. A [**skillset**](cognitive-search-working-with-skillsets.md), attached to an indexer, adds the AI processing. The indexer extracts content and sets up the pipeline. The skillset performs the enrichment steps. Output is always a [**search index**](search-what-is-an-index.md), and optionally a [**knowledge store**](knowledge-store-concept-intro.md).
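
As a rough illustration of how a skillset is attached to an indexer, the sketch below builds an indexer definition as a plain Python dictionary. The object names (`demo-blob-datasource`, `demo-index`, `demo-skillset`) and the `keyPhrases` field are hypothetical, and the property names reflect the indexer REST definition as I understand it, so check them against the indexer reference before relying on them.

```python
import json


def build_indexer(name: str, data_source: str, index: str, skillset: str) -> dict:
    """Return an indexer definition that attaches a skillset (illustrative only)."""
    return {
        "name": name,
        "dataSourceName": data_source,   # where raw content comes from, e.g. a blob container
        "targetIndexName": index,        # the search index that always receives output
        "skillsetName": skillset,        # the attachment that adds AI enrichment
        # Route an enriched node to an index field (hypothetical field name).
        "outputFieldMappings": [
            {
                "sourceFieldName": "/document/keyPhrases",
                "targetFieldName": "keyPhrases",
            }
        ],
    }


indexer = build_indexer(
    "demo-indexer", "demo-blob-datasource", "demo-index", "demo-skillset"
)
print(json.dumps(indexer, indent=2))
```

Without `skillsetName`, the same definition describes an ordinary indexer, which is the sense in which enrichment extends indexing.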
1927

2028
![Enrichment pipeline diagram](./media/cognitive-search-intro/cogsearch-architecture.png "enrichment pipeline overview")
2129

22-
Skillsets are composed of [*built-in skills*](cognitive-search-predefined-skills.md) from Cognitive Search or [*custom skills*](cognitive-search-create-custom-skill-example.md) for external processing that you provide. Custom skills are not always complex. For example, if you have an existing package that provides pattern matching or a document classification model, you can wrap it in a custom skill.
30+
Skillsets are composed of [*built-in skills*](cognitive-search-predefined-skills.md) from Cognitive Search or [*custom skills*](cognitive-search-create-custom-skill-example.md) for external processing that you provide. Custom skills aren’t always complex. For example, if you have an existing package that provides pattern matching or a document classification model, you can wrap it in a custom skill.
2331

2432
Built-in skills fall into these categories:
2533

@@ -59,7 +67,9 @@ A [skillset](cognitive-search-defining-skillset.md) that's assembled using built
5967

6068
+ Unstructured or semi-structured documents containing content that has inherent meaning or context that is hidden in the larger document.
6169

62-
Blobs in particular often contain a large body of content that is packed into a single "field". By attaching image and natural language processing skills to an indexer, you can create information that is extant in the raw content, but not otherwise surfaced as distinct fields. Some ready-to-use built-in cognitive skills that can help: [Key Phrase Extraction](cognitive-search-skill-keyphrases.md) and [Entity Recognition](cognitive-search-skill-entity-recognition-v3.md) (people, organizations, and locations to name a few).
70+
Blobs in particular often contain a large body of content that is packed into a single "field". By attaching image and natural language processing skills to an indexer, you can create information that is extant in the raw content, but not otherwise surfaced as distinct fields.
71+
72+
Some ready-to-use built-in cognitive skills that can help: [Key Phrase Extraction](cognitive-search-skill-keyphrases.md) and [Entity Recognition](cognitive-search-skill-entity-recognition-v3.md) (people, organizations, and locations to name a few).
6373

6474
Additionally, built-in skills can be used to restructure content through text split, merge, and shape operations.
6575

@@ -68,7 +78,7 @@ A [skillset](cognitive-search-defining-skillset.md) that's assembled using built
6878
Custom skills can support more complex scenarios, such as recognizing forms, or custom entity detection using a model that you provide and wrap in the [custom skill web interface](cognitive-search-custom-skill-interface.md). Several examples of custom skills include:
6979

7080
+ [Form Recognizer](../applied-ai-services/form-recognizer/overview.md)
71-
+ Integration of the [Bing Entity Search API](./cognitive-search-create-custom-skill-example.md)
81+
+ [Bing Entity Search API](./cognitive-search-create-custom-skill-example.md)
7282
+ [Custom entity recognition](https://github.com/Microsoft/SkillsExtractorCognitiveSearch)
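
To show what wrapping existing logic in a custom skill can look like, here is a minimal sketch of the request and response handling, assuming the custom skill web interface exchanges batches shaped as `{"values": [{"recordId": ..., "data": {...}}]}`. The `ORDER_ID` pattern, the `text` input, and the `orderIds` output are hypothetical names chosen for illustration. Hosting the function behind an HTTP endpoint is left out.

```python
import re

# Hypothetical pattern-matching "package": order IDs such as "ORD-12345".
ORDER_ID = re.compile(r"\bORD-\d{5}\b")


def run_custom_skill(request_body: dict) -> dict:
    """Process one batch of records and return one result per input record."""
    results = []
    for record in request_body.get("values", []):
        text = record.get("data", {}).get("text", "")
        results.append(
            {
                "recordId": record["recordId"],  # echoed back so results can be matched
                "data": {"orderIds": ORDER_ID.findall(text)},
                "errors": [],
                "warnings": [],
            }
        )
    return {"values": results}


response = run_custom_skill(
    {"values": [{"recordId": "a1", "data": {"text": "See ORD-12345 and ORD-99999."}}]}
)
print(response)
```

The handler returns exactly one result per `recordId`, which lets the pipeline match each enrichment to the document that produced it.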
7383

7484
## Enrichment steps <a name="enrichment-steps"></a>
@@ -87,7 +97,7 @@ This step assembles all of the initial or raw content that will undergo AI enric
8797

8898
### Step 2: Skillset enrichment phase
8999

90-
A skillset defines the atomic operations that are performed on each document. For example, for text and images extracted from a PDF, a skillset might apply entity recognition, language detection, or key phrase extraction to produce new fields in your index that are not available natively in the source.
100+
A skillset defines the atomic operations that are performed on each document. For example, for text and images extracted from a PDF, a skillset might apply entity recognition, language detection, or key phrase extraction to produce new fields in your index that aren’t available natively in the source.
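
The operations named above can be sketched as a skillset definition. This is an illustrative Python dictionary, not a complete or authoritative definition: the skillset name and the `language` and `keyPhrases` target names are hypothetical, and the `@odata.type` values and input/output names should be verified against the built-in skill reference.

```python
import json


def build_skillset(name: str) -> dict:
    """Return a two-skill skillset: detect language, then extract key phrases."""
    return {
        "name": name,
        "description": "Illustrative skillset: language detection, then key phrases",
        "skills": [
            {
                "@odata.type": "#Microsoft.Skills.Text.LanguageDetectionSkill",
                "context": "/document",
                "inputs": [{"name": "text", "source": "/document/content"}],
                "outputs": [{"name": "languageCode", "targetName": "language"}],
            },
            {
                "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
                "context": "/document",
                "inputs": [
                    {"name": "text", "source": "/document/content"},
                    # Consumes the first skill's output, which fixes the order of operations.
                    {"name": "languageCode", "source": "/document/language"},
                ],
                "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
            },
        ],
    }


skillset = build_skillset("demo-skillset")
print(json.dumps(skillset, indent=2))
```

Each skill reads from and writes to paths under `/document`, so the second skill can depend on the first without any explicit ordering field.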
91101

92102
![Enrichment phase](./media/cognitive-search-intro/enrichment-phase-blowup.png "enrichment phase")
93103

@@ -145,7 +155,7 @@ In Azure Storage, a [knowledge store](knowledge-store-concept-intro.md) can assu
145155

146156
1. Run queries to evaluate results and modify code to update skillsets, schema, or indexer configuration.
147157

148-
To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) before you run it. Or, delete and recreate the objects on each run (recommended if you are using the free tier). You should also [enable enrichment caching](cognitive-search-incremental-indexing-conceptual.md) to reuse existing enrichments wherever possible.
158+
To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) before you run it. Or, delete and recreate the objects on each run (recommended if you’re using the free tier). You should also [enable enrichment caching](cognitive-search-incremental-indexing-conceptual.md) to reuse existing enrichments wherever possible.
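
A reset followed by a run could be scripted roughly as follows. This sketch only builds the two requests and does not send them. It assumes the REST endpoints `POST /indexers/{name}/reset` and `POST /indexers/{name}/run`, an `api-key` header, and API version `2020-06-30`. The service name, indexer name, and key are placeholders.

```python
import urllib.request

API_VERSION = "2020-06-30"  # assumption: substitute the version your service uses


def build_reset_and_run(service: str, indexer: str, api_key: str) -> list:
    """Build (but do not send) the reset and run requests, in that order."""
    base = f"https://{service}.search.windows.net/indexers/{indexer}"
    headers = {"api-key": api_key, "Content-Type": "application/json"}
    return [
        urllib.request.Request(
            f"{base}/{action}?api-version={API_VERSION}",
            data=b"",
            headers=headers,
            method="POST",
        )
        for action in ("reset", "run")
    ]


requests_to_send = build_reset_and_run("demo-service", "demo-indexer", "<admin-key>")
for req in requests_to_send:
    print(req.get_method(), req.full_url)
```

To execute the requests, pass each one to `urllib.request.urlopen` in order. Reset must come first so that the subsequent run reprocesses all documents.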
149159

150160
## Next steps
151161
