articles/search/cognitive-search-concept-intro.md
Lines changed: 18 additions & 8 deletions
@@ -8,18 +8,26 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 01/14/2022
+ms.date: 02/01/2022
 ms.custom: references_regions
 ---
 # AI enrichment in Azure Cognitive Search
 
-In Azure Cognitive Search, AI enrichment refers to a pipeline process that adds machine learning to [indexer-based indexing](search-indexer-overview.md). Steps in the pipeline create information where none previously existed. For example, steps in the pipeline can extract information from images, detect sentiment or key phrases from chunks of text, and recognize entities. These processes transform unsearchable content into searchable text, for full text search and knowledge mining scenarios.
+*AI enrichment* is the application of machine learning models over raw content, where analysis and inference are used to create searchable content and structure where none previously existed. Because Azure Cognitive Search is a full text search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios:
 
-[**Azure Blob Storage**](../storage/blobs/storage-blobs-overview.md) is a frequently used input, but any supported data source can provide the initial content. A [**skillset**](cognitive-search-working-with-skillsets.md), attached to an indexer, adds the AI processing. The indexer extracts content and sets up the pipeline. The skillset performs enrichment steps over blob, image, and raw text inputs. Output is always a [**search index**](search-what-is-an-index.md), and optionally a [**knowledge store**](knowledge-store-concept-intro.md).
++ Machine translation and language detection support multi-lingual search
++ Entity recognition finds people, places, and other entities in large chunks of text
++ Key phrase extraction identifies and then aggregates important terms
++ Optical Character Recognition (OCR) extracts text from binary files
++ Image analysis tags and describes images in searchable text fields
+
+AI enrichment is an extension of [**indexers**](search-indexer-overview.md). The types of AI enrichment and order of operations are defined by a [**skillset**](cognitive-search-working-with-skillsets.md).
+
+[**Blobs in Azure Storage**](../storage/blobs/storage-blobs-overview.md) are the most common input, but any supported data source can provide the initial content. A [**skillset**](cognitive-search-working-with-skillsets.md), attached to an indexer, adds the AI processing. The indexer extracts content and sets up the pipeline. The skillset performs the enrichment steps. Output is always a [**search index**](search-what-is-an-index.md), and optionally a [**knowledge store**](knowledge-store-concept-intro.md).
-Skillsets are composed of [*built-in skills*](cognitive-search-predefined-skills.md) from Cognitive Search or [*custom skills*](cognitive-search-create-custom-skill-example.md) for external processing that you provide. Custom skills are not always complex. For example, if you have an existing package that provides pattern matching or a document classification model, you can wrap it in a custom skill.
+Skillsets are composed of [*built-in skills*](cognitive-search-predefined-skills.md) from Cognitive Search or [*custom skills*](cognitive-search-create-custom-skill-example.md) for external processing that you provide. Custom skills aren’t always complex. For example, if you have an existing package that provides pattern matching or a document classification model, you can wrap it in a custom skill.
 
 Built-in skills fall into these categories:
 
@@ -59,7 +67,9 @@ A [skillset](cognitive-search-defining-skillset.md) that's assembled using built
 
 + Unstructured or semi-structured documents containing content that has inherent meaning or context that is hidden in the larger document.
 
-Blobs in particular often contain a large body of content that is packed into a single "field". By attaching image and natural language processing skills to an indexer, you can create information that is extant in the raw content, but not otherwise surfaced as distinct fields. Some ready-to-use built-in cognitive skills that can help: [Key Phrase Extraction](cognitive-search-skill-keyphrases.md) and [Entity Recognition](cognitive-search-skill-entity-recognition-v3.md) (people, organizations, and locations to name a few).
+Blobs in particular often contain a large body of content that is packed into a single "field". By attaching image and natural language processing skills to an indexer, you can create information that is extant in the raw content, but not otherwise surfaced as distinct fields.
+
+Some ready-to-use built-in cognitive skills that can help: [Key Phrase Extraction](cognitive-search-skill-keyphrases.md) and [Entity Recognition](cognitive-search-skill-entity-recognition-v3.md) (people, organizations, and locations to name a few).
 
 Additionally, built-in skills can also be used to restructure content through text split, merge, and shape operations.
 
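To make the built-in skills named in this hunk more concrete, here is a minimal sketch of a skillset definition, built as a Python dictionary that could be serialized to JSON for a Create Skillset request. The skillset name, the `pages` target name, the page length, and the chosen entity categories are illustrative assumptions, not values from the article. The `@odata.type` strings and the input/output names follow the published skill reference as understood here and should be checked against it before use.

```python
import json


def build_skillset(name: str) -> dict:
    """Return a skillset definition that splits text, then enriches each chunk.

    All names and parameter values below are illustrative assumptions.
    """
    split_skill = {
        "@odata.type": "#Microsoft.Skills.Text.SplitSkill",
        "context": "/document",
        "textSplitMode": "pages",
        "maximumPageLength": 4000,  # assumed chunk size
        "inputs": [{"name": "text", "source": "/document/content"}],
        "outputs": [{"name": "textItems", "targetName": "pages"}],
    }
    key_phrase_skill = {
        "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
        "context": "/document/pages/*",  # runs once per chunk
        "inputs": [{"name": "text", "source": "/document/pages/*"}],
        "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
    }
    entity_skill = {
        "@odata.type": "#Microsoft.Skills.Text.V3.EntityRecognitionSkill",
        "context": "/document/pages/*",
        "categories": ["Person", "Organization", "Location"],
        "inputs": [{"name": "text", "source": "/document/pages/*"}],
        "outputs": [
            {"name": "persons", "targetName": "people"},
            {"name": "organizations", "targetName": "organizations"},
            {"name": "locations", "targetName": "locations"},
        ],
    }
    return {
        "name": name,
        "description": "Split text, then extract key phrases and entities",
        "skills": [split_skill, key_phrase_skill, entity_skill],
    }


skillset = build_skillset("demo-skillset")
print(json.dumps(skillset, indent=2))
```

The split skill runs first so that the two text skills operate on bounded chunks rather than on one large blob "field", which is the restructuring role the text describes.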
@@ -68,7 +78,7 @@ A [skillset](cognitive-search-defining-skillset.md) that's assembled using built
 Custom skills can support more complex scenarios, such as recognizing forms, or custom entity detection using a model that you provide and wrap in the [custom skill web interface](cognitive-search-custom-skill-interface.md). Several examples of custom skills include:
@@ -87,7 +97,7 @@ This step assembles all of the initial or raw content that will undergo AI enric
 
 ### Step 2: Skillset enrichment phase
 
-A skillset defines the atomic operations that are performed on each document. For example, for text and images extracted from a PDF, a skillset might apply entity recognition, language detection, or key phrase extraction to produce new fields in your index that are not available natively in the source.
+A skillset defines the atomic operations that are performed on each document. For example, for text and images extracted from a PDF, a skillset might apply entity recognition, language detection, or key phrase extraction to produce new fields in your index that aren’t available natively in the source.
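One way to picture how skill output becomes "new fields in your index" is the indexer's output field mappings, which route enriched nodes to index fields. The sketch below builds such an indexer definition as a Python dictionary. Every object name and enrichment path in it is a hypothetical placeholder, and the property names reflect the indexer REST schema as understood here rather than anything stated in this article.

```python
import json


def build_indexer(name: str, data_source: str, index: str,
                  skillset: str, enriched_fields: dict) -> dict:
    """Return an indexer definition that maps enriched nodes to index fields.

    enriched_fields maps an enrichment path (assumed, for example
    "/document/keyPhrases") to the name of a field in the target index.
    """
    mappings = [
        {"sourceFieldName": source_path, "targetFieldName": field_name}
        for source_path, field_name in enriched_fields.items()
    ]
    return {
        "name": name,
        "dataSourceName": data_source,
        "targetIndexName": index,
        "skillsetName": skillset,
        "outputFieldMappings": mappings,
    }


# Hypothetical names throughout; substitute your own objects and paths.
indexer = build_indexer(
    name="demo-indexer",
    data_source="demo-blob-datasource",
    index="demo-index",
    skillset="demo-skillset",
    enriched_fields={
        "/document/keyPhrases": "keyPhrases",
        "/document/languageCode": "language",
    },
)
print(json.dumps(indexer, indent=2))
```

Each target field must already exist in the index schema; the mapping only routes enriched output into it.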
@@ -145,7 +155,7 @@ In Azure Storage, a [knowledge store](knowledge-store-concept-intro.md) can assu
 
 1. Run queries to evaluate results and modify code to update skillsets, schema, or indexer configuration.
 
-To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) before you run it. Or, delete and recreate the objects on each run (recommended if you are using the free tier). You should also [enable enrichment caching](cognitive-search-incremental-indexing-conceptual.md) to reuse existing enrichments wherever possible.
+To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) before you run it. Or, delete and recreate the objects on each run (recommended if you’re using the free tier). You should also [enable enrichment caching](cognitive-search-incremental-indexing-conceptual.md) to reuse existing enrichments wherever possible.
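The reset-then-run sequence described here can be sketched against the indexer REST endpoints using only the Python standard library. The service name, indexer name, and key are placeholders, and the `api-version` value is an assumption that may need updating. `reset_and_run` is defined but not called, so the example makes no network requests as written.

```python
import urllib.request

API_VERSION = "2020-06-30"  # assumed; check the version your service supports


def build_indexer_request(service: str, indexer: str, action: str,
                          api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request that resets or runs an indexer."""
    if action not in ("reset", "run"):
        raise ValueError(f"unsupported action: {action!r}")
    url = (f"https://{service}.search.windows.net/indexers/"
           f"{indexer}/{action}?api-version={API_VERSION}")
    # Empty body; the admin key travels in the api-key header.
    return urllib.request.Request(
        url, data=b"", method="POST", headers={"api-key": api_key}
    )


def reset_and_run(service: str, indexer: str, api_key: str) -> list:
    """Reset the indexer, then run it; returns the two HTTP status codes."""
    statuses = []
    for action in ("reset", "run"):
        req = build_indexer_request(service, indexer, action, api_key)
        with urllib.request.urlopen(req) as resp:
            statuses.append(resp.status)
    return statuses


# Inspect the request without sending it (placeholder values).
req = build_indexer_request("demo-service", "demo-indexer", "reset", "fake-key")
print(req.get_method(), req.full_url)
```

Resetting first clears the indexer's change-tracking state so that the following run reprocesses all documents instead of only new or changed ones.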