You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/search/cognitive-search-concept-intro.md
+11-9Lines changed: 11 additions & 9 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -1,33 +1,35 @@
1
1
---
2
2
title: AI enrichment concepts
3
3
titleSuffix: Azure Cognitive Search
4
-
description: Content extraction, natural language processing (NLP) and image processing are used to create searchable content in Azure Cognitive Search indexes with both pre-defined cognitive skills and custom AI algorithms.
4
+
description: Content extraction, natural language processing (NLP), and image processing are used to create searchable content in Azure Cognitive Search indexes. AI enrichment can use both pre-defined cognitive skills and custom AI algorithms.
5
5
6
6
manager: nitinme
7
7
author: HeidiSteen
8
8
ms.author: heidist
9
9
ms.service: cognitive-search
10
10
ms.topic: conceptual
11
-
ms.date: 08/10/2021
11
+
ms.date: 01/14/2022
12
12
ms.custom: references_regions
13
13
---
14
14
# AI enrichment in Azure Cognitive Search
15
15
16
-
In Azure Cognitive Search, AI enrichment refers to built-in cognitive skills and custom skills that add analysis, transformations, and content generation during indexing. Enrichments create new information where none previously existed: extracting information from images, detecting sentiment, key phrases, and entities from text, to name a few. Enrichments also add structure to undifferentiated text. All of these processes result in making previously unsearchable content available to full text search scenarios. In many instances, enriched documents are useful for scenarios other than search, such as for knowledge mining.
16
+
In Azure Cognitive Search, AI enrichment refers to a pipeline process that adds machine learning to [indexer-based indexing](search-indexer-overview.md). Steps in the pipeline create new information where none previously existed: extracting information from images, detecting sentiment or key phrases from chunks of text, and recognizing entities, to name a few. All of these processes result in making previously unsearchable content available to full text search and knowledge mining scenarios.
17
17
18
-
Enrichment is defined by a [**skillset**](cognitive-search-working-with-skillsets.md) that's attached to an [**indexer**](search-indexer-overview.md). The indexer will extract and set up the content, while the skillset identifies, analyzes, and creates new information and structures from images, blobs, and other unstructured data sources. The output of an enrichment pipeline is either a [**search index**](search-what-is-an-index.md) or a[**knowledge store**](knowledge-store-concept-intro.md).
18
+
Azure Blob Storage is the most commonly used input, but any indexer-supported data source can provide the initial content. A [**skillset**](cognitive-search-working-with-skillsets.md), attached to an indexer, adds the AI processing. The indexer extracts content and sets up the pipeline, while the skillset identifies, analyzes, and creates new information out of blobs, images, and raw text. Output is a [**search index**](search-what-is-an-index.md) or optional[**knowledge store**](knowledge-store-concept-intro.md).
A skillset can contain built-in skills from Cognitive Search or embed external processing that you provide in a [*custom skill*](cognitive-search-create-custom-skill-example.md). Examples of a custom skill might be a custom entity module or document classifier targeting a specific domain such as finance, scientific publications, or medicine.
22
+
Skillsets are composed of built-in skills from Cognitive Search or [*custom skills*](cognitive-search-create-custom-skill-example.md) for external processing that you provide. Custom skills might sound complex but can be simple and straightforward in terms of implementation. If you have existing packages that provide pattern matching or document classification models, the content you extract during indexing could be passed to these models for processing.
23
23
24
24
Built-in skills fall into these categories:
25
25
26
-
+**Natural language processing**skills include [entity recognition](cognitive-search-skill-entity-recognition-v3.md), [language detection](cognitive-search-skill-language-detection.md), [key phrase extraction](cognitive-search-skill-keyphrases.md), text manipulation, [sentiment detection (including opinion mining)](cognitive-search-skill-sentiment-v3.md), and [PII detection](cognitive-search-skill-pii-detection.md). With these skills, unstructured text is mapped as searchable and filterable fields in an index.
26
+
+**Machine translation**is provided by the [text translation](cognitive-search-skill-text-translation.md)skill, often paired with [language detection](cognitive-search-skill-language-detection.md) for multi-language solutions.
27
27
28
-
+**Image processing** skills include [Optical Character Recognition (OCR)](cognitive-search-skill-ocr.md) and identification of [visual features](cognitive-search-skill-image-analysis.md), such as facial detection, image interpretation, image recognition (famous people and landmarks) or attributes like image orientation. These skills create text representations of image content, making it searchable using the query capabilities of Azure Cognitive Search.
28
+
+**Image processing** skills include [Optical Character Recognition (OCR)](cognitive-search-skill-ocr.md) and identification of [visual features](cognitive-search-skill-image-analysis.md), such as facial detection, image interpretation, image recognition (famous people and landmarks), or attributes like image orientation. These skills create text representations of image content for full text search in Azure Cognitive Search.
29
29
30
-
Built-in skills in Azure Cognitive Search are based on pre-trained machine learning models in Cognitive Services APIs: [Computer Vision](../cognitive-services/computer-vision/index.yml) and [Language Service](../cognitive-services/language-service/overview.md). You can attach a Cognitive Services resource if you want to leverage these resources during content processing.
30
+
+**Natural language processing** skills include [entity recognition](cognitive-search-skill-entity-recognition-v3.md), [language detection](cognitive-search-skill-language-detection.md), [key phrase extraction](cognitive-search-skill-keyphrases.md), text manipulation, [sentiment detection (including opinion mining)](cognitive-search-skill-sentiment-v3.md), and [personal identifiable information (PII) detection](cognitive-search-skill-pii-detection.md). With these skills, unstructured text is mapped as searchable and filterable fields in an index.
31
+
32
+
Built-in skills are based on pre-trained machine learning models in Cognitive Services APIs: [Computer Vision](../cognitive-services/computer-vision/index.yml) and [Language Service](../cognitive-services/language-service/overview.md). You should [attach a billable Cognitive Services resource](cognitive-search-attach-cognitive-services.md) if you want these resources for larger workloads.
31
33
32
34
Natural language and image processing is applied during the data ingestion phase, with results becoming part of a document's composition in a searchable index in Azure Cognitive Search. Data is sourced as an Azure data set and then pushed through an indexing pipeline using whichever [built-in skills](cognitive-search-predefined-skills.md) you need.
33
35
@@ -43,7 +45,7 @@ If your search service is located in one of these regions, you will not be able
43
45
44
46
## When to use AI enrichment
45
47
46
-
You should consider enrichment if your raw content is unstructured text, image content, or content that needs language detection and translation. Applying AI through the built-in cognitive skills can unlock this content for full text search and data science apps.
48
+
You should consider enrichment if your raw content is unstructured text, image content, or content that needs language detection and translation. Applying AI through the built-in cognitive skills can unlock this content for full text search and data science applications.
47
49
48
50
Additionally, you might consider adding a custom skill if you have open-source, third-party, or first-party code that you'd like to integrate into the pipeline. Classification models that identify salient characteristics of various document types fall into this category, but any package that adds value to your content could be used.
0 commit comments