articles/search/cognitive-search-concept-intro.md
27 additions & 22 deletions
@@ -8,20 +8,22 @@ author: HeidiSteen
8
8
ms.author: heidist
9
9
ms.service: cognitive-search
10
10
ms.topic: conceptual
11
-
ms.date: 06/27/2022
11
+
ms.date: 07/01/2022
12
12
ms.custom: references_regions
13
13
---
14
14
# AI enrichment in Azure Cognitive Search
15
15
16
-
*AI enrichment* is the application of machine learning models over raw content, where analysis and inference are used to create searchable content and structure where none previously existed. Because Azure Cognitive Search is a full text search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios:
16
+
*AI enrichment* is the application of machine learning models over content that isn't full text searchable in its raw form. Through enrichment, analysis and inference are used to create searchable content and structure where none previously existed.
17
17
18
-
+ Machine translation and language detection support multi-lingual search
19
-
+ Entity recognition finds people, places, and other entities in large chunks of text
18
+
Because Azure Cognitive Search is a full text search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios:
19
+
20
+
+ Machine translation and language detection, in support of multi-lingual search
21
+
+ Entity recognition extracts people, places, and other entities from large chunks of text
20
22
+ Key phrase extraction identifies and then outputs important terms
21
-
+ Optical Character Recognition (OCR) recognizes text in binary files
23
+
+ Optical Character Recognition (OCR) recognizes printed and handwritten text in binary files
22
24
+ Image analysis describes image content and outputs the descriptions as searchable text fields
23
25
24
-
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md). It has all of the base components (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
26
+
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md). An enrichment pipeline has all of the components of an indexer pipeline (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
25
27
26
28
The following diagram shows the progression of AI enrichment:
27
29
@@ -31,21 +33,24 @@ The following diagram shows the progression of AI enrichment:
31
33
32
34
**Enrich & Index** covers most of the AI enrichment pipeline:
33
35
34
-
+ Enrichment starts when the indexer ["cracks documents"](search-indexer-overview.md#document-cracking) and extracts images and text. The kind of processing that occurs next will depend on your data and which skills you've added to a skillset. If you have images, they can be forwarded to skills that perform image processing. Text content is queued for text and natural language processing. Internally, skills create an "enriched document" that collects the transformations as they occur.
36
+
+ Enrichment starts when the indexer ["cracks documents"](search-indexer-overview.md#document-cracking) and extracts images and text. The kind of processing that occurs next will depend on your data and which skills you've added to a skillset. If you have images, they can be forwarded to skills that perform image processing. Text content is queued for text and natural language processing. Internally, skills create an ["enriched document"](cognitive-search-working-with-skillsets.md#enrichment-tree) that collects the transformations as they occur.
37
+
38
+
+ Enriched content is generated during skillset execution, and is temporary unless you save it. You can enable an [enrichment cache](cognitive-search-incremental-indexing-conceptual.md) to persist cracked documents and skill outputs for subsequent reuse during future skillset executions.
35
39
36
-
Enriched content is generated during skillset execution, and is temporary unless you save it. In order for enriched content to appear in a search index, the indexer must have mapping information so that it can send enriched content to a field in a search index. Output field mappings set up these associations.
40
+
+ To get content into a search index, the indexer must have mapping information for sending enriched content to a target field. [Field mappings](search-indexer-field-mappings.md) (explicit or implicit) set the data path from source data to a search index. [Output field mappings](cognitive-search-output-field-mapping.md) set the data path from enriched documents to an index.
37
41
38
-
+ Indexing is the process wherein raw and enriched content is ingested into a [search index](search-what-is-an-index.md) (its files and folders).
42
+
+ Indexing is the process wherein raw and enriched content is ingested into the physical data structures of a [search index](search-what-is-an-index.md) (its files and folders). Lexical analysis and tokenization occur in this step.
39
43
40
-
**Exploration** is the last step. Output is always a [search index](search-what-is-an-index.md) that you can query from a client app. Output can optionally be a [knowledge store](knowledge-store-concept-intro.md) consisting of blobs and tables in Azure Storage that are accessed through data exploration tools or downstream processes. [Field mappings](search-indexer-field-mappings.md), [output field mappings](cognitive-search-output-field-mapping.md), and [projections](knowledge-store-projection-overview.md) determine the data paths that direct content out of the pipeline and into a search index or knowledge store. The same enriched content can appear in both, using implicit or explicit field mappings to send the content to the correct fields.
44
+
**Exploration** is the last step. Output is always a [search index](search-what-is-an-index.md) that you can query from a client app. Output can optionally be a [knowledge store](knowledge-store-concept-intro.md) consisting of blobs and tables in Azure Storage that are accessed through data exploration tools or downstream processes. If you're creating a knowledge store, [projections](knowledge-store-projection-overview.md) determine the data path for enriched content. The same enriched content can appear in both indexes and knowledge stores.
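The flow described in this hunk (cracking, skill execution, then field mappings and output field mappings into an index) can be illustrated with a toy model. This is a minimal sketch, not how the service is implemented: `key_phrase_skill`, `resolve`, and `run_pipeline` are hypothetical names, and the "skill" is a crude stand-in for a real key phrase model.

```python
from typing import Callable, Dict, List


def key_phrase_skill(enriched: Dict) -> None:
    """Hypothetical skill: adds a 'keyPhrases' node to the enriched document."""
    words = (w.strip(".,").lower() for w in enriched["document"]["content"].split())
    # Crude stand-in for key phrase extraction: keep the longer words.
    enriched["document"]["keyPhrases"] = sorted({w for w in words if len(w) > 6})


def resolve(enriched: Dict, path: str):
    """Follow a path such as '/document/keyPhrases' through the enriched document."""
    node = enriched
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


def run_pipeline(
    source_doc: Dict,
    skills: List[Callable[[Dict], None]],
    field_mappings: Dict[str, str],
    output_field_mappings: Dict[str, str],
) -> Dict:
    # "Cracking": the source content becomes the root of the enriched document.
    enriched = {"document": dict(source_doc)}
    # Each skill reads from and writes to the enriched document.
    for skill in skills:
        skill(enriched)
    index_doc = {}
    # Field mappings: source data -> index fields.
    for source_field, target_field in field_mappings.items():
        index_doc[target_field] = source_doc[source_field]
    # Output field mappings: enriched document paths -> index fields.
    for source_path, target_field in output_field_mappings.items():
        index_doc[target_field] = resolve(enriched, source_path)
    return index_doc


index_doc = run_pipeline(
    {"id": "1", "content": "Enrichment creates searchable structure."},
    [key_phrase_skill],
    {"id": "id", "content": "content"},
    {"/document/keyPhrases": "keyPhrases"},
)
```

The enriched document in this sketch is discarded once `run_pipeline` returns, which mirrors the point that enriched content is temporary unless it is mapped to an index, projected to a knowledge store, or cached.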
Enrichment is useful if raw content is unstructured text, image content, or content that needs language detection and translation. Applying AI through the [*built-in skills*](cognitive-search-predefined-skills.md) can unlock this content for full text search and data science applications.
50
+
Enrichment is useful if raw content is unstructured text, image content, or content that needs language detection and translation. Applying AI through the [**built-in skills**](cognitive-search-predefined-skills.md) can unlock this content for full text search and data science applications.
47
51
48
-
Enrichment also unlocks external processing that you provide. Open-source, third-party, or first-party code can be integrated into the pipeline as a custom skill. Classification models that identify salient characteristics of various document types fall into this category, but any external package that adds value to your content could be used.
52
+
You can also create [**custom skills**](cognitive-search-create-custom-skill-example.md) to provide external processing.
53
+
Open-source, third-party, or first-party code can be integrated into the pipeline as a custom skill. Classification models that identify salient characteristics of various document types fall into this category, but any external package that adds value to your content could be used.
49
54
50
55
### Use-cases for built-in skills
51
56
@@ -109,25 +114,25 @@ Billing follows a pay-as-you-go pricing model. The costs of using built-in skill
109
114
110
115
## Checklist: A typical workflow
111
116
112
-
An enrichment pipeline consists of [*indexers*](search-indexer-overview.md) that have [*skillsets*](cognitive-search-working-with-skillsets.md). A skillset defines the enrichment steps, and the indexer drives the skillset. When configuring an indexer, you can include properties like output field mappings that send enriched content to a [search index](search-what-is-an-index.md) or projections that define data structures in a [knowledge store](knowledge-store-concept-intro.md).
117
+
An enrichment pipeline consists of [*indexers*](search-indexer-overview.md) that have [*skillsets*](cognitive-search-working-with-skillsets.md). Post-indexing, you can query an index to validate your results.
113
118
114
-
Post-indexing, you can access content via search requests through all [query types supported by Azure Cognitive Search](search-query-overview.md).
115
-
116
-
1. Start with a subset of data. Indexer and skillset design is an iterative process, and the work goes faster with a small representative data set.
119
+
Start with a subset of data in a [supported data source](search-indexer-overview.md#supported-data-sources). Indexer and skillset design is an iterative process. The work goes faster with a small representative data set.
117
120
118
121
1. Create a [data source](/rest/api/searchservice/create-data-source) that specifies a connection to your data.
119
122
120
-
1. Create a [skillset](cognitive-search-defining-skillset.md) to add enrichment steps. If you're using a knowledge store, you'll specify it in this step. Unless you're doing a small proof-of-concept exercise, you'll want to [attach a multi-region Cognitive Services resource](cognitive-search-attach-cognitive-services.md) to the skillset.
123
+
1. [Create a skillset](cognitive-search-defining-skillset.md). Unless your project is small, you'll want to [attach a Cognitive Services resource](cognitive-search-attach-cognitive-services.md). If you're [creating a knowledge store](knowledge-store-create-rest.md), define it within the skillset.
124
+
125
+
1. [Create an index schema](search-how-to-create-search-index.md) that defines a search index.
121
126
122
-
1. Create an [index schema](search-how-to-create-search-index.md)that defines a search index.
127
+
1. [Create and run the indexer](search-howto-create-indexers.md) to bring all of the above components together. This step retrieves the data, runs the skillset, and loads the index.
123
128
124
-
1. Create and run the [indexer](search-howto-create-indexers.md) to bring all of the above components together. This step retrieves the data, runs the skillset, and loads the index. An indexer is also where you specify field mappings and output field mappings that set up the data path to a search index.
129
+
An indexer is also where you specify field mappings and output field mappings that set up the data path to a search index.
125
130
126
-
If possible, [enable enrichment caching](cognitive-search-incremental-indexing-conceptual.md) in the indexer configuration. This step allows you to reuse existing enrichments later on.
131
+
Optionally, [enable enrichment caching](cognitive-search-incremental-indexing-conceptual.md) in the indexer configuration. This step allows you to reuse existing enrichments later on.
127
132
128
-
1. Run [queries](search-query-create.md) to evaluate results and modify code to update skillsets, schema, or indexer configuration.
133
+
1. [Run queries](search-query-create.md) to evaluate results or [start a debug session](cognitive-search-how-to-debug-skillset.md) to work through any skillset issues.
129
134
130
-
1.To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) before you run it. Or, delete and recreate the objects on each run (recommended if you’re using the free tier). If you enabled caching the indexer will pull from the cache if data is unchanged at the source, and if your edits to the pipeline don't invalidate the cache.
135
+
To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) before you run it. Or, delete and recreate the objects on each run (recommended if you’re using the free tier). If you enabled caching, the indexer will pull from the cache if data is unchanged at the source and if your edits to the pipeline don't invalidate the cache.
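The four objects in the checklist (data source, skillset, index, indexer) might look roughly like the following when expressed as request bodies. This is a sketch under assumptions: all names (`demo-datasource`, `demo-skillset`, and so on) are placeholders, the property names reflect our understanding of the REST API and should be checked against the API reference, and `unmapped_targets` is a hypothetical helper that is not part of any SDK.

```python
from typing import Dict, List

# Placeholder definitions; nothing here is sent to a service.
data_source = {
    "name": "demo-datasource",
    "type": "azureblob",
    "credentials": {"connectionString": "<your-connection-string>"},
    "container": {"name": "demo-container"},
}

skillset = {
    "name": "demo-skillset",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Text.KeyPhraseExtractionSkill",
            "context": "/document",
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [{"name": "keyPhrases", "targetName": "keyPhrases"}],
        }
    ],
}

index = {
    "name": "demo-index",
    "fields": [
        {"name": "id", "type": "Edm.String", "key": True},
        {"name": "content", "type": "Edm.String", "searchable": True},
        {"name": "keyPhrases", "type": "Collection(Edm.String)", "searchable": True},
    ],
}

# The indexer ties the other three objects together by name.
indexer = {
    "name": "demo-indexer",
    "dataSourceName": data_source["name"],
    "skillsetName": skillset["name"],
    "targetIndexName": index["name"],
    "outputFieldMappings": [
        {"sourceFieldName": "/document/keyPhrases", "targetFieldName": "keyPhrases"}
    ],
}


def unmapped_targets(indexer: Dict, index: Dict) -> List[str]:
    """Return output field mapping targets that are missing from the index schema."""
    index_fields = {field["name"] for field in index["fields"]}
    return [
        mapping["targetFieldName"]
        for mapping in indexer.get("outputFieldMappings", [])
        if mapping["targetFieldName"] not in index_fields
    ]
```

A local check like `unmapped_targets(indexer, index)` can catch a mapping that points at a field the index doesn't define before you run the indexer, which fits the iterative design loop the checklist describes.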
articles/search/search-what-is-azure-search.md
4 additions & 4 deletions
@@ -8,18 +8,18 @@ author: HeidiSteen
8
8
ms.author: heidist
9
9
ms.service: cognitive-search
10
10
ms.topic: overview
11
-
ms.date: 05/31/2022
11
+
ms.date: 07/01/2022
12
12
ms.custom: contperf-fy21q1
13
13
---
14
14
# What is Azure Cognitive Search?
15
15
16
16
Azure Cognitive Search ([formerly known as "Azure Search"](whats-new.md#new-service-name)) is a cloud search service that gives developers infrastructure, APIs, and tools for building a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications.
17
17
18
-
Search is foundational to any app that surfaces text content to users, with common scenarios including catalog or document search, online retail, or data exploration.
18
+
Search is foundational to any app that surfaces text content to users, with common scenarios including catalog or document search, online retail, or data exploration over proprietary content.
19
19
20
20
When you create a search service, you'll work with the following capabilities:
21
21
22
-
+ A search engine for full text search with storage for user-owned content in a search index
22
+
+ A search engine for full text search over a search index containing your user-owned content
23
23
+ Rich indexing, with [text analysis](search-analyzers.md) and [optional AI enrichment](cognitive-search-concept-intro.md) for advanced content extraction and transformation
24
24
+ Rich query syntax that supplements free text search with filters, autocomplete, regex, geo-search and more
25
25
+ Programmability through REST APIs and client libraries in Azure SDKs for .NET, Python, Java, and JavaScript
@@ -43,7 +43,7 @@ On the search service itself, the two primary workloads are *indexing* and *quer
43
43
44
44
+ [**Querying**](search-query-overview.md) can happen once an index is populated with searchable text, when your client app sends query requests to a search service and handles responses. All query execution is over a search index that you create, own, and store in your service. In your client app, the search experience is defined using APIs from Azure Cognitive Search, and can include relevance tuning, autocomplete, synonym matching, fuzzy matching, pattern matching, filter, and sort.
45
45
46
-
Functionality is exposed through a simple [REST API](/rest/api/searchservice/), or Azure SDKs like the [Azure SDK for .NET](search-howto-dotnet-sdk.md), that masks the inherent complexity of information retrieval. You can also use the Azure portal for service administration and content management, with tools for prototyping and querying your indexes and skillsets. Because the service runs in the cloud, infrastructure and availability are managed by Microsoft.
46
+
Functionality is exposed through simple [REST APIs](/rest/api/searchservice/), or Azure SDKs like the [Azure SDK for .NET](search-howto-dotnet-sdk.md), that mask the inherent complexity of information retrieval. You can also use the Azure portal for service administration and content management, with tools for prototyping and querying your indexes and skillsets. Because the service runs in the cloud, infrastructure and availability are managed by Microsoft.
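As a rough illustration of what a query request over REST involves, the following sketch builds the URL for a simple full text query. The service name, index name, and the `build_query_url` helper are hypothetical, and the `2020-06-30` API version is an assumption; check the REST API reference for the version you target. A real request would also need an authentication header (to our understanding, an `api-key` header), which is omitted here.

```python
from urllib.parse import urlencode


def build_query_url(service: str, index: str, search_text: str,
                    api_version: str = "2020-06-30") -> str:
    """Build the URL for a simple GET query against a search index."""
    # urlencode escapes the search text so it is safe in a query string.
    params = urlencode({"api-version": api_version, "search": search_text})
    return f"https://{service}.search.windows.net/indexes/{index}/docs?{params}"


url = build_query_url("demo-service", "demo-index", "enrichment pipeline")
```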