Skip to content

Commit 7c4bcb4

Browse files
authored
Merge pull request #203646 from HeidiSteen/heidist-fresh
[azure search] Refresh main overview articles (2)
2 parents c7762df + 86df379 commit 7c4bcb4

File tree

2 files changed

+31
-26
lines changed

2 files changed

+31
-26
lines changed

articles/search/cognitive-search-concept-intro.md

Lines changed: 27 additions & 22 deletions
Original file line numberDiff line numberDiff line change
@@ -8,20 +8,22 @@ author: HeidiSteen
88
ms.author: heidist
99
ms.service: cognitive-search
1010
ms.topic: conceptual
11-
ms.date: 06/27/2022
11+
ms.date: 07/01/2022
1212
ms.custom: references_regions
1313
---
1414
# AI enrichment in Azure Cognitive Search
1515

16-
*AI enrichment* is the application of machine learning models over raw content, where analysis and inference are used to create searchable content and structure where none previously existed. Because Azure Cognitive Search is a full text search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios:
16+
*AI enrichment* is the application of machine learning models over content that isn't full text searchable in its raw form. Through enrichment, analysis and inference are used to create searchable content and structure where none previously existed.
1717

18-
+ Machine translation and language detection support multi-lingual search
19-
+ Entity recognition finds people, places, and other entities in large chunks of text
18+
Because Azure Cognitive Search is a full text search solution, the purpose of AI enrichment is to improve the utility of your content in search-related scenarios:
19+
20+
+ Machine translation and language detection, in support of multi-lingual search
21+
+ Entity recognition extracts people, places, and other entities from large chunks of text
2022
+ Key phrase extraction identifies and then outputs important terms
21-
+ Optical Character Recognition (OCR) recognizes text in binary files
23+
+ Optical Character Recognition (OCR) recognizes printed and handwritten text in binary files
2224
+ Image analysis describes image content and outputs the descriptions as searchable text fields
2325

24-
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md). It has all of the base components (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
26+
AI enrichment is an extension of an [**indexer pipeline**](search-indexer-overview.md). An enrichment pipeline has all of the components of an indexer pipeline (indexer, data source, index), plus a [**skillset**](cognitive-search-working-with-skillsets.md) that specifies atomic enrichment steps.
2527

2628
The following diagram shows the progression of AI enrichment:
2729

@@ -31,21 +33,24 @@ The following diagram shows the progression of AI enrichment:
3133

3234
**Enrich & Index** covers most of the AI enrichment pipeline:
3335

34-
+ Enrichment starts when the indexer ["cracks documents"](search-indexer-overview.md#document-cracking) and extracts images and text. The kind of processing that occurs next will depend on your data and which skills you've added to a skillset. If you have images, they can be forwarded to skills that perform image processing. Text content is queued for text and natural language processing. Internally, skills create an "enriched document" that collects the transformations as they occur.
36+
+ Enrichment starts when the indexer ["cracks documents"](search-indexer-overview.md#document-cracking) and extracts images and text. The kind of processing that occurs next will depend on your data and which skills you've added to a skillset. If you have images, they can be forwarded to skills that perform image processing. Text content is queued for text and natural language processing. Internally, skills create an ["enriched document"](cognitive-search-working-with-skillsets.md#enrichment-tree) that collects the transformations as they occur.
37+
38+
+ Enriched content is generated during skillset execution, and is temporary unless you save it. You can enable an [enrichment cache](cognitive-search-incremental-indexing-conceptual.md) to persist cracked documents and skill outputs for subsequent reuse during future skillset executions.
3539

36-
Enriched content is generated during skillset execution, and is temporary unless you save it. In order for enriched content to appear in a search index, the indexer must have mapping information so that it can send enriched content to a field in a search index. Output field mappings set up these associations.
40+
+ To get content into a search index, the indexer must have mapping information for sending enriched content to target field. [Field mappings](search-indexer-field-mappings.md) (explicit or implicit) set the data path from source data to a search index. [Output field mappings](cognitive-search-output-field-mapping.md) set the data path from enriched documents to an index.
3741

38-
+ Indexing is the process wherein raw and enriched content is ingested into a [search index](search-what-is-an-index.md) (its files and folders).
42+
+ Indexing is the process wherein raw and enriched content is ingested into the physical data structures of a [search index](search-what-is-an-index.md) (its files and folders). Lexical analysis and tokenization occur in this step.
3943

40-
**Exploration** is the last step. Output is always a [search index](search-what-is-an-index.md) that you can query from a client app. Output can optionally be a [knowledge store](knowledge-store-concept-intro.md) consisting of blobs and tables in Azure Storage that are accessed through data exploration tools or downstream processes. [Field mappings](search-indexer-field-mappings.md), [output field mappings](cognitive-search-output-field-mapping.md), and [projections](knowledge-store-projection-overview.md) determine the data paths that direct content out of the pipeline and into a search index or knowledge store. The same enriched content can appear in both, using implicit or explicit field mappings to send the content to the correct fields.
44+
**Exploration** is the last step. Output is always a [search index](search-what-is-an-index.md) that you can query from a client app. Output can optionally be a [knowledge store](knowledge-store-concept-intro.md) consisting of blobs and tables in Azure Storage that are accessed through data exploration tools or downstream processes. If you're creating a knowledge store, [projections](knowledge-store-projection-overview.md) determine the data path for enriched content. The same enriched content can appear in both indexes and knowledge stores.
4145

4246
<!-- ![Enrichment pipeline diagram](./media/cognitive-search-intro/cogsearch-architecture.png "enrichment pipeline") -->
4347

4448
## When to use AI enrichment
4549

46-
Enrichment is useful if raw content is unstructured text, image content, or content that needs language detection and translation. Applying AI through the [*built-in skills*](cognitive-search-predefined-skills.md) can unlock this content for full text search and data science applications.
50+
Enrichment is useful if raw content is unstructured text, image content, or content that needs language detection and translation. Applying AI through the [**built-in skills**](cognitive-search-predefined-skills.md) can unlock this content for full text search and data science applications.
4751

48-
Enrichment also unlocks external processing that you provide. Open-source, third-party, or first-party code can be integrated into the pipeline as a custom skill. Classification models that identify salient characteristics of various document types fall into this category, but any external package that adds value to your content could be used.
52+
You can also create [**custom skills**](cognitive-search-create-custom-skill-example.md) to provide external processing.
53+
Open-source, third-party, or first-party code can be integrated into the pipeline as a custom skill. Classification models that identify salient characteristics of various document types fall into this category, but any external package that adds value to your content could be used.
4954

5055
### Use-cases for built-in skills
5156

@@ -109,25 +114,25 @@ Billing follows a pay-as-you-go pricing model. The costs of using built-in skill
109114

110115
## Checklist: A typical workflow
111116

112-
An enrichment pipeline consists of [*indexers*](search-indexer-overview.md) that have [*skillsets*](cognitive-search-working-with-skillsets.md). A skillset defines the enrichment steps, and the indexer drives the skillset. When configuring an indexer, you can include properties like output field mappings that send enriched content to a [search index](search-what-is-an-index.md) or projections that define data structures in a [knowledge store](knowledge-store-concept-intro.md).
117+
An enrichment pipeline consists of [*indexers*](search-indexer-overview.md) that have [*skillsets*](cognitive-search-working-with-skillsets.md). Post-indexing, you can query an index to validate your results.
113118

114-
Post-indexing, you can access content via search requests through all [query types supported by Azure Cognitive Search](search-query-overview.md).
115-
116-
1. Start with a subset of data. Indexer and skillset design is an iterative process, and the work goes faster with a small representative data set.
119+
Start with a subset of data in a [supported data source](search-indexer-overview.md#supported-data-sources). Indexer and skillset design is an iterative process. The work goes faster with a small representative data set.
117120

118121
1. Create a [data source](/rest/api/searchservice/create-data-source) that specifies a connection to your data.
119122

120-
1. Create a [skillset](cognitive-search-defining-skillset.md) to add enrichment steps. If you're using a knowledge store, you'll specify it in this step. Unless you're doing a small proof-of-concept exercise, you'll want to [attach a multi-region Cognitive Services resource](cognitive-search-attach-cognitive-services.md) to the skillset.
123+
1. [Create a skillset](cognitive-search-defining-skillset.md). Unless your project is small, you'll want to [attach a Cognitive Services resource](cognitive-search-attach-cognitive-services.md). If you're [creating a knowledge store](knowledge-store-create-rest.md), define it within the skillset.
124+
125+
1. [Create an index schema](search-how-to-create-search-index.md) that defines a search index.
121126

122-
1. Create an [index schema](search-how-to-create-search-index.md) that defines a search index.
127+
1. [Create and run the indexer](search-howto-create-indexers.md) to bring all of the above components together. This step retrieves the data, runs the skillset, and loads the index.
123128

124-
1. Create and run the [indexer](search-howto-create-indexers.md) to bring all of the above components together. This step retrieves the data, runs the skillset, and loads the index. An indexer is also where you specify field mappings and output field mappings that set up the data path to a search index.
129+
An indexer is also where you specify field mappings and output field mappings that set up the data path to a search index.
125130

126-
If possible, [enable enrichment caching](cognitive-search-incremental-indexing-conceptual.md) in the indexer configuration. This step allows you to reuse existing enrichments later on.
131+
Optionally, [enable enrichment caching](cognitive-search-incremental-indexing-conceptual.md) in the indexer configuration. This step allows you to reuse existing enrichments later on.
127132

128-
1. Run [queries](search-query-create.md) to evaluate results and modify code to update skillsets, schema, or indexer configuration.
133+
1. [Run queries](search-query-create.md) to evaluate results or [start a debug session](cognitive-search-how-to-debug-skillset.md) to work through any skillset issues.
129134

130-
1. To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) before you run it. Or, delete and recreate the objects on each run (recommended if you’re using the free tier). If you enabled caching the indexer will pull from the cache if data is unchanged at the source, and if your edits to the pipeline don't invalidate the cache.
135+
To repeat any of the above steps, [reset the indexer](search-howto-reindex.md) before you run it. Or, delete and recreate the objects on each run (recommended if you’re using the free tier). If you enabled caching the indexer will pull from the cache if data is unchanged at the source, and if your edits to the pipeline don't invalidate the cache.
131136

132137
## Next steps
133138

articles/search/search-what-is-azure-search.md

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -8,18 +8,18 @@ author: HeidiSteen
88
ms.author: heidist
99
ms.service: cognitive-search
1010
ms.topic: overview
11-
ms.date: 05/31/2022
11+
ms.date: 07/01/2022
1212
ms.custom: contperf-fy21q1
1313
---
1414
# What is Azure Cognitive Search?
1515

1616
Azure Cognitive Search ([formerly known as "Azure Search"](whats-new.md#new-service-name)) is a cloud search service that gives developers infrastructure, APIs, and tools for building a rich search experience over private, heterogeneous content in web, mobile, and enterprise applications.
1717

18-
Search is foundational to any app that surfaces text content to users, with common scenarios including catalog or document search, online retail, or data exploration.
18+
Search is foundational to any app that surfaces text content to users, with common scenarios including catalog or document search, online retail, or data exploration over proprietary content.
1919

2020
When you create a search service, you'll work with the following capabilities:
2121

22-
+ A search engine for full text search with storage for user-owned content in a search index
22+
+ A search engine for full text search over a search index containing your user-owned content
2323
+ Rich indexing, with [text analysis](search-analyzers.md) and [optional AI enrichment](cognitive-search-concept-intro.md) for advanced content extraction and transformation
2424
+ Rich query syntax that supplements free text search with filters, autocomplete, regex, geo-search and more
2525
+ Programmability through REST APIs and client libraries in Azure SDKs for .NET, Python, Java, and JavaScript
@@ -43,7 +43,7 @@ On the search service itself, the two primary workloads are *indexing* and *quer
4343

4444
+ [**Querying**](search-query-overview.md) can happen once an index is populated with searchable text, when your client app sends query requests to a search service and handles responses. All query execution is over a search index that you create, own, and store in your service. In your client app, the search experience is defined using APIs from Azure Cognitive Search, and can include relevance tuning, autocomplete, synonym matching, fuzzy matching, pattern matching, filter, and sort.
4545

46-
Functionality is exposed through a simple [REST API](/rest/api/searchservice/), or Azure SDKs like the [Azure SDK for .NET](search-howto-dotnet-sdk.md), that masks the inherent complexity of information retrieval. You can also use the Azure portal for service administration and content management, with tools for prototyping and querying your indexes and skillsets. Because the service runs in the cloud, infrastructure and availability are managed by Microsoft.
46+
Functionality is exposed through simple [REST APIs](/rest/api/searchservice/), or Azure SDKs like the [Azure SDK for .NET](search-howto-dotnet-sdk.md), that mask the inherent complexity of information retrieval. You can also use the Azure portal for service administration and content management, with tools for prototyping and querying your indexes and skillsets. Because the service runs in the cloud, infrastructure and availability are managed by Microsoft.
4747

4848
## Why use Cognitive Search?
4949

0 commit comments

Comments
 (0)