Commit 1f35ad4

committed
more edits
1 parent c09aebf commit 1f35ad4

1 file changed

+13
-10
lines changed

articles/search/search-blob-ai-integration.md

Lines changed: 13 additions & 10 deletions
@@ -24,26 +24,25 @@ While you might need just one of these AI capabilities, it’s common to combine
2424

2525
AI enrichment creates new information, captured as text, stored in fields. Post-enrichment, you can access this information from a search index through full text search, or send enriched documents back to Azure storage to power new application experiences that include exploring data for discovery or analytics scenarios.
2626

27-
In this article, we view AI enrichment through a wide angle perspective so that you can quickly grasp the entire process, from transforming raw data in blobs, to queryable information in either a search index or a knowledge store.
27+
In this article, we view AI enrichment through a wide lens so that you can quickly grasp the entire process, from transforming raw data in blobs, to queryable information in either a search index or a knowledge store.
2828

29-
## Enriching blob data
29+
## What it means to "enrich" blob data
3030

3131
*AI enrichment* is part of the indexing architecture of Azure Search that integrates built-in AI from Microsoft or custom AI that you provide. It helps you implement end-to-end scenarios where you need to process blobs (both existing ones and new ones as they come in or are updated), crack open all file formats to extract images and text, extract the desired information using various AI capabilities, and index them in an Azure Search index for fast search, retrieval and exploration.
3232

3333
Inputs are your blobs, in a single container, in Azure Blob storage. Blobs can be almost any kind of text or image data.
3434

3535
Output is always an Azure Search index, used for fast text search, retrieval, and exploration in client applications. Additionally, output can also be a *knowledge store* that projects enriched documents into Azure blobs or Azure tables for downstream analysis in tools like Power BI or in data science workloads.
3636

37-
In between is the pipeline architecture itself. The pipeline is based on the *indexer* feature, to which you can assign a *skillset*, which is composed of one or more *skills* providing the AI. The purpose of the pipeline is to produce *enriched documents* that enter as blobs but pick up additional structure and information while moving through the pipeline. Enriched documents are consumed during indexing to create inverted indexes and other structures.
37+
In between is the pipeline architecture itself. The pipeline is based on the *indexer* feature, to which you can assign a *skillset*, which is composed of one or more *skills* providing the AI. The purpose of the pipeline is to produce *enriched documents* that enter as raw content but pick up additional structure, context, and information while moving through the pipeline. Enriched documents are consumed during indexing to create inverted indexes and other structures used in full text search or exploration and analytics.
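
The idea of a document picking up structure as it moves through the pipeline can be illustrated with a toy model. This is only a conceptual sketch, not how Azure Search is implemented: the two "skills" are hypothetical stand-ins written as plain Python functions, and the field names `language` and `word_count` are invented for illustration.

```python
from typing import Callable, Dict, List

Document = Dict[str, object]
Skill = Callable[[Document], Document]


def detect_language(doc: Document) -> Document:
    # Hypothetical stand-in for a language detection skill.
    return {**doc, "language": "en"}


def count_words(doc: Document) -> Document:
    # Hypothetical stand-in for a text analysis skill.
    return {**doc, "word_count": len(str(doc["content"]).split())}


def run_skillset(doc: Document, skills: List[Skill]) -> Document:
    """Apply each skill in order; each one adds fields to the enriched document."""
    enriched = dict(doc)  # copy so the raw input is left untouched
    for skill in skills:
        enriched = skill(enriched)
    return enriched


raw = {"content": "Contoso opened a new office in Seattle."}
enriched = run_skillset(raw, [detect_language, count_words])
print(enriched)
```

The raw document enters with only `content`; the enriched document leaves with the extra fields that an index (or a knowledge store) could then consume.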
3838

3939
## How to get started
4040

4141
You can start directly in your storage account portal page. Click **Add Azure Search** and create a new Azure Search service or select an existing one. If you already have an existing search service in the same subscription, clicking **Add Azure Search** opens the Import data wizard so that you can immediately step through indexing, enrichment, and index definition.
4242

43-
Once you add Azure Search to your storage account, you can follow the standard process to enrich data in any Azure data source, summarized as described step-by-step in [Create an AI enrichment pipeline using REST APIs](cognitive-search-tutorial-blob.md).
44-
45-
In the following sections, we'll explore components and concepts, enrichment design, and key decisions you will make along the way.
43+
Once you add Azure Search to your storage account, you can follow the standard process to enrich data in any Azure data source. Assuming you already have blob content, you can use the Import data wizard in Azure Search for an easy initial introduction to AI enrichment. This quickstart explains the steps: [Create an AI enrichment pipeline in the portal](cognitive-search-quickstart-blob.md).
4644

45+
In the following sections, we'll explore more components and concepts.
4746

4847
## Inputs to blob indexing
4948

@@ -59,13 +58,17 @@ The Blob indexer comes with configuration parameters. You can learn more about t
5958

6059
## Adding AI
6160

62-
*Skills* are the individual components of AI processing that you can use standalone or in combination with other skills for sequential processing. Built-in skills are backed by Cognitive Services, with image analysis based on Computer Vision, and natural language processing based on Text Analytics. Custom skills are custom code, wrapped in an interface definition that allows for integration into the pipeline. In customer solutions, it's common practice to use both, with custom skills providing open-source, third-party, or first-party AI modules.
61+
*Skills* are the individual components of AI processing that you can use standalone or in combination with other skills for sequential processing.
62+
63+
+ Built-in skills are backed by Cognitive Services, with image analysis based on Computer Vision, and natural language processing based on Text Analytics. A few examples are [OCR](cognitive-search-skill-ocr.md), [Entity Recognition](cognitive-search-skill-entity-recognition.md), and [Image Analysis](cognitive-search-skill-image-analysis.md). You can review the full list of built-in skills in [Predefined skills for content enrichment](cognitive-search-predefined-skills.md).
64+
65+
+ Custom skills are custom code, wrapped in an interface definition that allows for integration into the pipeline. In customer solutions, it's common practice to use both, with custom skills providing open-source, third-party, or first-party AI modules.
6366

64-
A *skillset* is the collection of skills, and it's invoked after the document cracking phase makes content available to the pipeline. An indexer can consume exactly one skillset, but that skillset exists independently of an indexer so that you can reuse it in other scenarios.
67+
A *skillset* is the collection of skills used in a pipeline, and it's invoked after the document cracking phase makes content available. An indexer can consume exactly one skillset, but that skillset exists independently of an indexer so that you can reuse it in other scenarios.
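
As a rough sketch of the skillset-to-indexer relationship, the definitions below are built as Python dictionaries and printed as JSON. The property names follow the Azure Search REST API as we understand it and should be checked against the current reference. Every resource name (`blob-skillset`, `blob-indexer`, `blob-datasource`, `blob-index`) and the target field names are hypothetical.

```python
import json

# All resource and field names below are hypothetical placeholders.
skillset = {
    "name": "blob-skillset",
    "description": "OCR embedded images, then recognize organizations in text",
    "skills": [
        {
            "@odata.type": "#Microsoft.Skills.Vision.OcrSkill",
            "context": "/document/normalized_images/*",
            "inputs": [
                {"name": "image", "source": "/document/normalized_images/*"}
            ],
            "outputs": [{"name": "text", "targetName": "ocrText"}],
        },
        {
            "@odata.type": "#Microsoft.Skills.Text.EntityRecognitionSkill",
            "context": "/document",
            "categories": ["Organization"],
            "inputs": [{"name": "text", "source": "/document/content"}],
            "outputs": [
                {"name": "organizations", "targetName": "organizations"}
            ],
        },
    ],
}

# The indexer points at exactly one skillset by name. The skillset itself
# carries no reference back, so other indexers could reuse it.
indexer = {
    "name": "blob-indexer",
    "dataSourceName": "blob-datasource",
    "targetIndexName": "blob-index",
    "skillsetName": skillset["name"],
    "parameters": {
        "configuration": {
            "dataToExtract": "contentAndMetadata",
            "imageAction": "generateNormalizedImages",
        }
    },
    "outputFieldMappings": [
        {
            "sourceFieldName": "/document/organizations",
            "targetFieldName": "organizations",
        }
    ],
}

print(json.dumps({"skillset": skillset, "indexer": indexer}, indent=2))
```

Keeping the skillset as a separate definition is what makes reuse possible: a second indexer over a different container would only need its own `skillsetName` reference.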
6568

66-
Custom skills are more straight forward than they sound. If you have existing packages that provide pattern matching or classification models, the content you extract from blobs could be passed to these models for processing. Since AI enrichment is Azure-based, your model should be on Azure also. Some common hosting methodologies include Azure Functions or Containers.
69+
Custom skills might sound complex but can be simple and straightforward in terms of implementation. If you have existing packages that provide pattern matching or classification models, the content you extract from blobs could be passed to these models for processing. Since AI enrichment is Azure-based, your model should be on Azure also. Some common hosting methodologies include using [Azure Functions](cognitive-search-create-custom-skill-example.md) or [Containers](https://github.com/Microsoft/SkillsExtractorCognitiveSearch).
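
To give a sense of how small a custom skill can be, here is a minimal sketch of the request and response handling. It assumes the custom skill contract of a `values` array whose records carry a `recordId` and a `data` payload. The `classify` function is a hypothetical stand-in for your own model, and the input field `text` and output field `category` are invented names. Hosting (for example, wrapping this in an Azure Function HTTP trigger) is left out.

```python
import json
from typing import Any, Dict, List


def classify(text: str) -> str:
    # Hypothetical stand-in for an existing classification model.
    return "invoice" if "invoice" in text.lower() else "other"


def handle_request(body: Dict[str, Any]) -> Dict[str, Any]:
    """Process each input record and return one output record per recordId."""
    results: List[Dict[str, Any]] = []
    for record in body.get("values", []):
        record_id = record.get("recordId")
        try:
            text = record["data"]["text"]
            results.append({
                "recordId": record_id,
                "data": {"category": classify(text)},
                "errors": [],
                "warnings": [],
            })
        except (KeyError, TypeError, AttributeError) as exc:
            # Report the problem on this record only; other records still succeed.
            results.append({
                "recordId": record_id,
                "data": {},
                "errors": [{"message": f"Bad input: {exc!r}"}],
                "warnings": [],
            })
    return {"values": results}


request_body = {
    "values": [
        {"recordId": "1", "data": {"text": "Invoice #42 from Contoso"}},
        {"recordId": "2", "data": {}},
    ]
}
response = handle_request(request_body)
print(json.dumps(response, indent=2))
```

Each output record echoes the `recordId` of its input so that results can be matched back to the right document, and a failure on one record is reported in that record's `errors` rather than failing the whole batch.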
6770

68-
Built-in skills backed by Cognitive Services require an attached Cognitive Services all-in-one subscription key that gives you access to the resource. An all-in-one key gives you image analysis, language detection, text translation, and text analytics. Other built-in skills are features of Azure Search and require no additional service or key. Text shaper, splitter, and merger are examples of helper skills that are sometimes necessary when designing the pipeline.
71+
Built-in skills backed by Cognitive Services require an [attached Cognitive Services](cognitive-search-attach-cognitive-services.md) all-in-one subscription key that gives you access to the resource. An all-in-one key gives you image analysis, language detection, text translation, and text analytics. Other built-in skills are features of Azure Search and require no additional service or key. Text shaper, splitter, and merger are examples of helper skills that are sometimes necessary when designing the pipeline.
6972

7073
If you use only custom skills and built-in utility skills, there is no dependency or costs related to Cognitive Services.
7174
