Commit 7614de9

update tutorial sections
1 parent 7cc0eeb commit 7614de9

1 file changed

+7
-20
lines changed

articles/ai-services/content-understanding/tutorial/RAG-tutorial.md

Lines changed: 7 additions & 20 deletions
@@ -48,26 +48,6 @@ To get started, you need **An active Azure subscription**. If you don't have an
 ## Extracting Data with Content Understanding: Key Concepts
 Building a robust multimodal RAG solution begins with extracting and structuring data from diverse content types. Azure AI Content Understanding provides three key components to facilitate this process: **content extraction**, **field extraction**, and **analyzers**. Together, these components form the foundation for creating a unified, reusable, and enhanced data pipeline for RAG workflows.
 
-### 1. Analyzers: Reusable Components for Data Analysis
-
-Analyzers are reusable components in Content Understanding that streamline the data extraction process. Once an analyzer is created, it can be used repeatedly to process files and extract content or fields based on predefined schemas. An analyzer acts as a blueprint for how data should be processed, ensuring consistency and efficiency across multiple files and content types.
-
-#### Key Benefits of Analyzers:
-- **Reusability:** Define once, use across multiple datasets.
-- **Customizability:** Tailor analyzers with field schemas to meet specific business needs.
-- **Scalability:** Process large volumes of multimodal data efficiently.
-
-### 2. Content Extraction: The Foundation for Data Processing
-
-Content extraction is the first step in the RAG implementation process. It transforms raw multimodal data—such as documents, images, audio, and video—into structured, searchable formats. This foundational step ensures that the content is organized and ready for indexing and retrieval. Content extraction provides the baseline for indexing and retrieval but may not fully address domain-specific needs or provide deeper contextual insights.
-[Learn more]() about content extraction capabilities for each modality.
-
-### 3. Field Extraction: Enhancing Content with AI-Generated Metadata
-
-Field extraction builds on content extraction by using AI to generate additional metadata that enriches the knowledge base. This step allows you to define custom fields tailored to your specific use case, enabling more precise retrieval and enhanced search relevance. Field extraction complements content extraction by adding depth and context, making the data more actionable for RAG scenarios.
-[Learn more]() about field extraction capabilities for each modality.
-
-
 ## Implementation Steps
 
 To implement data extraction in Content Understanding, follow these steps:
@@ -79,6 +59,7 @@ To implement data extraction in Content Understanding, follow these steps:
 ## Code Samples
 
 ## Creating an Analyzer
+Analyzers are reusable components in Content Understanding that streamline the data extraction process. Once an analyzer is created, it can be used repeatedly to process files and extract content or fields based on predefined schemas. An analyzer acts as a blueprint for how data should be processed, ensuring consistency and efficiency across multiple files and content types.
 
 The following code samples demonstrate how to create analyzers for each modality, specifying the structured data to be extracted, such as key fields, summaries, or classifications. These analyzers will serve as the foundation for extracting and enriching content in your RAG solution.
 Starting off with the schema details for each modality:
@@ -251,6 +232,12 @@ curl -i -X GET "{endpoint}/contentunderstanding/analyzers/{analyzerId}/operation
 ---
 
 ## Perform Content and Field Analysis
+**Content extraction** is the first step in the RAG implementation process. It transforms raw multimodal data—such as documents, images, audio, and video—into structured, searchable formats. This foundational step ensures that the content is organized and ready for indexing and retrieval. Content extraction provides the baseline for indexing and retrieval but may not fully address domain-specific needs or provide deeper contextual insights.
+[Learn more]() about content extraction capabilities for each modality.
+
+**Field extraction** builds on content extraction by using AI to generate additional metadata that enriches the knowledge base. This step allows you to define custom fields tailored to your specific use case, enabling more precise retrieval and enhanced search relevance. Field extraction complements content extraction by adding depth and context, making the data more actionable for RAG scenarios.
+[Learn more]() about field extraction capabilities for each modality.
+
 With the analyzers created for each modality, we can now process files to extract structured content and AI-generated metadata based on the defined schemas. This section demonstrates how to use the analyzers to analyze multimodal data and provides a sample of the results returned by the APIs. These results showcase the transformation of raw data into actionable insights, forming the foundation for indexing, retrieval, and RAG workflows.
 
 ---
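
The "Perform Content and Field Analysis" text in the last hunk describes sending files to an existing analyzer and reading back extracted content plus AI-generated fields. The following is a minimal Python sketch of that flow, not the tutorial's own sample. The base path follows the `{endpoint}/contentunderstanding/analyzers/{analyzerId}` pattern visible in the hunk header. Everything else is an assumption for illustration: the `:analyze` suffix, the `api-version` value, the `Ocp-Apim-Subscription-Key` header, the `{"url": ...}` request body, and the `result.contents[].markdown` / `fields` response shape. The helper names `build_analyze_request` and `split_content_and_fields` are hypothetical. The request is only built, never sent.

```python
import json
import urllib.request

# Assumption: a preview API version string; check the service docs for the current value.
API_VERSION = "2024-12-01-preview"


def build_analyze_request(endpoint, analyzer_id, file_url, key):
    """Build (but do not send) a POST request asking an analyzer to process a file."""
    # Base path mirrors the tutorial's curl sample; the ":analyze" suffix is assumed.
    url = (
        f"{endpoint.rstrip('/')}/contentunderstanding/analyzers/"
        f"{analyzer_id}:analyze?api-version={API_VERSION}"
    )
    # Assumed request body: the file to analyze is referenced by URL.
    body = json.dumps({"url": file_url}).encode("utf-8")
    headers = {
        "Ocp-Apim-Subscription-Key": key,  # assumed auth header name
        "Content-Type": "application/json",
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


def split_content_and_fields(result):
    """Separate extracted content from extracted fields in an assumed result payload.

    Assumed shape:
      {"result": {"contents": [{"markdown": "...",
                                "fields": {"Name": {"valueString": "..."}}}]}}
    """
    markdown_chunks = []
    fields = {}
    for content in result.get("result", {}).get("contents", []):
        # Content extraction output: the structured, searchable text.
        if "markdown" in content:
            markdown_chunks.append(content["markdown"])
        # Field extraction output: take the first "value*" entry of each field.
        for name, field in content.get("fields", {}).items():
            fields[name] = next(
                (v for k, v in field.items() if k.startswith("value")), None
            )
    return markdown_chunks, fields
```

Keeping request construction separate from sending makes the URL and payload easy to inspect before any call is made. To send the request you would pass it to `urllib.request.urlopen`, then poll the operation URL that the tutorial's `curl -i -X GET` sample targets.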
