articles/cosmos-db/mongodb/vcore/rag.md
This tutorial explores how to use Azure Cosmos DB for MongoDB (vCore), LangChain, and OpenAI to implement Retrieval-Augmented Generation (RAG) for superior AI performance. We discuss Large Language Models (LLMs) and their limitations, explore the rapidly adopted RAG paradigm, briefly cover the LangChain framework and Azure OpenAI models, and finally integrate these concepts into a real-world application. By the end, you'll have a solid understanding of these concepts.
## Understand Large Language Models (LLMs) and their limitations
Large Language Models (LLMs) are advanced deep neural network models trained on extensive text datasets, enabling them to understand and generate human-like text. While revolutionary in natural language processing, LLMs have inherent limitations:
- **No Access to User’s Local Data**: LLMs don't have direct access to personal or localized data, restricting their ability to provide personalized responses.
- **Token Limits**: LLMs have a maximum token limit per interaction, constraining the amount of text they can process at once. For example, OpenAI’s gpt-3.5-turbo has a token limit of 4096.
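As a rough illustration of the token-limit constraint: a common rule of thumb for English text is about four characters per token. The helper below is a dependency-free sketch of a pre-flight length check; the function names are illustrative, and an exact count would require the model's own tokenizer (such as `tiktoken`).

```python
# Rough token estimate: ~4 characters per token is a common rule of thumb
# for English text; exact counts require the model's own tokenizer.
def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

GPT_35_TURBO_TOKEN_LIMIT = 4096  # the limit described above

def fits_in_context(prompt: str) -> bool:
    # A real application should also reserve room for the model's reply.
    return estimate_tokens(prompt) <= GPT_35_TURBO_TOKEN_LIMIT
```

A check like this is why RAG pipelines split source documents into chunks: only the few most relevant chunks are placed into the prompt, keeping it under the limit.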
Retrieval-augmented generation (RAG) is an architecture designed to overcome LLM limitations. RAG uses vector search to retrieve relevant documents based on an input query, providing these documents as context to the LLM for generating more accurate responses. Instead of relying solely on pretrained patterns, RAG enhances responses by incorporating up-to-date, relevant information. This approach helps to:
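To make the retrieve-then-generate flow concrete, here is a minimal, dependency-free sketch. A bag-of-words embedding stands in for a real embedding model, and the best-matching document is spliced into the prompt as context. All names here are illustrative; a real implementation would use an embedding model such as text-embedding-ada-002 and a vector store such as Azure Cosmos DB for MongoDB (vCore).

```python
import math

# Toy "embedding": word counts stand in for a dense embedding vector.
def embed(text):
    words = text.lower().split()
    return {w: words.count(w) for w in set(words)}

def cosine(a, b):
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

documents = [
    "Azure Cosmos DB supports vector search with HNSW indexes",
    "Paris is the capital of France",
]
# "Indexing": store each document alongside its embedding.
index = [(doc, embed(doc)) for doc in documents]

def retrieve(query, k=1):
    qv = embed(query)
    ranked = sorted(index, key=lambda d: cosine(qv, d[1]), reverse=True)
    return [doc for doc, _ in ranked[:k]]

# RAG step 1: retrieve relevant context for the query.
context = retrieve("How does vector search work in Cosmos DB?")
# RAG step 2: ground the LLM prompt in that context (LLM call omitted).
prompt = f"Answer using this context:\n{context[0]}\n\nQuestion: ..."
```

The key idea is that the LLM answers from the retrieved context rather than from its pretrained parameters alone, which is what mitigates outdated knowledge and hallucination.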
This tutorial demonstrates how RAG can be implemented using Azure Cosmos DB for MongoDB (vCore) to build a question-answering application tailored to your data.
## Application architecture overview
*(Architecture diagram: the key components of the RAG implementation.)*

We'll now discuss the various frameworks, models, and components used in this tutorial, emphasizing their roles and nuances.
### Azure Cosmos DB for MongoDB (vCore)
Azure Cosmos DB for MongoDB (vCore) supports semantic similarity searches, essential for AI-powered applications. It allows data in various formats to be represented as vector embeddings, which can be stored alongside source data and metadata. Using an approximate nearest neighbors algorithm, like Hierarchical Navigable Small World (HNSW), these embeddings can be queried for fast semantic similarity searches.
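To enable HNSW search, you create a vector index on the collection. The sketch below follows the `cosmosSearchOptions` shape documented for Azure Cosmos DB for MongoDB vCore; the collection name, index name, vector field, and parameter values are placeholder choices you'd adapt to your own schema.

```python
# Placeholder names: "exampleCollection", "vectorSearchIndex", and
# "contentVector" are illustrative; adapt them to your schema.
create_index_command = {
    "createIndexes": "exampleCollection",
    "indexes": [
        {
            "name": "vectorSearchIndex",
            "key": {"contentVector": "cosmosSearch"},
            "cosmosSearchOptions": {
                "kind": "vector-hnsw",  # HNSW approximate nearest neighbors
                "m": 16,                # max connections per graph layer
                "efConstruction": 64,   # candidate list size at build time
                "similarity": "COS",    # cosine similarity
                "dimensions": 1536,     # matches text-embedding-ada-002
            },
        }
    ],
}

# With a live pymongo connection, you would run:
# db.command(create_index_command)
```

Larger `m` and `efConstruction` values improve recall at the cost of index build time and memory; the values above are modest starting points, not tuned recommendations.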
### LangChain framework
LangChain simplifies the creation of LLM applications by providing a standard interface for chains, multiple tool integrations, and end-to-end chains for common tasks. It enables AI developers to build LLM applications that leverage external data sources.
Key aspects of LangChain:
- **Modularity**: Simplifies development, debugging, and maintenance.
- **Popularity**: An open-source project rapidly gaining adoption and evolving to meet user needs.
### Azure App Services interface
Azure App Service provides a robust platform for building user-friendly web interfaces for generative AI applications. This tutorial uses Azure App Service to create an interactive web interface for the application.
### OpenAI models
OpenAI is a leader in AI research, providing various models for language generation, text vectorization, image creation, and audio-to-text conversion. For this tutorial, we use OpenAI’s embedding and language models, which are crucial for applications that understand and generate natural language.
### Embedding models vs. language generation models
| Characteristic | Embedding models | Language generation models |
| --- | --- | --- |
| **Dimensionality** | The length of the array corresponds to the number of dimensions in the embedding space, for example, 1536 dimensions. | Typically represented as a sequence of tokens, with the context determining the length. |
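Because embeddings are fixed-length numeric arrays, semantic closeness reduces to vector math, and cosine similarity is a standard measure for comparing them. The stdlib-only sketch below uses toy 3-dimensional vectors as stand-ins for real 1536-dimensional text-embedding-ada-002 outputs; the vectors and their values are invented for illustration.

```python
import math

def cosine_similarity(a, b):
    # Cosine of the angle between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dimensional stand-ins; real embeddings would have 1536 dimensions.
v_cat = [0.9, 0.1, 0.0]
v_kitten = [0.8, 0.2, 0.1]
v_car = [0.0, 0.1, 0.9]

# Semantically related texts should produce nearby vectors, so
# similarity(cat, kitten) should exceed similarity(cat, car).
related = cosine_similarity(v_cat, v_kitten)
unrelated = cosine_similarity(v_cat, v_car)
```

This is the same comparison the vector store performs at scale: the query text is embedded, and the documents whose stored embeddings score highest are returned.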
### Main components of the application
- **Azure Cosmos DB for MongoDB vCore**: Storing and querying vector embeddings.
- **LangChain**: Constructing the application’s LLM workflow.
- **text-embedding-ada-002**: A text embedding model that converts text into vector embeddings with 1536 dimensions.
- **gpt-3.5-turbo**: A language model for understanding and generating natural language.
### Set up the environment
To get started with optimizing retrieval-augmented generation (RAG) using Azure Cosmos DB for MongoDB (vCore), follow these steps:
In this tutorial, we load a single text file as a LangChain `Document`.
### Load documents
1. Set the Cosmos DB for MongoDB (vCore) connection string, database name, collection name, and index name:
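A sketch of this step, reading the values from environment variables. The variable names, defaults, and the placeholder connection string are illustrative choices, not values prescribed by the tutorial; the `database.collection` namespace string is a common form these integrations expect.

```python
import os

# Illustrative environment-variable names; adjust to your own configuration.
# The default URI is a placeholder, not a working endpoint.
CONNECTION_STRING = os.environ.get(
    "AZURE_COSMOSDB_MONGODB_URI",
    "mongodb+srv://<user>:<password>@<cluster>.mongocluster.cosmos.azure.com/?tls=true",
)
DB_NAME = os.environ.get("COSMOS_DB_NAME", "ragdb")
COLLECTION_NAME = os.environ.get("COSMOS_COLLECTION_NAME", "documents")
INDEX_NAME = os.environ.get("COSMOS_INDEX_NAME", "vectorSearchIndex")

# Vector-store integrations commonly address the collection by a
# "database.collection" namespace string:
NAMESPACE = f"{DB_NAME}.{COLLECTION_NAME}"
```

Keeping these values in environment variables (rather than hard-coding the connection string) avoids leaking credentials into source control.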
In this tutorial, we explored how to build a question-answering app that interacts with your private data using Cosmos DB as a vector store. By leveraging the retrieval-augmented generation (RAG) architecture with LangChain and Azure OpenAI, we demonstrated how vector stores are essential for LLM applications.
RAG is a significant advancement in AI, particularly in natural language processing, and combining these technologies allows for the creation of powerful AI-driven applications for various use cases.
## Next steps
For a detailed, hands-on experience and to see how RAG can be implemented using Azure Cosmos DB for MongoDB (vCore), LangChain, and OpenAI models, visit our GitHub repository.