updates

TheovanKraay · TheovanKraay · commit acd16c022817 · 2024-07-09T15:26:35.000+01:00
diff --git a/articles/cosmos-db/TOC.yml b/articles/cosmos-db/TOC.yml
@@ -28,6 +28,8 @@
         items:
         - name: Vector search overview
           href: gen-ai/vector-search-overview.md
+        - name: Retrieval Augmented Generation (RAG)
+          href: gen-ai/rag.md
         - name: Tokens
           href: gen-ai/tokens.md
         - name: Vector embeddings
diff --git a/articles/cosmos-db/gen-ai/distance-functions.md b/articles/cosmos-db/gen-ai/distance-functions.md
@@ -31,6 +31,7 @@ Two vectors are multiplied to return a single number. It combines the two vector
 ## Related content
 - [VectorDistance system function](../nosql/query/vectordistance.md) in Azure Cosmos DB NoSQL
 - [What is a vector database?](../vector-database.md)
+- [Retrieval Augmented Generation (RAG)](rag.md)
 - [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
 - [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
 - [What is vector search?](vector-search-overview.md)
diff --git a/articles/cosmos-db/gen-ai/knn-vs-ann.md b/articles/cosmos-db/gen-ai/knn-vs-ann.md
@@ -33,6 +33,7 @@ Two major categories of vector search algorithms are k-Nearest Neighbors (kNN) a
 
 ## Related content
 - [What is a vector database?](../vector-database.md)
+- [Retrieval Augmented Generation (RAG)](rag.md)
 - [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
 - [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
 - [What is vector search?](vector-search-overview.md)
diff --git a/articles/cosmos-db/gen-ai/quickstart-rag-chatbot.md b/articles/cosmos-db/gen-ai/quickstart-rag-chatbot.md
@@ -14,12 +14,12 @@ ms.author: thvankra
 
 [!INCLUDE[NoSQL](../includes/appliesto-nosql.md)]
 
-In this quickstart, we demonstrate how to build a RAG Pattern application using a subset of the Movie Lens dataset. This sample uses the Python SDK for Azure Cosmos DB for NoSQL to perform vector search for RAG, store and retrieve chat history, and store the vectors of the chat history to use as a semantic cache. Azure OpenAI is used to generate embeddings and Large Language Model (LLM) completions.
+In this quickstart, we demonstrate how to build a [RAG Pattern](../gen-ai/rag.md) application using a subset of the Movie Lens dataset. This sample uses the Python SDK for Azure Cosmos DB for NoSQL to perform vector search for RAG, store and retrieve chat history, and store the vectors of the chat history to use as a semantic cache. Azure OpenAI is used to generate embeddings and Large Language Model (LLM) completions.
 
 At the end, we create a simple UX using Gradio to allow users to type in questions and display responses generated by Azure OpenAI or served from the cache. The responses also display an elapsed time so you can see the impact caching has on performance versus generating a response.
 
 > [!TIP] 
-> For more samples, visit: [AzureDataRetrievalAugmentedGenerationSamples](https://github.com/microsoft/AzureDataRetrievalAugmentedGenerationSamples)
+> For more RAG samples, visit: [AzureDataRetrievalAugmentedGenerationSamples](https://github.com/microsoft/AzureDataRetrievalAugmentedGenerationSamples)
 
 **Important Note**: This sample requires you to set up accounts for Azure Cosmos DB for NoSQL and Azure OpenAI. To get started, visit:
 - [Azure Cosmos DB for NoSQL Python Quickstart](../nosql/quickstart-python.md)
diff --git a/articles/cosmos-db/gen-ai/rag.md b/articles/cosmos-db/gen-ai/rag.md
@@ -0,0 +1,65 @@
+---
+
+title: Retrieval Augmented Generation (RAG) in Azure Cosmos DB
+description: Learn about Retrieval Augmented Generation (RAG) in Azure Cosmos DB
+author: TheovanKraay
+ms.service: cosmos-db
+ms.subservice: nosql
+ms.topic: conceptual
+ms.date: 07/09/2024
+ms.author: thvankra
+---
+
+# Retrieval Augmented Generation (RAG) in Azure Cosmos DB
+
+Retrieval Augmented Generation (RAG) combines the power of large language models (LLMs) with robust information retrieval systems to create more accurate and contextually relevant responses. Unlike generative models that rely solely on the data they were trained on, RAG architectures extend an LLM's capabilities by retrieving relevant information at query time. This augmentation grounds responses in the most relevant, up-to-date data available.
+
+Azure Cosmos DB, an operational database that supports vector search, stands out as an excellent platform for implementing RAG. Its ability to handle both operational and analytical workloads in a single database, along with advanced features such as multi-tenancy and hierarchical partition keys, provides a solid foundation for building sophisticated generative AI applications.
+
+## Key Advantages of Using Azure Cosmos DB
+
+### 1. Unified Data Storage and Retrieval
+Azure Cosmos DB enables seamless integration of [vector search](../nosql/vector-search.md) capabilities within a unified database system. This means that your operational data and vectorized data coexist, eliminating the need for separate indexing systems. 
+
+### 2. Real-Time Data Ingestion and Querying
+Azure Cosmos DB supports real-time ingestion and querying, making it ideal for applications that require up-to-the-minute information. This is crucial for RAG architectures, where the freshness of data can significantly impact the relevance of generated responses.
+
+### 3. Scalability and Global Distribution
+Designed for large-scale applications, Azure Cosmos DB offers global distribution and automatic scaling. This ensures that your RAG-enabled application can handle high query volumes and deliver consistent performance irrespective of user location.
+
+### 4. High Availability and Reliability
+Azure Cosmos DB offers comprehensive SLAs for throughput, latency, and [availability](../../reliability/reliability-cosmos-db-nosql.md). This reliability helps keep your RAG system available to generate responses with minimal downtime.
+
+### 5. Multi-Tenancy with Hierarchical Partition Keys
+Azure Cosmos DB supports [multi-tenancy](../nosql/multi-tenancy-vector-search.md) through various performance and security isolation models, making it easier to manage data for different clients or user groups within the same database. This feature is particularly useful for SaaS applications where separation of tenant data is crucial for security and compliance.
+
+### 6. Comprehensive Security Features
+With built-in features such as end-to-end encryption, role-based access control (RBAC), and virtual network (VNet) integration, Azure Cosmos DB ensures that your data remains secure. These security measures are essential for enterprise-grade RAG applications that handle sensitive information.
+
+
+
+## Implementing RAG with Azure Cosmos DB
+
+> [!TIP] 
+> For RAG samples, visit: [AzureDataRetrievalAugmentedGenerationSamples](https://github.com/microsoft/AzureDataRetrievalAugmentedGenerationSamples)
+
+Here's a streamlined process for building a RAG application with Azure Cosmos DB:
+
+1. **Data Ingestion**: Store your documents, images, and other content types in Azure Cosmos DB, together with their vector embeddings. Use the database's support for vector search to index and retrieve the vectorized content.
+
+2. **Query Execution**: When a user submits a query, convert it to a vector embedding. Azure Cosmos DB can then use its vector search capabilities to quickly retrieve the data most relevant to that query.
+
+3. **LLM Integration**: Pass the retrieved data to an LLM (for example, a model deployed in Azure OpenAI) to generate a response. Grounding the prompt in the data retrieved from Azure Cosmos DB improves the relevance of the model's output.
+
+4. **Response Generation**: The LLM processes the data and generates a comprehensive response, which is then delivered to the user.
+
+
+## Related content
+- [What is a vector database?](../vector-database.md)
+- [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
+- [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
+- LLM [tokens](tokens.md)
+- Vector [embeddings](vector-embeddings.md)
+- [Distance functions](distance-functions.md)
+- [kNN vs ANN vector search algorithms](knn-vs-ann.md)
+- [Multi-tenancy for Vector Search](../nosql/multi-tenancy-vector-search.md)
diff --git a/articles/cosmos-db/gen-ai/tokens.md b/articles/cosmos-db/gen-ai/tokens.md
@@ -14,6 +14,7 @@ Tokens are small chunks of text generated by splitting the input text into small
 
 ## Related content
 - [What is a vector database?](../vector-database.md)
+- [Retrieval Augmented Generation (RAG)](rag.md)
 - [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
 - [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
 - [What is vector search?](vector-search-overview.md)
diff --git a/articles/cosmos-db/gen-ai/vector-embeddings.md b/articles/cosmos-db/gen-ai/vector-embeddings.md
@@ -32,6 +32,7 @@ You can see more examples in this [interactive visualization](https://openai.com
 
 ## Related content
 - [What is a vector database?](../vector-database.md)
+- [Retrieval Augmented Generation (RAG)](rag.md)
 - [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
 - [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
 - [What is vector search?](vector-search-overview.md)
diff --git a/articles/cosmos-db/gen-ai/vector-search-overview.md b/articles/cosmos-db/gen-ai/vector-search-overview.md
@@ -20,6 +20,7 @@ Using an integrated vector search feature in a fully featured database ([as oppo
 
 ## Related content
 - [What is a vector database?](../vector-database.md)
+- [Retrieval Augmented Generation (RAG)](rag.md)
 - [Vector database in Azure Cosmos DB NoSQL](../nosql/vector-search.md)
 - [Vector database in Azure Cosmos DB for MongoDB](../mongodb/vcore/vector-search.md)
 - LLM [tokens](tokens.md)