
Commit 497e3d0

Merge pull request #8490 from tswarmerdam-mx/genaicommons
Genaicommons additional terminology and small changes
2 parents 49e572b + ab69af1 commit 497e3d0

File tree

4 files changed: +54 -18 lines changed


content/en/docs/appstore/platform-supported-content/modules/genai/concepts/rag-example-implementation.md

Lines changed: 43 additions & 11 deletions
@@ -1,47 +1,75 @@
 ---
-title: "RAG Example Implementation in the GenAI Showcase App"
+title: "RAG in a Mendix App"
 url: /appstore/modules/genai/rag/
 
 linktitle: "Retrieval Augmented Generation (RAG)"
 weight: 30
-description: "Describes the retrieval augmented generation (RAG) example implementation in the GenAI Showcase App"
+description: "Describes the retrieval augmented generation (RAG) pattern and the example implementation in the GenAI Showcase App"
 ---
 
 ## Introduction {#introduction}
 
-Retrieval augmented generation (RAG) is a framework for an AI-based search with a private or external knowledge base that combines embeddings-based knowledge retrieval with a text generation model. The starting point will be a collection of data to be considered as the private knowledge base. The final goal is that an end user of the app can ask questions about the data and the assistant responses will only be based on this knowledge base.
+Retrieval augmented generation (RAG) is a framework for an AI-based search using a private or external knowledge base that combines embeddings-based knowledge retrieval with a text generation model. The starting point is a collection of data to be considered as the private knowledge base. The final goal is that an end user of the app can ask questions about the data and the assistant's responses are based only on this knowledge base.
 
 {{% alert color="info" %}}This document describes how to set up RAG with PgVector. If you want to use the Bedrock Retrieval Augmented Generation capabilities, see [Bedrock Retrieval Augmented Generation](/appstore/modules/genai/using-gen-ai/#rag).{{% /alert %}}
 
+### Terminology
+
+To understand the basics of the RAG pattern, it is important to know the common terminology. As the [showcase example](https://marketplace.mendix.com/link/component/220475) and the relevant platform-supported modules depend on [GenAI Commons](/appstore/modules/genai/commons/), the relevant entities are linked for reference.
+
+#### Embedding Vector
+
+Also called an **embedding** and sometimes shortened to **vector**, this is a mathematical representation of an input string generated by the LLM of choice. It consists of an ordered set of numbers (typically written as [ 0.006, 0.108, ...]), and the total number of elements is called the **dimension**. An embeddings model can convert any string into a vector of fixed dimension.
+
+Every LLM has its own algorithm for generating vectors, but the convention is that conceptually similar strings result in similar vectors. This enables **similarity search**, where strings can be matched to a given search string in terms of semantic meaning (that is, content, tone, or style) instead of exact character matches. Minimizing the **cosine distance** between each element and the vector representation of the search string is a common mathematical technique to search through a collection of vectors and find the most similar elements.
+
+#### Chunk
+
+In the context of GenAI Commons in a Mendix app, embedding vectors are generated using a [Chunk](/appstore/modules/genai/commons/#chunk-entity). Each object represents a discrete piece of information and contains its original string representation, as well as (after the embedding operation) the vector representation of that string according to the LLM of choice.
+
+#### Knowledge Base
+
+This is the place to store discrete pieces of information. If information and its vector representation are stored together, a knowledge base can also be called a **vector database**. Common vector databases have built-in logic to execute similarity searches based on a search vector.
+
+In the context of GenAI Commons in a Mendix app, you can use the [PgVector Knowledge Base](https://marketplace.mendix.com/link/component/225063) module to store and retrieve vectors.
+
+#### Knowledge Base Chunk
+
+In most use cases, more information needs to be stored than just the original input string and its vector representation. A [KnowledgeBaseChunk](/appstore/modules/genai/commons/#knowledgebasechunk-entity) is an extension of [Chunk](/appstore/modules/genai/commons/#chunk-entity) that can hold additional information that is typically required for useful insertion and retrieval from a Mendix application.
+
+#### Metadata
+
+If additional conventional filtering is needed during similarity searches, such additional data can be stored in the knowledge base as well. [Metadata](/appstore/modules/genai/commons/#metadata-entity) objects are key-value pairs that are inserted along with the chunks and contain this additional information. The filtering is applied on an exact string-match basis for each key-value pair. Records are only retrieved if they match all metadata key-value pairs in the collection provided as part of the search step.
+
+{{% alert color="info" %}}The example described in the remainder of this document does not include the more advanced use case of metadata filtering, nor does it cover the construction of complex input strings. If you want to see how this can work in practice, take a look at the *RAG with Semantic Search on Historical Data* example in the [GenAI Showcase App](https://marketplace.mendix.com/link/component/220475).{{% /alert %}}
+
 ## High-level Flow {#rag-high-level}
 
 The complete technical flow can be split up into the following three steps at a high level:
 
 1. Prepare the knowledge base (once per document)
     1. Data is chunked into smaller, partially overlapping, pieces of information.
-    2. For each data chunk, the embedding vector will be retrieved from OpenAI's embeddings API.
+    2. For each data chunk, the embedding vector is retrieved from the LLM's embeddings operation.
     3. Data chunks (or their identifier) are stored in a vector database together with their embedding vector.
 
 2. Query the knowledge base (once per search)
     1. User query is sent to the embeddings API to retrieve the embedding vector of the query.
    2. A pre-defined number of most-relevant data chunks is retrieved from the vector database. This set is selected based on cosine similarity to the user query embedding vector.
 
 3. Invoke the text generation model (once per search)
-    1. User query and the relevant data chunks are sent to the chat completions API.
+    1. User query and the relevant data chunks are sent to the LLM's chat completions operation.
     2. Through prompt engineering, the text generation model is instructed to only base the answer on the data chunks that were sent as part of the request. This helps prevent the model from hallucinating.
     3. The assistant response is returned to the user.
 
-In summary, in the first step, you need to provide the private knowledge base, such as a text snippet. You need to prepare the content for RAG, which happens only once. If the content changes, you need to provide it again for RAG. The last two steps happen every time an end-user triggers the RAG flow, for example, by asking a question about the data.
+In summary, in the first step, you need to provide the private knowledge base, such as a text snippet. You need to prepare the content for RAG, which happens only once. If the content changes, you need to update the data in the knowledge base. The last two steps happen every time an end-user triggers the RAG flow, for example, by asking a question about the data.
 
 ## RAG Example in the GenAI Showcase App {#rag-showcase-app}
 
 ### Prerequisites {#prerequisites}
 
-Before you start experimenting with the end-to-end process, make sure that you have covered the following prerequisites:
+Before you start experimenting with the end-to-end process, make sure that you have access to a (remote) PostgreSQL database with the [pgvector](https://github.com/pgvector/pgvector) extension available. If you do not have one yet, [learn more](/appstore/modules/genai/pgvector-setup/) about how a PostgreSQL vector database can be set up to explore use cases with knowledge bases.
 
-You have access to a (remote) PostgreSQL database with the [pgvector](https://github.com/pgvector/pgvector) extension available.
-
-{{% alert color="info" %}}If you have access to an Amazon Web Services (AWS) account, Mendix recommends you use a [free-tier RDS](https://aws.amazon.com/rds/faqs/#product-faqs#amazon-rds-faqs#free-tier) setup described in the [Creating a PostgreSQL Database with Amazon RDS](/appstore/modules/genai/pgvector-setup/#aws-database-create) section. This is convenient, since PostgreSQL databases in Amazon RDS by default have the required pgvector extension available.{{% /alert %}}
+{{% alert color="info" %}}If you have access to an Amazon Web Services (AWS) account or Microsoft Azure account, Mendix recommends you use a setup described in the [Creating a PostgreSQL Database with Amazon RDS](/appstore/modules/genai/pgvector-setup/#aws-database-create) or [Managing a PostgreSQL Database with Microsoft Azure](/appstore/modules/genai/pgvector-setup/#azure-database) section. This is convenient, since these PostgreSQL databases in the cloud have the required pgvector extension available by default.{{% /alert %}}
 
 ### Steps {#steps}
 
@@ -88,8 +116,12 @@ If you would like to build your own RAG setup, feel free to learn from the GenAI
 * Find top-k nearest neighbors (select query; typically using cosine distance/similarity optimization as recommended by OpenAI).
 
 * Remove individual records (delete) or tables (drop table).
+
+* Similarity search is only guaranteed to work if the embeddings model chosen for the retrieval step is the same as the model used at the time of population: different models use different algorithms to generate vectors and might even produce vectors of different dimensionalities, making cosine distance calculation impossible.
+
+* How you construct the input string affects similarity search results. In the similarity search example for **tickets** in the showcase application, the input string at the time of insertion is a concatenation of multiple attributes of each ticket record in the Mendix database. However, in the search step, the user's input (possibly just a brief description) is used to find similar tickets. While this discrepancy may lower overall similarity, the most relevant records will still appear at the top.
 
-{{% alert color="info" %}}Example queries in the form of SQL statements are available for inspiration in the source code of the [PgVector Knowledge Base module](/appstore/modules/genai/pgvector/) which comes automatically with GenAI Showcase App.{{% /alert %}}
+{{% alert color="info" %}}Reusable queries in the form of SQL statements are available in the source code of the [PgVector Knowledge Base module](/appstore/modules/genai/pgvector/), which comes automatically with the GenAI Showcase App.{{% /alert %}}
 
 ## Read More {#read-more}
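The chunk/embed/retrieve flow described in the diff above can be sketched in a few lines of plain Python. This is an editorial illustration, not Mendix microflow logic: the chunk sizes, the in-memory store, and the two-element toy vectors are assumptions for demonstration only; a real app would call an embeddings endpoint and a vector database.

```python
from math import sqrt

def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    # Step 1a: split data into smaller, partially overlapping pieces.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Similarity between two embedding vectors of equal dimension.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def top_k(query_vector: list[float], store: list[dict], k: int = 2) -> list[dict]:
    # Step 2b: return the k chunks most similar to the query vector.
    ranked = sorted(store,
                    key=lambda c: cosine_similarity(query_vector, c["vector"]),
                    reverse=True)
    return ranked[:k]
```

In the real pattern, the chunks returned by `top_k` would then be sent to the chat completions operation along with the user query (step 3).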

content/en/docs/appstore/platform-supported-content/modules/genai/genai-commons.md

Lines changed: 3 additions & 3 deletions
@@ -191,7 +191,7 @@ This entity represents a collection of chunks. It is a wrapper entity for [Chunk
 
 #### `Chunk` {#chunk-entity}
 
-A piece of information (InputText) and the corresponding embeddings vector retrieved from an Embeddings API.
+A piece of information (InputText) and the corresponding embeddings vector retrieved from an Embeddings API. This is the relevant entity if you need to generate embedding vectors but do not need to store them in a knowledge base.
 
 | Attribute | Description |
 | --- | --- |
@@ -201,7 +201,7 @@ A piece of information (InputText) and the corresponding embeddings vector retri
 
 #### `KnowledgeBaseChunk` {#knowledgebasechunk-entity}
 
-This entity represents a discrete piece of knowledge that can be used in embed and store operations. It is a specialization of [Chunk](#chunk-entity).
+This entity represents a discrete piece of knowledge that can be used for embedding and storage operations. As a specialization of [Chunk](#chunk-entity), it is the appropriate entity to use when both generating embedding vectors and storing them in a knowledge base.
 
 | Attribute | Description |
 | --- | --- |
@@ -217,7 +217,7 @@ An optional collection of metadata. This is a wrapper entity for one or more [Me
 
 #### `Metadata` {#metadata-entity}
 
-This entity represents additional information that is to be stored with the [KnowledgeBaseChunk](#knowledgebasechunk-entity) in the knowledge base. It can be used for custom filtering during retrieval.
+This entity represents additional information to be stored with the [KnowledgeBaseChunk](#knowledgebasechunk-entity) in the knowledge base. At the insertion stage, you can link multiple metadata objects to a KnowledgeBaseChunk as needed. These metadata objects consist of key-value pairs used for custom filtering during retrieval. Retrieval operates on an exact string-match basis for each key-value pair, returning records only if they match all metadata records specified in the search criteria.
 
 | Attribute | Description |
 | --- | --- |
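The exact string-match semantics of `Metadata` filtering described in this file can be illustrated with a small Python sketch. The dictionary-based chunk shape is an assumption for illustration; the actual module works with Mendix entities and associations.

```python
def matches_all_metadata(chunk_metadata: dict[str, str],
                         filters: dict[str, str]) -> bool:
    # A record is retrieved only if it matches every key-value pair
    # in the filter collection, compared as exact strings.
    return all(chunk_metadata.get(key) == value for key, value in filters.items())

def filter_chunks(chunks: list[dict], filters: dict[str, str]) -> list[dict]:
    # An empty filter collection matches everything.
    return [c for c in chunks if matches_all_metadata(c["metadata"], filters)]
```

Note that the comparison is case-sensitive: `"Ticket"` does not match `"ticket"`, which is why consistent metadata values at insertion time matter.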

content/en/docs/appstore/platform-supported-content/modules/genai/pg-vector-knowledge-base/_index.md

Lines changed: 3 additions & 3 deletions
@@ -86,7 +86,7 @@ A typical pattern for populating a knowledge base is as follows:
 
 1. Create a new `ChunkCollection`. See the [Initialize ChunkCollection](/appstore/modules/genai/commons/) section.
 2. For each knowledge item that needs to be inserted, do the following:
-    * Use [Initialize MetadataCollection with Metadata](/appstore/modules/genai/commons/) and [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) as many times as needed to create a collection of the necessary metadata for the knowledge base item.
+    * Use [Initialize MetadataCollection with Metadata](/appstore/modules/genai/commons/) and [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) to create a collection of the necessary metadata for the knowledge base item.
     * With both collections as input parameters, use [Add KnowledgeBaseChunk to ChunkCollection](/appstore/modules/genai/commons/) for the knowledge item.
 3. Call an embeddings endpoint with the `ChunkCollection` to generate an embedding vector for each `KnowledgeBaseChunk`.
 4. With the `ChunkCollection`, use [(Re)populate Knowledge Base](#repopulate-knowledge-base) to store the chunks.
@@ -118,15 +118,15 @@ Currently, four operations are available for on-demand retrieval of data chunks
 A typical pattern for retrieval from a knowledge base uses GenAI Commons operations and can be illustrated as follows:
 
 1. Use [Initialize MetadataCollection with Metadata](/appstore/modules/genai/commons/) to set up a `MetadataCollection` for filtering with its first key-value pair added immediately.
-2. Use [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) as many times as needed to create a collection of the necessary metadata.
+2. Use [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) (iteratively) to create a collection of the necessary metadata.
 3. Do the retrieval. For example, you could use [Retrieve Nearest Neighbors](#retrieve-nearest-neighbors) to find chunks based on vector similarity.
 
 For scenarios in which the created chunks were based on Mendix objects at the time of population and these objects need to be used in logic after the retrieval step, two additional operations are available. The Java actions [Retrieve & Associate](#retrieve-associate) and [Retrieve Nearest Neighbors & Associate](#retrieve-nearest-neighbors-associate) take care of the chunk retrieval and set the association towards the original object, if applicable.
 
 A typical pattern for this retrieval is as follows:
 
 1. Use [Initialize MetadataCollection with Metadata](/appstore/modules/genai/commons/) to set up a `MetadataCollection` for filtering with its first key-value pair added immediately.
-2. Use [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) as many times as needed to create a collection of the necessary metadata.
+2. Use [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) (iteratively) to create a collection of the necessary metadata.
 3. Do the retrieval. For example, you could use [Retrieve Nearest Neighbors & Associate](#retrieve-nearest-neighbors-associate) to find chunks based on vector similarity.
 4. For each retrieved chunk, retrieve the original Mendix object and do custom logic.
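The populate-then-retrieve pattern in this file can be condensed into a Python sketch. `fake_embed` is a deterministic stand-in for a real embeddings endpoint, and the list-based knowledge base is an assumption; in the module, these steps are microflow operations against a PgVector database.

```python
def fake_embed(text: str) -> list[float]:
    # Toy deterministic "embedding"; a real app calls an embeddings endpoint.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]

def populate(knowledge_base: list, items: list[tuple[str, dict]]) -> None:
    # Steps 1-4 of populating: build chunks with metadata, embed, then store.
    for text, metadata in items:
        knowledge_base.append({
            "input_text": text,
            "metadata": metadata,
            "vector": fake_embed(text),
        })

def retrieve(knowledge_base: list, query: str, filters: dict, k: int = 1) -> list:
    # Retrieval pattern: embed the query, filter on metadata (exact match),
    # then rank the remaining chunks by vector distance.
    query_vector = fake_embed(query)
    candidates = [
        c for c in knowledge_base
        if all(c["metadata"].get(key) == value for key, value in filters.items())
    ]
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(query_vector, c["vector"]))
    return sorted(candidates, key=sq_dist)[:k]
```

In the associate variants, each retrieved chunk would additionally carry a reference back to the original Mendix object for use in follow-up logic.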

content/en/docs/appstore/platform-supported-content/modules/genai/pg-vector-knowledge-base/vector-database-setup.md

Lines changed: 5 additions & 1 deletion
@@ -18,6 +18,8 @@ This procedure describes a setup based on a PostgreSQL database with the pgvecto
 
 ## Managing a PostgreSQL Database with Amazon RDS {#aws-database}
 
+A PostgreSQL database in Amazon RDS includes the required `pgvector` extension pre-installed. When you connect using the [PgVector Knowledge Base](https://marketplace.mendix.com/link/component/225063) module, this extension activates automatically, allowing the database to function as a vector database for knowledge bases.
+
 ### Creating a PostgreSQL Database with Amazon RDS {#aws-database-create}
 
 {{% alert color="info" %}}
@@ -78,6 +80,8 @@ If no action is taken, resources in AWS will stay around indefinitely. Make sure
 
 ## Managing a PostgreSQL Database with Microsoft Azure {#azure-database}
 
+A PostgreSQL database in Microsoft Azure includes the required `pgvector` extension (called *vector*) pre-installed. The sections below describe how to enable its use. When you connect using the [PgVector Knowledge Base](https://marketplace.mendix.com/link/component/225063) module, this extension activates automatically, allowing the database to function as a vector database for knowledge bases.
+
 ### Creating a PostgreSQL Database with Microsoft Azure {#azure-database-create}
 
 {{% alert color="info" %}}
@@ -159,7 +163,7 @@ If no action is taken, resources on Azure will stay around indefinitely. Make su
 
 2. Use the master username and master password that you set in the **Settings** when you [created the PostgreSQL Database with Amazon RDS](#aws-database-create) or for the admin user in the [Azure Portal](#azure-database-create) as your username and password.
 
-3. Save and test the configuration.
+3. Save and test the configuration. This activates the `pgvector` extension, and the vector database is ready to be used.
 
 ## Setup Alternatives {#setup-alternatives}
