
Commit 497e3d0

Merge pull request #8490 from tswarmerdam-mx/genaicommons
Genaicommons additional terminology and small changes
2 parents 49e572b + ab69af1 commit 497e3d0

File tree

4 files changed: +54 -18 lines changed


content/en/docs/appstore/platform-supported-content/modules/genai/concepts/rag-example-implementation.md

Lines changed: 43 additions & 11 deletions
@@ -1,47 +1,75 @@
 ---
-title: "RAG Example Implementation in the GenAI Showcase App"
+title: "RAG in a Mendix App"
 url: /appstore/modules/genai/rag/
 
 linktitle: "Retrieval Augmented Generation (RAG)"
 weight: 30
-description: "Describes the retrieval augmented generation (RAG) example implementation in the GenAI Showcase App"
+description: "Describes the retrieval augmented generation (RAG) pattern and the example implementation in the GenAI Showcase App"
 ---
 
 ## Introduction {#introduction}
 
-Retrieval augmented generation (RAG) is a framework for an AI-based search with a private or external knowledge base that combines embeddings-based knowledge retrieval with a text generation model. The starting point will be a collection of data to be considered as the private knowledge base. The final goal is that an end user of the app can ask questions about the data and the assistant responses will only be based on this knowledge base.
+Retrieval augmented generation (RAG) is a framework for an AI-based search using a private or external knowledge base that combines embeddings-based knowledge retrieval with a text generation model. The starting point is a collection of data to be considered as the private knowledge base. The final goal is that an end user of the app can ask questions about the data and the assistant's responses are based only on this knowledge base.
 
 {{% alert color="info" %}}This document describes how to set up RAG with PgVector. If you want to use the Bedrock Retrieval Augmented Generation capabilities, see [Bedrock Retrieval Augmented Generation](/appstore/modules/genai/using-gen-ai/#rag).{{% /alert %}}
 
+### Terminology
+
+To understand the basics of the RAG pattern, it is important to know the common terminology. As the [showcase example](https://marketplace.mendix.com/link/component/220475) and the relevant platform-supported modules depend on [GenAI Commons](/appstore/modules/genai/commons/), the relevant entities are linked for reference.
+
+#### Embedding Vector
+
+Also called an **embedding** and sometimes shortened to **vector**, this is a mathematical representation of an input string generated by the LLM of choice. It consists of an ordered set of numbers (typically written as [ 0.006, 0.108, ...]), and the total number of elements is called the **dimension**. An embeddings model can convert any string into a vector of fixed dimension.
+
+Every LLM has its own algorithm for generating vectors, but the convention is that conceptually similar strings result in similar vectors. This enables **similarity search**, where strings can be matched to a given search string in terms of semantic meaning (that is, content, tone, or style) instead of exact character matches. Minimizing the **cosine distance** between each element and the vector representation of the search string is a common mathematical technique to search through a collection of vectors and find the most similar elements.
+
+#### Chunk
+
+In the context of GenAI Commons in a Mendix app, embedding vectors are generated using a [Chunk](/appstore/modules/genai/commons/#chunk-entity). Each object represents a discrete piece of information and contains its original string representation, as well as (after the embedding operation) the vector representation of that string according to the LLM of choice.
+
+#### Knowledge Base
+
+This is the place to store discrete pieces of information. If information and its vector representation are stored together, a knowledge base can also be called a **vector database**. Common vector databases have built-in logic to execute similarity searches based on a search vector.
+
+In the context of GenAI Commons in a Mendix app, you can use the [PgVector Knowledge Base](https://marketplace.mendix.com/link/component/225063) module to store and retrieve vectors.
+
+#### Knowledge Base Chunk
+
+In most use cases, more information needs to be stored than just the original input string and its vector representation. A [KnowledgeBaseChunk](/appstore/modules/genai/commons/#knowledgebasechunk-entity) is an extension of [Chunk](/appstore/modules/genai/commons/#chunk-entity) that can hold additional information that is typically required for useful insertion and retrieval from a Mendix application.
+
+#### Metadata
+
+If additional conventional filtering is needed during similarity searches, such additional data can be stored in the knowledge base as well. [Metadata](/appstore/modules/genai/commons/#metadata-entity) objects are key-value pairs that are inserted along with the chunks and contain this additional information. The filtering is applied on an exact string-match basis for each key-value pair. Records are only retrieved if they match all metadata key-value pairs in the collection provided as part of the search step.
+
+{{% alert color="info" %}}The example described in the remainder of this document does not include the more advanced use case of metadata filtering, nor does it cover the construction of complex input strings. If you want to see how this can work in practice, take a look at the *RAG with Semantic Search on Historical Data* example in the [GenAI Showcase App](https://marketplace.mendix.com/link/component/220475).{{% /alert %}}
+
 ## High-level Flow {#rag-high-level}
 
 The complete technical flow can be split up into the following three steps at a high level:
 
 1. Prepare the knowledge base (once per document)
     1. Data is chunked into smaller, partially overlapping, pieces of information.
-    2. For each data chunk, the embedding vector will be retrieved from OpenAI's embeddings API.
+    2. For each data chunk, the embedding vector is retrieved from the LLM's embeddings operation.
     3. Data chunks (or their identifier) are stored in a vector database together with their embedding vector.
 
 2. Query the knowledge base (once per search)
     1. User query is sent to the embeddings API to retrieve the embedding vector of the query.
    2. A pre-defined number of most-relevant data chunks is retrieved from the vector database. This set is selected based on cosine similarity to the user query embedding vector.
 
 3. Invoke the text generation model (once per search)
-    1. User query and the relevant data chunks are sent to the chat completions API.
+    1. User query and the relevant data chunks are sent to the LLM's chat completions operation.
     2. Through prompt engineering, the text generation model is instructed to only base the answer on the data chunks that were sent as part of the request. This helps prevent the model from hallucinating.
     3. The assistant response is returned to the user.
 
-In summary, in the first step, you need to provide the private knowledge base, such as a text snippet. You need to prepare the content for RAG, which happens only once. If the content changes, you need to provide it again for RAG. The last two steps happen every time an end-user triggers the RAG flow, for example, by asking a question about the data.
+In summary, in the first step, you need to provide the private knowledge base, such as a text snippet. You need to prepare the content for RAG, which happens only once. If the content changes, you need to update the data in the knowledge base. The last two steps happen every time an end-user triggers the RAG flow, for example, by asking a question about the data.
 
 ## RAG Example in the GenAI Showcase App {#rag-showcase-app}
 
 ### Prerequisites {#prerequisites}
 
-Before you start experimenting with the end-to-end process, make sure that you have covered the following prerequisites:
+Before you start experimenting with the end-to-end process, make sure that you have access to a (remote) PostgreSQL database with the [pgvector](https://github.com/pgvector/pgvector) extension available. If you do not have one yet, [learn more](/appstore/modules/genai/pgvector-setup/) about how a PostgreSQL vector database can be set up to explore use cases with knowledge bases.
 
-You have access to a (remote) PostgreSQL database with the [pgvector](https://github.com/pgvector/pgvector) extension available.
-
-{{% alert color="info" %}}If you have access to an Amazon Web Services (AWS) account, Mendix recommends you use a [free-tier RDS](https://aws.amazon.com/rds/faqs/#product-faqs#amazon-rds-faqs#free-tier) setup described in the [Creating a PostgreSQL Database with Amazon RDS](/appstore/modules/genai/pgvector-setup/#aws-database-create) section. This is convenient, since PostgreSQL databases in Amazon RDS by default have the required pgvector extension available.{{% /alert %}}
+{{% alert color="info" %}}If you have access to an Amazon Web Services (AWS) account or Microsoft Azure account, Mendix recommends you use a setup described in the [Creating a PostgreSQL Database with Amazon RDS](/appstore/modules/genai/pgvector-setup/#aws-database-create) or [Managing a PostgreSQL Database with Microsoft Azure](/appstore/modules/genai/pgvector-setup/#azure-database) section. This is convenient, since these PostgreSQL databases in the cloud have the required pgvector extension available by default.{{% /alert %}}
 
 ### Steps {#steps}
 
@@ -88,8 +116,12 @@ If you would like to build your own RAG setup, feel free to learn from the GenAI
 * Find top-k nearest neighbors (select query; typically using cosine distance/similarity optimization as recommended by OpenAI).
 
 * Remove individual records (delete) or tables (drop table).
+
+* Similarity search is only guaranteed to work if the embeddings model chosen for the retrieval step is the same as the model used at the time of population: different models use different algorithms to generate vectors and might even produce vectors of different dimensionalities, making cosine distance calculation impossible.
+
+* How you construct the input string affects similarity search results. In the similarity search example for **tickets** in the showcase application, the input string at the time of insertion is a concatenation of multiple attributes of each ticket record in the Mendix database. However, in the search step, the user's input (possibly just a brief description) is used to find similar tickets. While this discrepancy may lower overall similarity, the most relevant records will still appear at the top.
 
-{{% alert color="info" %}}Example queries in the form of SQL statements are available for inspiration in the source code of the [PgVector Knowledge Base module](/appstore/modules/genai/pgvector/) which comes automatically with GenAI Showcase App.{{% /alert %}}
+{{% alert color="info" %}}Reusable queries in the form of SQL statements are available in the source code of the [PgVector Knowledge Base module](/appstore/modules/genai/pgvector/), which comes automatically with the GenAI Showcase App.{{% /alert %}}
 
 ## Read More {#read-more}
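The chunk/embed/retrieve flow described in the diff above can be sketched in a few lines of plain Python. This is an editorial illustration, not Mendix microflow logic: the chunk sizes, the in-memory store, and the two-element toy vectors are assumptions for demonstration only; a real app would call an embeddings endpoint and a vector database.

```python
from math import sqrt

def chunk_text(text: str, size: int = 40, overlap: int = 10) -> list[str]:
    # Step 1a: split data into smaller, partially overlapping pieces.
    step = size - overlap
    return [text[i:i + size] for i in range(0, max(len(text) - overlap, 1), step)]

def cosine_similarity(a: list[float], b: list[float]) -> float:
    # Similarity between two embedding vectors of equal dimension.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(x * x for x in b)))

def top_k(query_vector: list[float], store: list[dict], k: int = 2) -> list[dict]:
    # Step 2b: return the k chunks most similar to the query vector.
    ranked = sorted(store,
                    key=lambda c: cosine_similarity(query_vector, c["vector"]),
                    reverse=True)
    return ranked[:k]
```

In the real pattern, the chunks returned by `top_k` would then be sent to the chat completions operation along with the user query (step 3).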

content/en/docs/appstore/platform-supported-content/modules/genai/genai-commons.md

Lines changed: 3 additions & 3 deletions
@@ -191,7 +191,7 @@ This entity represents a collection of chunks. It is a wrapper entity for [Chunk
 
 #### `Chunk` {#chunk-entity}
 
-A piece of information (InputText) and the corresponding embeddings vector retrieved from an Embeddings API.
+A piece of information (InputText) and the corresponding embeddings vector retrieved from an Embeddings API. This is the relevant entity if you need to generate embedding vectors but do not need to store them in a knowledge base.
 
 | Attribute | Description |
 | --- | --- |
@@ -201,7 +201,7 @@ A piece of information (InputText) and the corresponding embeddings vector retri
 
 #### `KnowledgeBaseChunk` {#knowledgebasechunk-entity}
 
-This entity represents a discrete piece of knowledge that can be used in embed and store operations. It is a specialization of [Chunk](#chunk-entity).
+This entity represents a discrete piece of knowledge that can be used for embedding and storage operations. As a specialization of [Chunk](#chunk-entity), it is the appropriate entity to use when both generating embedding vectors and storing them in a knowledge base.
 
 | Attribute | Description |
 | --- | --- |
@@ -217,7 +217,7 @@ An optional collection of metadata. This is a wrapper entity for one or more [Me
 
 #### `Metadata` {#metadata-entity}
 
-This entity represents additional information that is to be stored with the [KnowledgeBaseChunk](#knowledgebasechunk-entity) in the knowledge base. It can be used for custom filtering during retrieval.
+This entity represents additional information to be stored with the [KnowledgeBaseChunk](#knowledgebasechunk-entity) in the knowledge base. At the insertion stage, you can link multiple metadata objects to a KnowledgeBaseChunk as needed. These metadata objects consist of key-value pairs used for custom filtering during retrieval. Retrieval operates on an exact string-match basis for each key-value pair, returning records only if they match all metadata records specified in the search criteria.
 
 | Attribute | Description |
 | --- | --- |
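The exact string-match semantics of `Metadata` filtering described in this file can be illustrated with a small Python sketch. The dictionary-based chunk shape is an assumption for illustration; the actual module works with Mendix entities and associations.

```python
def matches_all_metadata(chunk_metadata: dict[str, str],
                         filters: dict[str, str]) -> bool:
    # A record is retrieved only if it matches every key-value pair
    # in the filter collection, compared as exact strings.
    return all(chunk_metadata.get(key) == value for key, value in filters.items())

def filter_chunks(chunks: list[dict], filters: dict[str, str]) -> list[dict]:
    # An empty filter collection matches everything.
    return [c for c in chunks if matches_all_metadata(c["metadata"], filters)]
```

Note that the comparison is case-sensitive: `"Ticket"` does not match `"ticket"`, which is why consistent metadata values at insertion time matter.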

content/en/docs/appstore/platform-supported-content/modules/genai/pg-vector-knowledge-base/_index.md

Lines changed: 3 additions & 3 deletions
@@ -86,7 +86,7 @@ A typical pattern for populating a knowledge base is as follows:
 
 1. Create a new `ChunkCollection`. See the [Initialize ChunkCollection](/appstore/modules/genai/commons/) section.
 2. For each knowledge item that needs to be inserted, do the following:
-    * Use [Initialize MetadataCollection with Metadata](/appstore/modules/genai/commons/) and [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) as many times as needed to create a collection of the necessary metadata for the knowledge base item.
+    * Use [Initialize MetadataCollection with Metadata](/appstore/modules/genai/commons/) and [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) to create a collection of the necessary metadata for the knowledge base item.
     * With both collections as input parameters, use [Add KnowledgeBaseChunk to ChunkCollection](/appstore/modules/genai/commons/) for the knowledge item.
 3. Call an embeddings endpoint with the `ChunkCollection` to generate an embedding vector for each `KnowledgeBaseChunk`.
 4. With the `ChunkCollection`, use [(Re)populate Knowledge Base](#repopulate-knowledge-base) to store the chunks.
@@ -118,15 +118,15 @@ Currently, four operations are available for on-demand retrieval of data chunks
 A typical pattern for retrieval from a knowledge base uses GenAI Commons operations and can be illustrated as follows:
 
 1. Use [Initialize MetadataCollection with Metadata](/appstore/modules/genai/commons/) to set up a `MetadataCollection` for filtering with its first key-value pair added immediately.
-2. Use [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) as many times as needed to create a collection of the necessary metadata.
+2. Use [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) (iteratively) to create a collection of the necessary metadata.
 3. Do the retrieval. For example, you could use [Retrieve Nearest Neighbors](#retrieve-nearest-neighbors) to find chunks based on vector similarity.
 
 For scenarios in which the created chunks were based on Mendix objects at the time of population and these objects need to be used in logic after the retrieval step, two additional operations are available. The Java actions [Retrieve & Associate](#retrieve-associate) and [Retrieve Nearest Neighbors & Associate](#retrieve-nearest-neighbors-associate) take care of the chunk retrieval and set the association towards the original object, if applicable.
 
 A typical pattern for this retrieval is as follows:
 
 1. Use [Initialize MetadataCollection with Metadata](/appstore/modules/genai/commons/) to set up a `MetadataCollection` for filtering with its first key-value pair added immediately.
-2. Use [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) as many times as needed to create a collection of the necessary metadata.
+2. Use [Add Metadata to MetadataCollection](/appstore/modules/genai/commons/) (iteratively) to create a collection of the necessary metadata.
 3. Do the retrieval. For example, you could use [Retrieve Nearest Neighbors & Associate](#retrieve-nearest-neighbors-associate) to find chunks based on vector similarity.
 4. For each retrieved chunk, retrieve the original Mendix object and do custom logic.
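The populate-then-retrieve pattern in this file can be condensed into a Python sketch. `fake_embed` is a deterministic stand-in for a real embeddings endpoint, and the list-based knowledge base is an assumption; in the module, these steps are microflow operations against a PgVector database.

```python
def fake_embed(text: str) -> list[float]:
    # Toy deterministic "embedding"; a real app calls an embeddings endpoint.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]

def populate(knowledge_base: list, items: list[tuple[str, dict]]) -> None:
    # Steps 1-4 of populating: build chunks with metadata, embed, then store.
    for text, metadata in items:
        knowledge_base.append({
            "input_text": text,
            "metadata": metadata,
            "vector": fake_embed(text),
        })

def retrieve(knowledge_base: list, query: str, filters: dict, k: int = 1) -> list:
    # Retrieval pattern: embed the query, filter on metadata (exact match),
    # then rank the remaining chunks by vector distance.
    query_vector = fake_embed(query)
    candidates = [
        c for c in knowledge_base
        if all(c["metadata"].get(key) == value for key, value in filters.items())
    ]
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(query_vector, c["vector"]))
    return sorted(candidates, key=sq_dist)[:k]
```

In the associate variants, each retrieved chunk would additionally carry a reference back to the original Mendix object for use in follow-up logic.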

content/en/docs/appstore/platform-supported-content/modules/genai/pg-vector-knowledge-base/vector-database-setup.md

Lines changed: 5 additions & 1 deletion
@@ -18,6 +18,8 @@ This procedure describes a setup based on a PostgreSQL database with the pgvecto
 
 ## Managing a PostgreSQL Database with Amazon RDS {#aws-database}
 
+A PostgreSQL database in Amazon RDS includes the required `pgvector` extension pre-installed. When you connect using the [PgVector Knowledge Base](https://marketplace.mendix.com/link/component/225063) module, this extension activates automatically, allowing the database to function as a vector database for knowledge bases.
+
 ### Creating a PostgreSQL Database with Amazon RDS {#aws-database-create}
 
 {{% alert color="info" %}}
@@ -78,6 +80,8 @@ If no action is taken, resources in AWS will stay around indefinitely. Make sure
 
 ## Managing a PostgreSQL Database with Microsoft Azure {#azure-database}
 
+A PostgreSQL database in Microsoft Azure includes the required `pgvector` extension (called *vector*) pre-installed. The sections below describe how to enable its use. When you connect using the [PgVector Knowledge Base](https://marketplace.mendix.com/link/component/225063) module, this extension activates automatically, allowing the database to function as a vector database for knowledge bases.
+
 ### Creating a PostgreSQL Database with Microsoft Azure {#azure-database-create}
 
 {{% alert color="info" %}}
@@ -159,7 +163,7 @@ If no action is taken, resources on Azure will stay around indefinitely. Make su
 
 2. Use the master username and master password that you set in the **Settings** when you [created the PostgreSQL Database with Amazon RDS](#aws-database-create) or for the admin user in the [Azure Portal](#azure-database-create) as your username and password.
 
-3. Save and test the configuration.
+3. Save and test the configuration. This activates the `pgvector` extension, and the vector database is ready to be used.
 
 ## Setup Alternatives {#setup-alternatives}
