time: 35 minutes
---

## Introduction
Retrieval-Augmented Generation (RAG) is a powerful framework that enhances large language models (LLMs) by integrating information retrieval from external knowledge sources. This guide focuses on a specialized RAG implementation using graph databases like Neo4j, which excel in managing highly connected, relational data. Unlike traditional RAG setups with vector databases, combining RAG with graph databases offers better context-awareness and relationship-driven insights.

In this guide, you will:
* Configure a GenAI stack with Docker, incorporating Neo4j and an AI model.
* Analyze a real-world case study that highlights the effectiveness of this approach for handling specialized queries.
## Understanding RAG

RAG is a hybrid framework that enhances the capabilities of large language models by integrating information retrieval. It combines three core components:
- **Information retrieval** from an external knowledge base
- **Large Language Model (LLM)** for generating responses
- **Vector embeddings** to enable semantic search

In a RAG system, vector embeddings are used to represent the semantic meaning of text in a way that a machine can understand and process. For instance, the words "dog" and "puppy" will have similar embeddings because they share similar meanings. By integrating these embeddings into the RAG framework, the system can combine the generative power of large language models with the ability to pull in highly relevant, contextually aware data from external sources.
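
To make the idea concrete, here is a minimal sketch of how embedding similarity behaves. It assumes the `sentence-transformers` package (not part of the GenAI stack itself), and the model name is only an illustrative choice:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# Illustrative model choice; any sentence-embedding model behaves similarly.
model = SentenceTransformer("all-MiniLM-L6-v2")

# Encode related and unrelated words into vectors.
dog, puppy, car = model.encode(["dog", "puppy", "car"])

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine similarity: closer to 1.0 means more similar meaning."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine(dog, puppy))  # relatively high: similar meaning
print(cosine(dog, car))    # noticeably lower
```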
The system operates as follows (see the code sketch after this list):

1. Questions get turned into mathematical patterns that capture their meaning.
2. These patterns help find matching information in a database.
3. The LLM generates responses that blend the model's inherent knowledge with this extra information.
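
The toy sketch below walks through these steps end to end. It is not the GenAI stack's actual code: the "database" is a plain in-memory list, and `call_llm` is a hypothetical stand-in for the Ollama-served model used later in this guide:

```python
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")

# Stand-in for a real vector store: two documents and their embeddings.
documents = [
    "Apache NiFi is a tool for automating data flows between systems.",
    "Neo4j is a graph database that stores nodes and relationships.",
]
doc_vectors = model.encode(documents)

def call_llm(prompt: str) -> str:
    # Hypothetical placeholder; the GenAI stack would call its Ollama model here.
    return f"<answer conditioned on a {len(prompt)}-character prompt>"

def answer(question: str, k: int = 1) -> str:
    # Step 1: turn the question into a vector.
    q_vec = model.encode([question])[0]
    # Step 2: rank stored documents by cosine similarity to the question.
    scores = doc_vectors @ q_vec / (
        np.linalg.norm(doc_vectors, axis=1) * np.linalg.norm(q_vec)
    )
    context = "\n".join(documents[i] for i in np.argsort(scores)[::-1][:k])
    # Step 3: add the retrieved context to the prompt and let the LLM generate.
    return call_llm(f"Context:\n{context}\n\nQuestion: {question}")

print(answer("What is Apache NiFi?"))
```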
To store this vector information efficiently, we need a special type of database.
## Introduction to Graph databases
Graph databases, such as Neo4j, are specifically designed for managing highly connected data. Unlike traditional relational databases, graph databases prioritize both the entities and the relationships between them, making them ideal for tasks where connections are as important as the data itself.
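
As a minimal sketch of what this looks like from code, the snippet below uses the official `neo4j` Python driver to create two nodes joined by a relationship and then traverse it. The connection details mirror the `.env` defaults used later in this guide, and the `Person`/`Question` schema is an illustrative assumption, not the stack's actual data model:

```python
from neo4j import GraphDatabase

# Assumed connection details; they mirror the .env defaults in this guide.
driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Relationships are first-class: (:Person)-[:ANSWERED]->(:Question).
    session.run(
        "MERGE (p:Person {name: $name}) "
        "MERGE (q:Question {title: $title}) "
        "MERGE (p)-[:ANSWERED]->(q)",
        name="alice",
        title="What is Apache NiFi?",
    )
    # Traverse the relationship directly instead of joining tables.
    for record in session.run(
        "MATCH (p:Person)-[:ANSWERED]->(q:Question) "
        "RETURN p.name AS name, q.title AS title"
    ):
        print(record["name"], "answered:", record["title"])

driver.close()
```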
```text
RAG: Disabled
I'm happy to help! Unfortunately, I'm a large language model, I don't have access to real-time information or events that occurred after my training data cutoff in 2024. Therefore, I cannot provide you with any important events that happened in 2024. My apologize for any inconvenience this may cause. Is there anything else I can help you with?
```
## Setting up GenAI stack with GPU acceleration on Linux
To set up and run the GenAI stack on a Linux host, execute one of the following sets of commands, depending on whether your host is GPU or CPU powered:
```bash
nano .env
```
In the `.env` file, make sure the following lines are uncommented, and set your own credentials for security:
```txt
NEO4J_URI=neo4j://database:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password
OLLAMA_BASE_URL=http://llm-gpu:11434
```
### CPU powered
```bash
nano .env
```
In the `.env` file, make sure the following lines are uncommented, and set your own credentials for security:
```txt
NEO4J_URI=neo4j://database:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password
OLLAMA_BASE_URL=http://llm:11434
```
### Setting up on other platforms

For instructions on how to set up the stack on other platforms, refer to [this page](https://github.com/docker/genai-stack).
### Initial startup
```bash
docker compose logs
```
Wait for specific lines in the logs indicating that the download is complete and the stack is ready. These lines typically confirm successful setup and initialization.
```text
pull-model-1 exited with code 0
database-1 | 2024-12-29 09:35:53.269+0000 INFO Started.
pdf_bot-1 | You can now view your Streamlit app in your browser.
loader-1 | You can now view your Streamlit app in your browser.
bot-1 | You can now view your Streamlit app in your browser.
```
When those lines appear in the logs, the web apps are ready to use. You can now access the interface at [http://localhost:8501/](http://localhost:8501/) to ask questions.

Since our goal is to teach the AI about things it does not yet know, we begin by asking it a simple question about NiFi at [http://localhost:8501/](http://localhost:8501/).

![Ask question](./images/ask-question.webp?border=true)
```text
Question: What is Apache Nifi?
RAG: Disabled
Hello! I'm here to help you with your question about Apache NiFi. Unfortunately, I don't know the answer to that question. I'm just an AI and my knowledge cutoff is December 2022, so I may not be familiar with the latest technologies or software. Can you please provide more context or details about Apache NiFi? Maybe there's something I can help you with related to it.
```
As we can see, the AI knows nothing about this subject, because it did not exist at the time of the model's training, also known as the knowledge cutoff point.

Now it's time to teach the AI some new tricks. First, connect to [http://localhost:8502/](http://localhost:8502/). Instead of using the "neo4j" tag, change it to the "apache-nifi" tag, then select the **Import** button.

![Importing data](./images/import-data.webp?border=true)

After the import is successful, we can access Neo4j to verify the data.

After logging in to [http://localhost:7474/](http://localhost:7474/) using the credentials from the `.env` file, you can run queries against Neo4j. Using the Cypher query language, you can inspect the data stored in the database.

To count the data, run the following query:
```text
MATCH (n)
RETURN DISTINCT labels(n) AS NodeTypes, count(*) AS Count
ORDER BY Count DESC;
```

Results will appear below. What we are seeing here is the information the system downloaded.
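
If you would rather script this check than use the Neo4j browser, here is a minimal sketch using the official `neo4j` Python driver, assuming the default credentials from the `.env` file:

```python
from neo4j import GraphDatabase

# Assumed connection details from the .env defaults in this guide.
driver = GraphDatabase.driver("neo4j://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # Same count query as above, run programmatically.
    result = session.run(
        "MATCH (n) "
        "RETURN DISTINCT labels(n) AS NodeTypes, count(*) AS Count "
        "ORDER BY Count DESC"
    )
    for record in result:
        print(record["NodeTypes"], record["Count"])

driver.close()
```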
You can also run the following query to visualize the data:
```text
CALL db.schema.visualization()
```
To check the relationships in the database, run the following query:
```text
CALL db.relationshipTypes()
```
Now, we are ready to enable our LLM to use this information. Go back to [http://localhost:8501/](http://localhost:8501/), enable the **RAG** checkbox, and ask the same question again. The LLM will now provide a more detailed answer.

![RAG disabled](./images/rag-disabled.webp?border=true)

The system now delivers comprehensive, accurate information by pulling from the freshly imported Stack Overflow data.
```text
Question: What is Apache Nifi?
RAG: Enabled
Answer:
```
Feel free to start over with another [Stack Overflow tag](https://stackoverflow.com/tags). To drop all data in Neo4j, you can use the following command in the Neo4j Web UI:
```text
MATCH (n)
DETACH DELETE n;
```
For optimal results, choose a tag that the LLM is not familiar with.
### When to leverage RAG for optimal results
Retrieval-Augmented Generation (RAG) is particularly effective in scenarios where standard Large Language Models (LLMs) fall short. The three key areas where RAG excels are knowledge limitations, business requirements, and cost efficiency. Below, we explore these aspects in more detail.