2 changes: 1 addition & 1 deletion assets/css/syntax-dark.css
@@ -14,7 +14,7 @@
/* LineHighlight */
.hl {
min-width: fit-content;
background-color: var(--color-gray-800);
background-color: var(--color-gray-700);
}
.lntd:first-child .hl,
& > .chroma > code > .hl {
61 changes: 28 additions & 33 deletions content/guides/genai-leveraging-rag/index.md
@@ -10,8 +10,6 @@
time: 35 minutes
---



## Introduction

Retrieval-Augmented Generation (RAG) is a powerful framework that enhances large language models (LLMs) by integrating information retrieval from external knowledge sources. This guide focuses on a specialized RAG implementation using graph databases like Neo4j, which excel at managing highly connected, relational data. Compared with traditional RAG setups that rely on vector databases alone, combining RAG with graph databases offers better context awareness and relationship-driven insights.
@@ -22,25 +20,24 @@
* Configure a GenAI stack with Docker, incorporating Neo4j and an AI model.
* Analyze a real-world case study that highlights the effectiveness of this approach for handling specialized queries.

## Understanding RAG

RAG is a hybrid framework that enhances the capabilities of large language models by integrating information retrieval. It combines three core components:

- **Information retrieval** from an external knowledge base
- **Large Language Model (LLM)** for generating responses
- **Vector embeddings** to enable semantic search

In a RAG system, vector embeddings represent the semantic meaning of text in a form that a machine can process. For instance, the words "dog" and "puppy" have similar embeddings because they share similar meanings. By integrating these embeddings into the RAG framework, the system can combine the generative power of large language models with the ability to pull in highly relevant, contextually aware data from external sources.
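
As a rough illustration of what "similar embeddings" means, the following sketch compares hand-picked vectors with cosine similarity. The three-dimensional vectors and the `cosine_similarity` helper are assumptions made for this example; real embedding models produce vectors with hundreds or thousands of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical 3-dimensional embeddings, hand-picked for illustration only.
EMBEDDINGS = {
    "dog": [0.9, 0.8, 0.1],
    "puppy": [0.85, 0.75, 0.2],
    "invoice": [0.1, 0.2, 0.95],
}

dog_puppy = cosine_similarity(EMBEDDINGS["dog"], EMBEDDINGS["puppy"])
dog_invoice = cosine_similarity(EMBEDDINGS["dog"], EMBEDDINGS["invoice"])
print(f"dog vs puppy:   {dog_puppy:.3f}")
print(f"dog vs invoice: {dog_invoice:.3f}")
```

With these made-up values, the related words score close to 1 and the unrelated pair scores much lower, which is the property semantic search relies on.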

The system operates as follows:
1. Questions get turned into mathematical patterns that capture their meaning
2. These patterns help find matching information in a database
3. The retrieved information is added to the original question before it is passed to the LLM
4. The LLM generates responses that blend the model's inherent knowledge with this extra information.
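
The steps above can be sketched in a few lines of Python. This is a minimal sketch, not how the GenAI stack implements retrieval: the `embed` function is a toy bag-of-words stand-in for a real embedding model, `DOCUMENTS` is a hypothetical in-memory knowledge base, and the final LLM call is left out.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; a real stack would read from a database.
DOCUMENTS = [
    "Apache NiFi automates the flow of data between systems.",
    "Neo4j stores data as nodes and relationships.",
]

def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def similarity(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def retrieve(question, documents):
    """Return the document whose vector is closest to the question's."""
    query = embed(question)
    return max(documents, key=lambda doc: similarity(query, embed(doc)))

def build_prompt(question, context):
    """Add the retrieved text to the question before it reaches the LLM."""
    return f"Context: {context}\n\nQuestion: {question}"

question = "What is Apache NiFi?"
prompt = build_prompt(question, retrieve(question, DOCUMENTS))
print(prompt)
```

The resulting `prompt` string is what would be sent to the model, so the model answers with the retrieved context in front of it.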

Storing this vector information efficiently requires a specialized type of database.

## Introduction to Graph databases

Graph databases, such as Neo4j, are specifically designed for managing highly connected data. Unlike traditional relational databases, graph databases prioritize both the entities and the relationships between them, making them ideal for tasks where connections are as important as the data itself.
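
To make the idea of entities plus relationships concrete, here is a small in-memory sketch. The node IDs, labels, and relationship types are hypothetical and only loosely modeled on a Q&A site; Neo4j stores and queries this kind of structure natively, so this is not how the database works internally.

```python
# Hypothetical nodes and relationships, loosely modeled on a Q&A site.
NODES = {
    "q1": {"label": "Question", "title": "What is Apache NiFi?"},
    "a1": {"label": "Answer", "body": "A tool for automating data flows."},
    "t1": {"label": "Tag", "name": "apache-nifi"},
}
RELATIONSHIPS = [
    ("a1", "ANSWERS", "q1"),
    ("q1", "TAGGED", "t1"),
]

def neighbors(node_id):
    """Return (relationship type, other node ID) pairs touching node_id."""
    found = []
    for source, rel_type, target in RELATIONSHIPS:
        if source == node_id:
            found.append((rel_type, target))
        elif target == node_id:
            found.append((rel_type, source))
    return found

for rel_type, other in neighbors("q1"):
    print(rel_type, NODES[other]["label"])
```

Starting from one question, the traversal reaches both its answer and its tag, which is the kind of connected context a graph-backed RAG setup can hand to the LLM.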

@@ -65,7 +62,7 @@
I'm happy to help! Unfortunately, I'm a large language model, I don't have access to real-time information or events that occurred after my training data cutoff in 2024. Therefore, I cannot provide you with any important events that happened in 2024. My apologize for any inconvenience this may cause. Is there anything else I can help you with?
```

## Setting up GenAI stack with GPU acceleration on Linux

To set up and run the GenAI stack on a Linux host, run the commands for either a GPU-powered or a CPU-powered setup:

@@ -79,10 +76,12 @@
```
In the `.env` file, make sure the following lines are uncommented, and set your own credentials for security:

```txt
NEO4J_URI=neo4j://database:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password
OLLAMA_BASE_URL=http://llm-gpu:11434
```

### CPU powered

@@ -94,15 +93,16 @@
```
In the `.env` file, make sure the following lines are uncommented, and set your own credentials for security:

```txt
NEO4J_URI=neo4j://database:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password
OLLAMA_BASE_URL=http://llm:11434
```
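
Before starting the stack, it can help to confirm which of these settings are active in the `.env` file. The following is a minimal sketch, assuming the file uses plain `KEY=VALUE` lines with `#` for comments; `active_settings` and `EXPECTED_KEYS` are hypothetical names for this example and are not part of the GenAI stack.

```python
# Keys shown in the snippets above.
EXPECTED_KEYS = {"NEO4J_URI", "NEO4J_USERNAME", "NEO4J_PASSWORD", "OLLAMA_BASE_URL"}

def active_settings(env_text):
    """Parse KEY=VALUE lines, skipping blanks and lines commented out with '#'."""
    settings = {}
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        settings[key.strip()] = value.strip()
    return settings

# Sample content; in practice, read the text from the .env file instead.
sample = """\
NEO4J_URI=neo4j://database:7687
NEO4J_USERNAME=neo4j
NEO4J_PASSWORD=password
#OLLAMA_BASE_URL=http://llm:11434
"""

settings = active_settings(sample)
inactive = sorted(EXPECTED_KEYS - settings.keys())
print("not active:", inactive)
if settings.get("NEO4J_PASSWORD") == "password":
    print("warning: NEO4J_PASSWORD still has the example value")
```

To check a real file, replace `sample` with the contents of `.env`, for example `open(".env").read()`.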

### Setting up on other platforms

For instructions on how to set up the stack on other platforms, refer to [this page](https://github.com/docker/genai-stack).

### Initial startup

@@ -118,42 +118,41 @@

Wait for specific lines in the logs indicating that the download is complete and the stack is ready. These lines typically confirm successful setup and initialization.

```text
pull-model-1 exited with code 0
database-1 | 2024-12-29 09:35:53.269+0000 INFO Started.
pdf_bot-1 | You can now view your Streamlit app in your browser.
loader-1 | You can now view your Streamlit app in your browser.
bot-1 | You can now view your Streamlit app in your browser.
```
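
If you capture the logs to a file, a short script can report which services have not printed their readiness line yet. This is a sketch under assumptions: the marker strings are copied from the sample output above and may differ between versions, and `pending_services` is a hypothetical helper, not part of the stack.

```python
# Readiness markers taken from the sample log output; treat them as assumptions.
READY_MARKERS = {
    "pull-model-1": "exited with code 0",
    "database-1": "Started.",
    "pdf_bot-1": "You can now view your Streamlit app",
    "loader-1": "You can now view your Streamlit app",
    "bot-1": "You can now view your Streamlit app",
}

def pending_services(log_text):
    """Return services whose readiness line has not appeared in the logs yet."""
    ready = set()
    for line in log_text.splitlines():
        for service, marker in READY_MARKERS.items():
            if line.startswith(service) and marker in line:
                ready.add(service)
    return sorted(set(READY_MARKERS) - ready)

partial_logs = """\
pull-model-1 exited with code 0
database-1 | 2024-12-29 09:35:53.269+0000 INFO Started.
pdf_bot-1 | You can now view your Streamlit app in your browser.
"""

print("still waiting for:", pending_services(partial_logs))
```

An empty list means every expected line has appeared.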


When these lines appear in the logs, the web apps are ready to use, and you can access the interface to ask questions.

Since the goal is to teach the AI about things it does not yet know, begin by asking it a simple question about Apache NiFi at [http://localhost:8501/](http://localhost:8501/).
![alt text](image.png)

```text
Question: What is Apache Nifi?
RAG: Disabled
Hello! I'm here to help you with your question about Apache NiFi. Unfortunately, I don't know the answer to that question. I'm just an AI and my knowledge cutoff is December 2022, so I may not be familiar with the latest technologies or software. Can you please provide more context or details about Apache NiFi? Maybe there's something I can help you with related to it.
```

As the response shows, the model does not know anything about this subject because the information was not part of its training data, which ends at the knowledge cutoff.

Now it's time to teach the AI some new tricks. First, connect to [http://localhost:8502/](http://localhost:8502/). Instead of using the "neo4j" tag, change it to the "apache-nifi" tag, then select the **Import** button.

![alt text](image-1.png)


After the import succeeds, you can access Neo4j to verify the data.

After logging in to [http://localhost:7474/](http://localhost:7474/) using the credentials from the `.env` file, you can run queries on Neo4j. Using the Neo4j Cypher query language, you can check for the data stored in the database.

To count the nodes by label, run the following query:

```text
MATCH (n)
RETURN DISTINCT labels(n) AS NodeTypes, count(*) AS Count
ORDER BY Count DESC;
@@ -167,25 +166,23 @@

You can also run the following query to visualize the data:

```text
CALL db.schema.visualization()
```

To check the relationships in the database, run the following query:

```text
CALL db.relationshipTypes()
```
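
For readers new to Cypher, the counting query shown earlier in this section groups nodes by their label list and sorts the groups by size. The following sketch mirrors that grouping in plain Python; the `nodes` list is hypothetical stand-in data, not output from the database.

```python
from collections import Counter

# Hypothetical nodes, each with a list of labels, standing in for database contents.
nodes = [
    {"labels": ["Question"]},
    {"labels": ["Answer"]},
    {"labels": ["Answer"]},
    {"labels": ["Tag"]},
    {"labels": ["Answer"]},
    {"labels": ["Question"]},
]

# Group by the full label list (as labels(n) does) and count each group.
counts = Counter(tuple(node["labels"]) for node in nodes)

# most_common() sorts by count, highest first, like ORDER BY Count DESC.
for labels, count in counts.most_common():
    print(list(labels), count)
```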



The LLM is now ready to use this information. Go back to [http://localhost:8501/](http://localhost:8501/), enable the **RAG** checkbox, and ask the same question again. The LLM now provides a more detailed answer.

![alt text](image-3.png)

With RAG enabled, the system gives a more complete answer by drawing on the imported Stack Overflow content.
```text
Question: What is Apache Nifi?
RAG: Enabled

Answer:
@@ -203,15 +200,13 @@

Feel free to start over with another [Stack Overflow tag](https://stackoverflow.com/tags). To drop all data in Neo4j, you can use the following command in the Neo4j Web UI:


```txt
MATCH (n)
DETACH DELETE n;
```

For optimal results, choose a tag that the LLM is not familiar with.


### When to leverage RAG for optimal results

Retrieval-Augmented Generation (RAG) is particularly effective in scenarios where standard Large Language Models (LLMs) fall short. The three key areas where RAG excels are knowledge limitations, business requirements, and cost efficiency. The following sections explore these aspects in more detail.
3 changes: 3 additions & 0 deletions hugo_stats.json
@@ -43,7 +43,10 @@
"Docker-subscription",
"Download",
"Entra-ID",
"Entra-ID-OIDC",
"Entra-ID-SAML-2.0",
"Entra-ID/Azure-AD-OIDC-and-SAML-2.0",
"Entra-ID/Azure-AD-SAML-2.0-and-OIDC",
"External-cloud-storage",
"Fedora",
"For-Mac-with-Apple-silicon",