
Commit ef7af62

mfranzon authored and dvdksn committed
add rag-ollama application example
Signed-off-by: David Karlsson <[email protected]>
1 parent d271b4b commit ef7af62

File tree: 4 files changed, +286 −0 lines


content/guides/use-case/_index.md

Lines changed: 4 additions & 0 deletions
@@ -38,6 +38,10 @@ grid_genai:
    description: Explore an app that can summarize text.
    link: /guides/use-case/nlp/text-summarization/
    icon: summarize
  - title: RAG Ollama application
    description: Explore how to containerize a RAG application.
    link: /guides/use-case/rag-ollama/
    icon: article
---

Explore this collection of use-case guides designed to help you leverage Docker
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
---
description: Containerize a RAG application using Ollama and Docker
keywords: python, generative ai, genai, llm, ollama, rag, qdrant
title: Build a RAG application using Ollama and Docker
linkTitle: RAG Ollama application
toc_min: 1
toc_max: 2
---

The Retrieval Augmented Generation (RAG) guide teaches you how to containerize an existing RAG application using Docker. The example application is a RAG that acts as a sommelier, suggesting the best pairings between wines and food. In this guide, you'll learn how to:

* Containerize and run a RAG application
* Set up a local environment to run the complete RAG stack locally for development

Start by containerizing an existing RAG application.

{{< button text="Containerize a RAG app" url="containerize.md" >}}
Lines changed: 107 additions & 0 deletions
@@ -0,0 +1,107 @@
---
title: Containerize a RAG application
linkTitle: Containerize your app
weight: 10
keywords: python, generative ai, genai, llm, ollama, containerize, initialize, qdrant
description: Learn how to containerize a RAG application.
---

## Overview

This section walks you through containerizing a RAG application using Docker.

> [!NOTE]
> You can see more samples of containerized GenAI applications in the [GenAI Stack](https://github.com/docker/genai-stack) demo applications.

## Get the sample application

The sample application used in this guide is an example of a RAG application, made of three main components that are the building blocks for every RAG application: a Large Language Model, in this case hosted in a container and served via [Ollama](https://ollama.ai/); a vector database, [Qdrant](https://qdrant.tech/), which stores the embeddings of your local data; and a web application, built with [Streamlit](https://streamlit.io/), that provides the user interface.
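
To make that architecture concrete, the following is a schematic sketch of a typical RAG flow across these three components. It's illustrative only, not the sample app's actual code (that lives in `app/main.py`): the `wines` collection name and the `text` payload key are hypothetical, and the service URLs match the ones used in the Compose file later in this guide.

```python
# Schematic RAG flow (illustrative; the sample app's code in app/main.py may differ).
# Assumes the qdrant-client and requests packages are installed.
import requests
from qdrant_client import QdrantClient

qdrant = QdrantClient(url="http://qdrant:6333")

def embed(text: str) -> list[float]:
    # One way to embed text: Ollama's embeddings endpoint.
    r = requests.post("http://ollama:11434/api/embeddings",
                      json={"model": "llama2", "prompt": text})
    return r.json()["embedding"]

def answer(question: str) -> str:
    # 1. Retrieve: find documents whose embeddings are close to the question.
    hits = qdrant.search(collection_name="wines",  # hypothetical collection name
                         query_vector=embed(question), limit=3)
    context = "\n".join(hit.payload["text"] for hit in hits)  # hypothetical payload key
    # 2. Augment and generate: pass the retrieved context and the question to the LLM.
    r = requests.post("http://ollama:11434/api/generate",
                      json={"model": "llama2", "stream": False,
                            "prompt": f"Context:\n{context}\n\nQuestion: {question}"})
    return r.json()["response"]
```
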

Clone the sample application. Open a terminal, change directory to a directory that you want to work in, and run the following command to clone the repository:

```console
$ git clone https://github.com/mfranzon/winy.git
```

You should now have the following files in your `winy` directory.

```text
├── winy/
│   ├── .gitignore
│   ├── app/
│   │   ├── main.py
│   │   ├── Dockerfile
│   │   └── requirements.txt
│   ├── tools/
│   │   ├── create_db.py
│   │   ├── create_embeddings.py
│   │   ├── requirements.txt
│   │   ├── test.py
│   │   └── download_model.sh
│   ├── docker-compose.yaml
│   ├── wine_database.db
│   ├── LICENSE
│   └── README.md
```

## Containerizing your application: Essentials

Containerizing an application involves packaging it along with its dependencies into a container, which ensures consistency across different environments. Here's what you need to containerize an app like Winy:

1. Dockerfile: A Dockerfile contains instructions on how to build a Docker image for your application. It specifies the base image, dependencies, configuration files, and the command to run your application. A sketch of what this can look like for Winy follows this list.

2. Docker Compose file: Docker Compose is a tool for defining and running multi-container Docker applications. A Compose file lets you configure your application's services, networks, and volumes in a single file.
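
As a sketch of the first item, a Dockerfile for a Streamlit app like Winy could look like the following. This is an assumption for illustration; the actual `app/Dockerfile` in the repository may differ. Note that the Compose file builds this image with `./app` as the build context, so paths are relative to that folder.

```dockerfile
# Illustrative Dockerfile sketch for a Streamlit app; the repository's
# app/Dockerfile may differ.
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first to take advantage of Docker's layer caching.
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the application code.
COPY . .

# Streamlit serves on port 8501 by default.
EXPOSE 8501

CMD ["streamlit", "run", "main.py", "--server.address=0.0.0.0"]
```
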
## Run the application

Inside the `winy` directory, run the following command in a terminal.

```console
$ docker compose up --build
```

Docker builds and runs your application. Depending on your network connection, it may take several minutes to download all the dependencies. You'll see a message like the following in the terminal when the application is running.

```console
server-1  | You can now view your Streamlit app in your browser.
server-1  |
server-1  | URL: http://0.0.0.0:8501
server-1  |
```

Open a browser and view the application at [http://localhost:8501](http://localhost:8501). You should see a simple Streamlit application.

The application requires a Qdrant database service and an LLM service to work properly. If you have access to these services running outside of Docker, specify their connection information in the `docker-compose.yaml`.

```yaml
winy:
  build:
    context: ./app
    dockerfile: Dockerfile
  environment:
    - QDRANT_CLIENT=http://qdrant:6333 # Specifies the URL of the Qdrant database
    - OLLAMA=http://ollama:11434 # Specifies the URL of the Ollama service
  container_name: winy
  ports:
    - "8501:8501"
  depends_on:
    - qdrant
    - ollama
```
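
For example, if Qdrant and Ollama already run elsewhere, you could point the app at them and drop the corresponding `depends_on` entries. The hostnames below are placeholders for illustration, not real endpoints:

```yaml
environment:
  - QDRANT_CLIENT=http://your-qdrant-host:6333 # placeholder: your existing Qdrant endpoint
  - OLLAMA=http://your-ollama-host:11434 # placeholder: your existing Ollama endpoint
```
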

If you don't have the services running, continue with this guide to learn how you can run some or all of these services with Docker. Note that the `ollama` service starts out empty; it doesn't contain any model, so you need to pull a model before you start using the RAG application. Instructions for this are on the following page.

In the terminal, press `ctrl`+`c` to stop the application.

## Summary

In this section, you learned how you can containerize and run your RAG application using Docker.

## Next steps

In the next section, you'll learn how to configure the application with your preferred LLM model, running completely locally, using Docker.

{{< button text="Develop your application" url="develop.md" >}}
Lines changed: 158 additions & 0 deletions
@@ -0,0 +1,158 @@
---
title: Use containers for RAG development
linkTitle: Develop your app
weight: 10
keywords: python, local, development, generative ai, genai, llm, rag, ollama
description: Learn how to develop your generative RAG application locally.
---

## Prerequisites

Complete [Containerize a RAG application](containerize.md).

## Overview

In this section, you'll learn how to set up a development environment to access all the services that your generative RAG application needs. This includes:

- Adding a local database
- Adding a local or remote LLM service

> [!NOTE]
> You can see more samples of containerized GenAI applications in the [GenAI Stack](https://github.com/docker/genai-stack) demo applications.

## Add a local database

You can use containers to set up local services, like a database. In this section, you'll explore the database service in the `docker-compose.yaml` file.

To run the database service:

1. In the cloned repository's directory, open the `docker-compose.yaml` file in an IDE or text editor.

2. In the `docker-compose.yaml` file, you'll see the following:

   ```yaml
   services:
     qdrant:
       image: qdrant/qdrant
       container_name: qdrant
       ports:
         - "6333:6333"
       volumes:
         - qdrant_data:/qdrant/storage
   ```

   > [!NOTE]
   > To learn more about Qdrant, see the [Qdrant Official Docker Image](https://hub.docker.com/r/qdrant/qdrant).

3. Start the application. Inside the `winy` directory, run the following command in a terminal.

   ```console
   $ docker compose up --build
   ```

4. Access the application. Open a browser and view the application at [http://localhost:8501](http://localhost:8501). You should see a simple Streamlit application.

5. Stop the application. In the terminal, press `ctrl`+`c` to stop the application.
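
While the application from step 3 is running, you can optionally verify that the database is reachable from your host. The following is a minimal sketch, assuming you have Python and the `qdrant-client` package installed locally; it isn't part of the sample app:

```python
# Minimal Qdrant connectivity check (illustrative; not part of the sample app).
# Requires: pip install qdrant-client
from qdrant_client import QdrantClient

# The Compose file publishes Qdrant on port 6333 of the host.
client = QdrantClient(url="http://localhost:6333")

# Lists the collections stored in the qdrant_data volume.
print(client.get_collections())
```
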

## Add a local or remote LLM service

The sample application uses [Ollama](https://ollama.ai/) as its LLM service. This guide provides instructions for the following scenarios:

- Run Ollama in a container
- Run Ollama outside of a container

While all platforms can use either of these scenarios, the performance and GPU support may vary. You can use the following guidelines to help you choose the appropriate option:

- Run Ollama in a container if you're on Linux with a native Docker Engine installation, or on Windows 10/11 with Docker Desktop, and you have a CUDA-supported GPU and at least 8 GB of RAM.
- Run Ollama outside of a container if you're running Docker Desktop on a Linux machine.

Choose one of the following options for your LLM service.

{{< tabs >}}
{{< tab name="Run Ollama in a container" >}}

When running Ollama in a container, you should have a CUDA-supported GPU. While you can run Ollama in a container without a supported GPU, the performance may not be acceptable. Only Linux and Windows 11 support GPU access to containers.

To run Ollama in a container and provide GPU access:

1. Install the prerequisites.

   - For Docker Engine on Linux, install the [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-container-toolkit).
   - For Docker Desktop on Windows 10/11, install the latest [NVIDIA driver](https://www.nvidia.com/Download/index.aspx) and make sure you are using the [WSL2 backend](/manuals/desktop/wsl/_index.md#turn-on-docker-desktop-wsl-2).

2. The `docker-compose.yaml` file already contains the necessary instructions. In your own apps, you'll need to add the Ollama service to your `docker-compose.yaml`. The following is the updated `docker-compose.yaml`:

   ```yaml
   ollama:
     image: ollama/ollama
     container_name: ollama
     ports:
       - "11434:11434" # Ollama's API listens on port 11434
     deploy:
       resources:
         reservations:
           devices:
             - driver: nvidia
               count: 1
               capabilities: [gpu]
   ```

   > [!NOTE]
   > For more details about the Compose instructions, see [Turn on GPU access with Docker Compose](/manuals/compose/gpu-support.md).

3. Once the Ollama container is up and running, you can pull a model using the `download_model.sh` script inside the `tools` folder with this command:

   ```console
   . ./download_model.sh <model-name>
   ```

Pulling an Ollama model can take several minutes.
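
Alternatively, because the Compose file names the container `ollama`, you can skip the script and run the Ollama CLI directly inside the container. For example, to pull the `llama2` model used elsewhere in this guide:

```console
$ docker exec -it ollama ollama pull llama2
```
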
{{< /tab >}}

{{< tab name="Run Ollama outside of a container" >}}

To run Ollama outside of a container:

1. [Install](https://github.com/jmorganca/ollama) and run Ollama on your host machine.

2. Pull the model to Ollama using the following command.

   ```console
   $ ollama pull llama2
   ```

3. Remove the `ollama` service from the `docker-compose.yaml` and update the connection variables in the `winy` service accordingly:

   ```diff
   - OLLAMA=http://ollama:11434
   + OLLAMA=<your-url>
   ```
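
For example, with Docker Desktop, an Ollama instance running on the host is typically reachable from containers at the special DNS name `host.docker.internal`:

```diff
- OLLAMA=http://ollama:11434
+ OLLAMA=http://host.docker.internal:11434
```
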
{{< /tab >}}

{{< /tabs >}}

## Run your RAG application

At this point, you have the following services in your Compose file:

- Server service for your main RAG application
- Database service to store vectors in a Qdrant database
- (optional) Ollama service to run the LLM
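
To start the complete stack, run the same command as in the previous sections from the `winy` directory:

```console
$ docker compose up --build
```
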
Once the application is running, open a browser and access the application at [http://localhost:8501](http://localhost:8501).

Depending on your system and the LLM service that you chose, it may take several minutes to answer.

## Summary

In this section, you learned how to set up a development environment to provide access to all the services that your GenAI application needs.

Related information:

- [Dockerfile reference](/reference/dockerfile.md)
- [Compose file reference](/reference/compose-file/_index.md)
- [Ollama Docker image](https://hub.docker.com/r/ollama/ollama)
- [GenAI Stack demo applications](https://github.com/docker/genai-stack)

## Next steps

See samples of more GenAI applications in the [GenAI Stack demo applications](https://github.com/docker/genai-stack).
