
Commit dc60abb

Merge pull request #603 from dminnear-rh/update-rag-llm-docs
add information about elasticsearch to rag-llm docs
2 parents caf2c3d + 555ffef commit dc60abb

File tree

4 files changed (+62 −45 lines)


content/patterns/rag-llm-gitops/_index.md

Lines changed: 17 additions & 20 deletions
@@ -4,13 +4,14 @@ date: 2024-07-25
 tier: tested
 summary: The goal of this demo is to showcase a Chatbot LLM application augmented with data from Red Hat product documentation running on Red Hat OpenShift. It deploys an LLM application that connects to multiple LLM providers such as OpenAI, Hugging Face, and NVIDIA NIM. The application generates a project proposal for a Red Hat product.
 rh_products:
-- Red Hat OpenShift Container Platform
-- Red Hat OpenShift GitOps
+- Red Hat OpenShift Container Platform
+- Red Hat OpenShift GitOps
+- Red Hat OpenShift AI
 partners:
-- EDB
-- Elastic
+- EDB
+- Elastic
 industries:
-- General
+- General
 aliases: /ai/
 # uncomment once this exists
 # pattern_logo: retail.png
@@ -26,16 +27,15 @@ ci: ragllm
 
 ## Introduction
 
-This deployment is based on the _validated pattern framework_, using GitOps for
+This deployment is based on the _Validated Patterns framework_, using GitOps for
 seamless provisioning of all operators and applications. It deploys a Chatbot
 application that harnesses the power of Large Language Models (LLMs) combined
 with the Retrieval-Augmented Generation (RAG) framework.
 
 The pattern uses the [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to deploy and serve LLM models at scale.
 
-The application uses either the [EDB Postgres for Kubernetes operator](https://catalog.redhat.com/software/container-stacks/detail/5fb41c88abd2a6f7dbe1b37b)
-(default), or Redis, to store embeddings of Red Hat product documentation, running on Red Hat
-OpenShift Container Platform to generate project proposals for specific Red Hat products.
+The pattern provides several options for the RAG DB vector store including EDB Postgres (the default), Elasticsearch,
+Redis, and Microsoft SQL Server.
 
 ## Demo Description & Architecture
 
@@ -47,7 +47,7 @@ The application generates a project proposal for a Red Hat product.
 - Leveraging [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai) to deploy and serve LLM models powered by NVIDIA GPU accelerator.
 - LLM Application augmented with content from Red Hat product documentation.
 - Multiple LLM providers (OpenAI, Hugging Face, NVIDIA).
-- Vector Database, such as EDB Postgres for Kubernetes, or Redis, to store embeddings of Red Hat product documentation.
+- Vector Database, such as EDB Postgres, Elasticsearch, or Microsoft SQL Server to store embeddings of Red Hat product documentation.
 - Monitoring dashboard to provide key metrics such as ratings.
 - GitOps setup to deploy e2e demo (frontend / vector database / served models).
 
@@ -57,31 +57,29 @@ The application generates a project proposal for a Red Hat product.
 
 _Figure 3. Schematic diagram for workflow of RAG demo with Red Hat OpenShift._
 
-
 #### RAG Data Ingestion
 
 ![ingestion](https://gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/images/schematic-diagrams/rag-demo-vp-ingress-sd.png)
 
 _Figure 4. Schematic diagram for Ingestion of data for RAG._
 
-
 #### RAG Augmented Query
 
-
-![query](https://gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/images/schematic-diagrams/rag-demo-vp-query-sd.png)
+![query](/images/rag-llm-gitops/rag-augmented-query.png)
 
 _Figure 5. Schematic diagram for RAG demo augmented query._
 
-In Figure 5, we can see RAG augmented query. The Mistral-7B model is used for
+In Figure 5, we can see RAG augmented query. The `granite-3.3-8b-instruct` model is used for
 language processing. LangChain is used to integrate different tools of the LLM-based
 application together and to process the PDF files and web pages. A vector
-database provider such as EDB Postgres for Kubernetes (or Redis), is used to
-store vectors. HuggingFace TGI is used to serve the Mistral-7B model. Gradio is
+database provider such as EDB Postgres for Kubernetes (or Elasticsearch), is used to
+store vectors. vLLM is used to serve the `granite-3.3-8b-instruct` model. Gradio is
 used for user interface and object storage to store language model and other
 datasets. Solution components are deployed as microservices in the Red Hat
 OpenShift Container Platform cluster.
 
 #### Download diagrams
+
 View and download all of the diagrams above in our open source tooling site.
 
 [Open Diagrams](https://www.redhat.com/architect/portfolio/tool/index.html?#gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/diagrams/rag-demo-vp.drawio)
@@ -92,14 +90,13 @@ _Figure 6. Proposed demo architecture with OpenShift AI_
 
 ### Components deployed
 
-- **Hugging Face Text Generation Inference Server:** The pattern deploys a Hugging Face TGIS server. The server deploys `mistral-community/Mistral-7B-v0.2` model. The server will require a GPU node.
+- **vLLM Inference Server:** The pattern deploys a vLLM server. The server deploys `ibm-granite/granite-3.3-8b-instruct` model. The server will require a GPU node.
 - **EDB Postgres for Kubernetes / Redis Server:** A Vector Database server is deployed to store vector embeddings created from Red Hat product documentation.
 - **Populate VectorDb Job:** The job creates the embeddings and populates the vector database.
 - **LLM Application:** This is a Chatbot application that can generate a project proposal by augmenting the LLM with the Red Hat product documentation stored in vector db.
-- **Prometheus:** Deploys a prometheus instance to store the various metrics from the LLM application and TGIS server.
+- **Prometheus:** Deploys a Prometheus instance to store the various metrics from the LLM application and vLLM inference server.
 - **Grafana:** Deploys Grafana application to visualize the metrics.
 
-
 ![Overview](https://gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/images/intro-marketectures/rag-demo-vp-marketing-slide.png)
 
 _Figure 1. Overview of the validated pattern for RAG Demo with Red Hat OpenShift_
Lines changed: 16 additions & 7 deletions
@@ -1,32 +1,41 @@
 ---
-title: Deploying a different database
+title: Deploying a different database
 weight: 12
 aliases: /rag-llm-gitops/deploy-different-db/
 ---
 
 # Deploying a different database
 
-This pattern supports two types of vector databases, EDB Postgres for Kubernetes, and Redis. By default the pattern will deploy EDB Postgres for Kubernetes as a vector database. To deploy change the global.db.type parameter to the REDIS value in your local branch in `values-global.yaml`.
+This pattern supports several types of vector databases: EDB Postgres for Kubernetes, Elasticsearch, Redis, Microsoft SQL Server, and the cloud-deployed Azure SQL Server. By default the pattern will deploy EDB Postgres for Kubernetes as a vector database. To use a different vector database, change the `global.db.type` parameter to `ELASTIC`, `MSSQL`, etc. in your local branch in `values-global.yaml`.
 
 ```yaml
----
 global:
   pattern: rag-llm-gitops
   options:
     useCSV: false
     syncPolicy: Automatic
     installPlanApproval: Automatic
-  # Possible value for db.type = [REDIS, EDB]
+  # Possible values for RAG vector DB db.type:
+  # REDIS -> Redis (Local chart deploy)
+  # EDB -> PGVector (Local chart deploy)
+  # ELASTIC -> Elasticsearch (Local chart deploy)
+  # MSSQL -> MS SQL Server (Local chart deploy)
+  # AZURESQL -> Azure SQL (Pre-existing in Azure)
   db:
     index: docs
     type: EDB
-  # Add for model ID
+  # Models used by the inference service (should be a HuggingFace model ID)
   model:
-    modelId: mistral-community/Mistral-7B-Instruct-v0.3
+    vllm: ibm-granite/granite-3.3-8b-instruct
+    embedding: sentence-transformers/all-mpnet-base-v2
+
+  storageClass: gp3-csi
+
 main:
   clusterGroupName: hub
   multiSourceConfig:
     enabled: true
+    clusterGroupChartVersion: 0.9.*
 ```
 
-
+This is also where you are able to update both the LLM model served by the vLLM inference service as well as the embedding model used by the vector database.
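The `db.type` change described above can also be scripted. A minimal sketch, using `sed` on a throwaway copy of the file; in a real checkout you would edit `values-global.yaml` in your branch (ideally with a YAML-aware tool such as `yq`) and commit the result:

```shell
# Stand-in for values-global.yaml in your local branch (trimmed to the db section).
cat > /tmp/values-global.yaml <<'EOF'
global:
  pattern: rag-llm-gitops
  db:
    index: docs
    type: EDB
EOF

# Flip the vector DB type; REDIS, EDB, ELASTIC, MSSQL, and AZURESQL are the documented values.
sed -i 's/^\( *type: \)EDB$/\1ELASTIC/' /tmp/values-global.yaml

grep 'type:' /tmp/values-global.yaml   # -> "    type: ELASTIC"
```

After committing the edited file to your branch and pushing, the GitOps machinery redeploys the pattern with the selected vector store.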

content/patterns/rag-llm-gitops/getting-started.md

Lines changed: 29 additions & 18 deletions
@@ -10,27 +10,30 @@ aliases: /rag-llm-gitops/getting-started/
 - You have the OpenShift Container Platform installation program and the pull secret for your cluster. You can get these from [Install OpenShift on AWS with installer-provisioned infrastructure](https://console.redhat.com/openshift/install/aws/installer-provisioned).
 - Red Hat Openshift cluster running in AWS.
 
+It is also possible to deploy the RAG-LLM Gitops pattern to Azure. Since these docs focus mostly on the AWS deployment, it's recommended that you reference [RAG-LLM pattern on Microsoft Azure](https://validatedpatterns.io/patterns/azure-rag-llm-gitops/)
+for more details about installing this pattern on Azure.
+
 ## Procedure
 
 1. Create the installation configuration file using the steps described in [Creating the installation configuration file](https://docs.openshift.com/container-platform/latest/installing/installing_aws/ipi/installing-aws-customizations.html#installation-initializing_installing-aws-customizations).
 
    > **Note:**
    > Supported regions are `us-east-1` `us-east-2` `us-west-1` `us-west-2` `ca-central-1` `sa-east-1` `eu-west-1` `eu-west-2` `eu-west-3` `eu-central-1` `eu-north-1` `ap-northeast-1` `ap-northeast-2` `ap-northeast-3` `ap-southeast-1` `ap-southeast-2` and `ap-south-1`. For more information about installing on AWS see, [Installation methods](https://docs.openshift.com/container-platform/latest/installing/installing_aws/preparing-to-install-on-aws.html).
-   >
 
 2. Customize the generated `install-config.yaml` creating one control plane node with instance type `m5.2xlarge` and 3 worker nodes with instance type `m5.2xlarge`. A sample YAML file is shown here:
+
 ```yaml
 additionalTrustBundlePolicy: Proxyonly
 apiVersion: v1
 baseDomain: aws.validatedpatterns.io
 compute:
-- architecture: amd64
-  hyperthreading: Enabled
-  name: worker
-  platform:
-    aws:
-      type: m5.2xlarge
-  replicas: 3
+  - architecture: amd64
+    hyperthreading: Enabled
+    name: worker
+    platform:
+      aws:
+        type: m5.2xlarge
+    replicas: 3
 controlPlane:
   architecture: amd64
   hyperthreading: Enabled
@@ -44,18 +47,18 @@ aliases: /rag-llm-gitops/getting-started/
   name: kevstestcluster
 networking:
   clusterNetwork:
-  - cidr: 10.128.0.0/14
-    hostPrefix: 23
+    - cidr: 10.128.0.0/14
+      hostPrefix: 23
   machineNetwork:
-  - cidr: 10.0.0.0/16
+    - cidr: 10.0.0.0/16
   networkType: OVNKubernetes
   serviceNetwork:
-  - 172.30.0.0/16
+    - 172.30.0.0/16
 platform:
   aws:
     region: us-east-1
 publish: External
-pullSecret: '<pull-secret>'
+pullSecret: "<pull-secret>"
 sshKey: |
   ssh-ed25519 <public-key> [email protected]
 ```
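Before handing the customized `install-config.yaml` to the installer, a quick sanity check helps catch sizing mistakes. A minimal `grep`-based sketch; the `/tmp` copy below is an illustrative stand-in for your real file, trimmed to the `compute` section:

```shell
# Stand-in for the customized install-config.yaml from step 2 (compute section only).
cat > /tmp/install-config.yaml <<'EOF'
compute:
  - architecture: amd64
    name: worker
    platform:
      aws:
        type: m5.2xlarge
    replicas: 3
EOF

# Confirm the worker instance type and replica count before running the installer.
grep 'type:' /tmp/install-config.yaml      # -> "        type: m5.2xlarge"
grep 'replicas:' /tmp/install-config.yaml  # -> "    replicas: 3"
```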
@@ -67,31 +70,35 @@ aliases: /rag-llm-gitops/getting-started/
    ```sh
    $ git clone [email protected]:your-username/rag-llm-gitops.git
    ```
+
 5. Go to your repository: Ensure you are in the root directory of your git repository by using the following command:
 
    ```sh
    $ cd rag-llm-gitops
    ```
+
 6. Create a local copy of the secret values file by running the following command:
 
    ```sh
    $ cp values-secret.yaml.template ~/values-secret-rag-llm-gitops.yaml
    ```
+
    > **Note:**
-   >For this demo, editing this file is unnecessary as the default configuration works out of the box upon installation.
+   > For this demo, editing this file is unnecessary as the default configuration works out of the box upon installation.
 
 7. Add the remote upstream repository by running the following command:
 
    ```sh
    $ git remote add -f upstream [email protected]:validatedpatterns/rag-llm-gitops.git
    ```
+
 8. Create a local branch by running the following command:
 
    ```sh
    $ git checkout -b my-test-branch main
    ```
 
-9. By default the pattern deploys the EDB Postgres for Kubernetes as a vector database. To deploy Redis, change the `global.db.type` parameter to the `REDIS` value in your local branch in `values-global.yaml`. For more information see, [Deploying a different databases](/rag-llm-gitops/deploy-different-db/) to change the vector database.
+9. By default the pattern deploys EDB Postgres for Kubernetes as the vector database. To deploy Elasticsearch, change the `global.db.type` parameter to the `ELASTIC` value in your local branch in `values-global.yaml`. For more information, see [Deploying a different database](/rag-llm-gitops/deploy-different-db/).
 
 10. By default instance types for the GPU nodes are `g5.2xlarge`. Follow the [Customize GPU provisioning nodes](/rag-llm-gitops/gpuprovisioning/) to change the GPU instance types.
 
@@ -100,6 +107,7 @@ aliases: /rag-llm-gitops/getting-started/
    ```sh
    $ git push origin my-test-branch
    ```
+
 12. Ensure you have logged in to the cluster at both command line and the console by using the login credentials presented to you when you installed the cluster. For example:
 
    ```sh
@@ -109,18 +117,24 @@ aliases: /rag-llm-gitops/getting-started/
    INFO Access the OpenShift web-console here: https://console-openshift-console.apps.demo1.openshift4-beta-abcorp.com
    INFO Login to the console with user: kubeadmin, password: <provided>
    ```
+
 13. Add GPU nodes to your existing cluster deployment by running the following command:
 
    ```sh
    $ ./pattern.sh make create-gpu-machineset
    ```
+
    > **Note:**
    > You may need to create a file `config` in your home directory and populate it with the region name.
+   >
    > 1. Run the following:
+   >
    > ```sh
    > vi ~/.aws/config
    > ```
+   >
    > 2. Add the following:
+   >
    > ```sh
    > [default]
    > region = us-east-1
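The two `vi` steps from that note can also be done non-interactively. A minimal sketch, assuming a Linux/macOS shell and that overwriting `~/.aws/config` is acceptable on your machine:

```shell
# Write the AWS CLI config file with the region the machineset tooling expects.
# us-east-1 is the example region from the docs; substitute your cluster's region.
mkdir -p "$HOME/.aws"
cat > "$HOME/.aws/config" <<'EOF'
[default]
region = us-east-1
EOF

grep '^region' "$HOME/.aws/config"   # -> "region = us-east-1"
```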
@@ -136,7 +150,6 @@ aliases: /rag-llm-gitops/getting-started/
 
    > **Note:**
    > This deploys everything you need to run the demo application including the NVIDIA GPU Operator and the Node Feature Discovery Operator used to determine your GPU nodes.
-   >
 
 ## Verify the Installation
 
@@ -167,5 +180,3 @@ aliases: /rag-llm-gitops/getting-started/
 - Click the `Generate` button, a project proposal should be generated. The project proposal also contains the reference of the RAG content. The project proposal document can be downloaded in the form of a PDF document.
 
 ![Routes](/images/rag-llm-gitops/proposal.png)
-
-
Binary file changed (image, 164 KB); preview not shown.