Commit bc118ed

Author: Martin Jackson
Merge remote-tracking branch 'upstream/main' into agof_v2_blog
2 parents: f629f15 + b53b914

26 files changed: +376 additions, −376 deletions

content/patterns/coco-pattern/_index.adoc

Lines changed: 1 addition & 1 deletion
```diff
@@ -24,7 +24,7 @@ include::modules/comm-attributes.adoc[]
 = About coco-pattern
 
 Confidential computing is a technology for securing data in use. It uses a https://en.wikipedia.org/wiki/Trusted_execution_environment[Trusted Execution Environment] provided within the hardware of the processor to prevent access from others who have access to the system.
-https://confidentialcontainers.org/[Confidential containers] is a project to standardize the consumption of confidential computing by making the security boundary for confidential computing to be a Kubernetes pod. [Kata containers](https://katacontainers.io/) is used to establish the boundary via a shim VM.
+https://confidentialcontainers.org/[Confidential containers] is a project to standardize the consumption of confidential computing by making the security boundary for confidential computing a Kubernetes pod. https://katacontainers.io/[Kata containers] is used to establish the boundary via a shim VM.
 
 A core goal of confidential computing is to use this technology to isolate the workload from both Kubernetes and hypervisor administrators.
 
```
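The pod-level boundary described above is surfaced to Kubernetes through a RuntimeClass. As a minimal sketch, not part of this commit, and with class names that vary by platform, you can list the Kata-based classes on a cluster where confidential containers is installed:

```sh
# Sketch: list RuntimeClasses; on a cluster with confidential containers,
# Kata-based classes (names vary, e.g. kata or kata-cc) implement the
# pod-level trust boundary described above.
oc get runtimeclass
```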

content/patterns/coco-pattern/coco-pattern-getting-started.adoc

Lines changed: 1 addition & 1 deletion
```diff
@@ -42,7 +42,7 @@ Logging into azure once the pods have been provisioned will show that each of th
 
 === `oc exec` testing
 
-In a OpenShift cluster without confidential containers, Role Based Access Control (RBAC), may be used to prevent users from execing into a container to mutate it.
+In an OpenShift cluster without confidential containers, Role-Based Access Control (RBAC) may be used to prevent users from using `oc exec` to access and mutate a container.
 However:
 
 1. Cluster admins can always circumvent this capability
```
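As an illustration of the RBAC restriction the added line refers to, here is a minimal sketch, not part of this commit; the namespace, role, and user names are hypothetical. A role that omits the `pods/exec` subresource denies `oc exec` to the bound user:

```sh
# Hypothetical sketch: grant read access to pods but omit the pods/exec
# subresource, so the bound user cannot `oc exec` into containers.
oc create role pod-reader --verb=get,list,watch --resource=pods -n demo-ns
oc create rolebinding alice-pod-reader --role=pod-reader --user=alice -n demo-ns

# Verify: expected output is "no".
oc auth can-i create pods/exec -n demo-ns --as=alice
```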

content/patterns/medical-diagnosis/cluster-sizing.adoc

Lines changed: 4 additions & 1 deletion
```diff
@@ -9,12 +9,15 @@ aliases: /medical-diagnosis/cluster-sizing/
 :_content-type: ASSEMBLY
 include::modules/comm-attributes.adoc[]
 
+:aws_node: xlarge
+
+
 //Module to be included
 //:_content-type: CONCEPT
 //:imagesdir: ../../images
 [id="about-openshift-cluster-sizing-med"]
 == About OpenShift cluster sizing for the {med-pattern}
-
+{aws_node}
 To understand cluster sizing requirements for the {med-pattern}, consider the following components that the {med-pattern} deploys on the datacenter or the hub OpenShift cluster:
 
 |===
```
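To relate the sizing guidance to a live cluster, here is a small sketch, not part of this commit, using the standard well-known node label:

```sh
# Sketch: print each node with its cloud instance type via the standard label.
oc get nodes -L node.kubernetes.io/instance-type
```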

content/patterns/medical-diagnosis/getting-started.adoc

Lines changed: 4 additions & 4 deletions
```diff
@@ -340,7 +340,7 @@ You can go to the command-line (make sure you have KUBECONFIG set, or are logged
 +
 [source,terminal]
 ----
-$ oc scale deploymentconfig/image-generator --replicas=1 -n xraylab-1
+$ oc scale deployment/image-generator --replicas=1 -n xraylab-1
 ----
 +
 Or you can go to the OpenShift UI and change the view from Administrator to Developer and select Topology. From there select the `xraylab-1` project.
@@ -357,7 +357,7 @@ image::medical-edge/dev-topology-pod-count.png[link="/images/medical-edge/dev-to
 +
 Alternatively, you can have the same outcome on the Administrator console.
 +
-Go to the OpenShift UI under Workloads, select Deploymentconfigs for Project `xraylab-1`.
+Go to the OpenShift UI under Workloads, select Deployments for Project `xraylab-1`.
 Click `image-generator` and increase the pod count to 1.
 +
 image::medical-edge/start-image-flow.png[link="/images/medical-edge/start-image-flow.png"]
@@ -375,14 +375,14 @@ You can change some of the parameters and watch how the changes effect the dashb
 +
 [source,terminal]
 ----
-$ oc scale deploymentconfig/image-generator --replicas=2
+$ oc scale deployment/image-generator --replicas=2
 ----
 +
 Check the dashboard.
 +
 [source,terminal]
 ----
-$ oc scale deploymentconfig/image-generator --replicas=0
+$ oc scale deployment/image-generator --replicas=0
 ----
 +
 Watch the dashboard stop processing images.
```
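Since the commit switches these docs from DeploymentConfig to Deployment, a quick sketch, assuming the `xraylab-1` namespace used above, to confirm the workload really is a Deployment before scaling:

```sh
# Sketch: confirm image-generator is a Deployment, scale it, and wait for rollout.
oc get deployment image-generator -n xraylab-1
oc scale deployment/image-generator --replicas=1 -n xraylab-1
oc rollout status deployment/image-generator -n xraylab-1
```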

content/patterns/medical-diagnosis/troubleshooting.adoc

Lines changed: 3 additions & 3 deletions
```diff
@@ -59,16 +59,16 @@ Use the grafana dashboard to assist with debugging and identifying the issue
 '''
 Problem:: No information is being processed in the dashboard
 
-Solution:: Most often this is due to the image-generator deploymentConfig needing to be scaled up. The image-generator by design is *scaled to 0*;
+Solution:: Most often this is due to the image-generator deployment needing to be scaled up. The image-generator by design is *scaled to 0*;
 +
 [source,terminal]
 ----
-$ oc scale -n xraylab-1 dc/image-generator --replicas=1
+$ oc scale -n xraylab-1 deploy/image-generator --replicas=1
 ----
 +
 Alternatively, complete the following steps:
 
-. Navigate to the {rh-ocp} web console, and select *Workloads → DeploymentConfigs*
+. Navigate to the {rh-ocp} web console, and select *Workloads → Deployments*
 . Select `image-generator` and scale the pod to 1 or more.
 //AI: Needs review
 
```
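As a follow-up to the scale-up above, a sketch for watching the pod come up before returning to the dashboard:

```sh
# Sketch: watch pods in the namespace until image-generator is Running
# (Ctrl-C to stop); the dashboard should then resume processing images.
oc -n xraylab-1 get pods -w
```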

Lines changed: 9 additions & 146 deletions
```diff
@@ -1,155 +1,18 @@
 ---
-title: GPU provisioning
+title: Customize GPU provisioning nodes
 weight: 20
 aliases: /rag-llm-gitops/gpuprovisioning/
 ---
-# GPU provisioning
+# Customizing GPU provisioning nodes
 
-Use the instructions to add nodes with GPU in OpenShift cluster running in AWS cloud. Nodes with GPU will be tainted to allow only pods that required GPU to be scheduled to these nodes
+By default, GPU nodes use the instance type `g5.2xlarge`. If you need to change the instance type, for example to address performance requirements, carry out these steps:
 
-More details can be found in following documents [Openshift AI](https://ai-on-openshift.io/odh-rhoai/nvidia-gpus/), [NVIDIA on OpenShift](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html)
+1. In your local branch of the `rag-llm-gitops` git repository, change to the `ansible/playbooks/templates` directory.
 
-## Add machineset
+2. Edit the file `gpu-machine-sets.j2`, changing the `instanceType` to, for example, `g5.4xlarge`. Save and exit.
 
-The easiest way is to use existing machineset manifest and update certain elements. Use worker machineset manifest and modify some of the entries (naming conventions provided as reference only, use own if required.), keep other entries as is:
+3. Push the changes to the origin remote repository by running the following command:
 
-```yaml
-apiVersion: machine.openshift.io/v1beta1
-kind: MachineSet
-metadata:
-  name: <clustername>-gpu-<AWSregion>
-  ..............
-spec:
-  replicas: 1
-  selector:
-    matchLabels:
-      ................
-      machine.openshift.io/cluster-api-machineset: <clustername>-gpu-<AWSregion>
-  template:
-    metadata:
-      labels:
-        ........
-        machine.openshift.io/cluster-api-machineset: <clustername>-gpu-<AWSregion>
-    spec:
-      ...................
-      metadata:
-        labels:
-          node-role.kubernetes.io/odh-notebook: '' <--- Put your label if needed
-      providerSpec:
-        value:
-          ........................
-          instanceType: g5.2xlarge <---- Change vm type if needed
-          .............
-      taints:
-        - effect: NoSchedule
-          key: odh-notebook <--- Use own taint name or skip all together
-          value: 'true'
-```
-
-Use `kubectl` or `oc` command line to create new machineset `oc apply -f gpu_machineset.yaml`
-
-Depending on type of EC2 instance creation of the new machines make take some time. Please note that all nodes with GPU will have labels(`node-role.kubernetes.io/odh-notebook`in our case) and taints (`odh-notebook `) that we have specified in machineset applied automatically
-
-## Install Node Feature Operator
-
-From OperatorHub install Node Feature Discovery Operator , accepting defaults . Once Operator has been installed , create `NodeFeatureDiscovery`instance . Use default entries unless you something specific is needed . Node Feature Discovery Operator will add labels to nodes based on available hardware resources
-
-## Install NVIDIA GPU Operator
-
-NVIDIA GPU Operator will provision daemonsets with drivers for the GPU to be used by workload running on these nodes . Detailed instructions are available in NVIDIA Documentation [NVIDIA on OpenShift](https://docs.nvidia.com/datacenter/cloud-native/openshift/latest/index.html) . Following simplified steps for specific setup :
-
-- Install NVIDIA GPU Operator from OperatorHub
-- Once operator is ready create `ClusterPolicy` custom resource. Unless required you can use default settings with adding `tolerations` if machineset in first section has been created with taint. Failing to add `tolerations` will prevent drivers to be installed on GPU enabled node :
-
-```yaml
-apiVersion: nvidia.com/v1
-kind: ClusterPolicy
-metadata:
-  name: gpu-cluster-policy
-spec:
-  vgpuDeviceManager:
-    enabled: true
-  migManager:
-    enabled: true
-  operator:
-    defaultRuntime: crio
-    initContainer: {}
-    runtimeClass: nvidia
-    use_ocp_driver_toolkit: true
-  dcgm:
-    enabled: true
-  gfd:
-    enabled: true
-  dcgmExporter:
-    config:
-      name: ''
-    enabled: true
-    serviceMonitor:
-      enabled: true
-  driver:
-    certConfig:
-      name: ''
-    enabled: true
-    kernelModuleConfig:
-      name: ''
-    licensingConfig:
-      configMapName: ''
-      nlsEnabled: false
-    repoConfig:
-      configMapName: ''
-    upgradePolicy:
-      autoUpgrade: true
-      drain:
-        deleteEmptyDir: false
-        enable: false
-        force: false
-        timeoutSeconds: 300
-      maxParallelUpgrades: 1
-      maxUnavailable: 25%
-      podDeletion:
-        deleteEmptyDir: false
-        force: false
-        timeoutSeconds: 300
-      waitForCompletion:
-        timeoutSeconds: 0
-    virtualTopology:
-      config: ''
-  devicePlugin:
-    config:
-      default: ''
-      name: ''
-    enabled: true
-    mig:
-      strategy: single
-  sandboxDevicePlugin:
-    enabled: true
-  validator:
-    plugin:
-      env:
-        - name: WITH_WORKLOAD
-          value: 'false'
-  nodeStatusExporter:
-    enabled: true
-  daemonsets:
-    rollingUpdate:
-      maxUnavailable: '1'
-    tolerations:
-      - effect: NoSchedule
-        key: odh-notebook
-        value: 'true'
-    updateStrategy: RollingUpdate
-  sandboxWorkloads:
-    defaultWorkload: container
-    enabled: false
-  gds:
-    enabled: false
-  vgpuManager:
-    enabled: false
-  vfioManager:
-    enabled: true
-  toolkit:
-    enabled: true
-    installDir: /usr/local/nvidia
-```
-
-Provisioning NVIDIA daemonsets and compiling drivers may take some time (5-10 minutes)
+```sh
+$ git push origin my-test-branch
+```
```
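The new procedure ends at the git push, with GitOps applying the template change. As a sketch for checking the result after reconciliation, assuming an AWS cluster whose MachineSets carry the instance type in their AWS `providerSpec`:

```sh
# Sketch: print each MachineSet with its AWS instance type to confirm the
# g5.4xlarge change landed (AWS providerSpec layout assumed).
oc -n openshift-machine-api get machinesets \
  -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.template.spec.providerSpec.value.instanceType}{"\n"}{end}'
```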

content/patterns/rag-llm-gitops/_index.md

Lines changed: 52 additions & 10 deletions
```diff
@@ -2,7 +2,7 @@
 title: AI Generation with LLM and RAG
 date: 2024-07-25
 tier: tested
-summary: The goal of this demo is to demonstrate a Chatbot LLM application augmented with data from Red Hat product documentation running on Red Hat OpenShift. It deploys an LLM application that connects to multiple LLM providers such as OpenAI, Hugging Face, and NVIDIA NIM. The application generates a project proposal for a Red Hat product.
+summary: The goal of this demo is to showcase a Chatbot LLM application augmented with data from Red Hat product documentation running on Red Hat OpenShift. It deploys an LLM application that connects to multiple LLM providers such as OpenAI, Hugging Face, and NVIDIA NIM. The application generates a project proposal for a Red Hat product.
 rh_products:
 - Red Hat OpenShift Container Platform
 - Red Hat OpenShift GitOps
@@ -19,7 +19,7 @@ links:
 ci: ai
 ---
 
-# Document Generation Demo with LLM and RAG
+# Document generation demo with LLM and RAG
 
 ## Introduction
 
@@ -34,16 +34,9 @@ The application uses either the [EDB Postgres for Kubernetes operator](https://c
 (default), or Redis, to store embeddings of Red Hat product documentation, running on Red Hat
 OpenShift Container Platform to generate project proposals for specific Red Hat products.
 
-## Pre-requisites
-
-- Podman
-- Red Hat Openshift cluster running in AWS. Supported regions are us-west-2 and us-east-1.
-- GPU Node to run Hugging Face Text Generation Inference server on Red Hat OpenShift cluster.
-- Create a fork of the [rag-llm-gitops](https://github.com/validatedpatterns/rag-llm-gitops.git) git repository.
-
 ## Demo Description & Architecture
 
-The goal of this demo is to demonstrate a Chatbot LLM application augmented with data from Red Hat product documentation running on [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai). It deploys an LLM application that connects to multiple LLM providers such as OpenAI, Hugging Face, and NVIDIA NIM.
+The goal of this demo is to showcase a Chatbot LLM application augmented with data from Red Hat product documentation running on [Red Hat OpenShift AI](https://www.redhat.com/en/technologies/cloud-computing/openshift/openshift-ai). It deploys an LLM application that connects to multiple LLM providers such as OpenAI, Hugging Face, and NVIDIA NIM.
 The application generates a project proposal for a Red Hat product.
 
 ### Key Features
@@ -55,6 +48,55 @@ The application generates a project proposal for a Red Hat product.
 - Monitoring dashboard to provide key metrics such as ratings.
 - GitOps setup to deploy e2e demo (frontend / vector database / served models).
 
+#### RAG Demo Workflow
+
+![Overview of workflow](https://gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/images/schematic-diagrams/rag-demo-vp-sd.png)
+
+_Figure 3. Schematic diagram for workflow of RAG demo with Red Hat OpenShift._
+
+#### RAG Data Ingestion
+
+![ingestion](https://gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/images/schematic-diagrams/rag-demo-vp-ingress-sd.png)
+
+_Figure 4. Schematic diagram for ingestion of data for RAG._
+
+#### RAG Augmented Query
+
+![query](https://gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/images/schematic-diagrams/rag-demo-vp-query-sd.png)
+
+_Figure 5. Schematic diagram for RAG demo augmented query._
+
+Figure 5 shows the RAG augmented query. The Mistral-7B model is used for
+language processing. LangChain integrates the different tools of the LLM-based
+application and processes the PDF files and web pages. A vector database
+provider such as EDB Postgres for Kubernetes (or Redis) stores the vectors.
+Hugging Face TGI serves the Mistral-7B model. Gradio provides the user
+interface, and object storage holds the language model and other datasets.
+Solution components are deployed as microservices in the Red Hat OpenShift
+Container Platform cluster.
+
+#### Download diagrams
+
+View and download all of the diagrams above in our open source tooling site.
+
+[Open Diagrams](https://www.redhat.com/architect/portfolio/tool/index.html?#gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/diagrams/rag-demo-vp.drawio)
+
+![Diagram](/images/rag-llm-gitops/diagram-edb.png)
+
+_Figure 6. Proposed demo architecture with OpenShift AI_
+
+### Components deployed
+
+- **Hugging Face Text Generation Inference Server:** The pattern deploys a Hugging Face TGIS server. The server deploys the `mistral-community/Mistral-7B-v0.2` model and requires a GPU node.
+- **EDB Postgres for Kubernetes / Redis Server:** A vector database server is deployed to store vector embeddings created from Red Hat product documentation.
+- **Populate VectorDb Job:** The job creates the embeddings and populates the vector database.
+- **LLM Application:** A Chatbot application that generates a project proposal by augmenting the LLM with the Red Hat product documentation stored in the vector database.
+- **Prometheus:** Deploys a Prometheus instance to store the various metrics from the LLM application and the TGIS server.
+- **Grafana:** Deploys a Grafana instance to visualize the metrics.
+
 ![Overview](https://gitlab.com/osspa/portfolio-architecture-examples/-/raw/main/images/intro-marketectures/rag-demo-vp-marketing-slide.png)
 
 _Figure 1. Overview of the validated pattern for RAG Demo with Red Hat OpenShift_
```
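Given the components the pattern deploys (TGIS server, vector database, populate job, chatbot, Prometheus, Grafana), a sketch for a quick post-deploy check; the `rag-llm` namespace is hypothetical, so substitute the one your deployment uses:

```sh
# Sketch: list the demo's pods and routes after deployment
# (namespace is an assumption; adjust to your cluster).
oc -n rag-llm get pods
oc -n rag-llm get routes
```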
Lines changed: 41 additions & 0 deletions
```diff
@@ -0,0 +1,41 @@
+---
+title: Customize the demo application
+weight: 11
+aliases: /rag-llm-gitops/getting-started/
+---
+
+# Add an OpenAI provider
+
+You can optionally add providers. The application supports the following providers:
+
+- Hugging Face
+- OpenAI
+- NVIDIA
+
+## Procedure
+
+1. Click the `Application box` icon in the header, and select `Retrieval-Augmented-Generation (RAG) LLM Demonstration UI`.
+
+![Launch Application](/images/rag-llm-gitops/launch-application-main_menu.png)
+
+The application launches.
+
+![Application](/images/rag-llm-gitops/application.png)
+
+2. Click the `Configuration` tab to add a new provider.
+
+3. Click the `Add Provider` button.
+
+![Add provider](/images/rag-llm-gitops/provider-1.png)
+
+4. Complete the details and click the `Add` button.
+
+![Add provider](/images/rag-llm-gitops/provider-2.png)
+
+The provider is now available to select in the `Providers` dropdown under the `Chatbot` tab.
+
+![Routes](/images/rag-llm-gitops/add_provider-3.png)
+
+## Generate the proposal document using OpenAI provider
+
+Follow the instructions in the section "Generate the proposal document" in [Getting Started](/rag-llm-gitops/getting-started/) to generate the proposal document using the OpenAI provider.
```
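Before entering credentials in the `Add Provider` form, it can help to confirm the OpenAI key works at all. A sketch using OpenAI's public models endpoint, not part of this commit:

```sh
# Sketch: verify an OpenAI API key before adding the provider in the UI.
# Expects the key in $OPENAI_API_KEY; a valid key returns a JSON model list.
curl -s https://api.openai.com/v1/models \
  -H "Authorization: Bearer ${OPENAI_API_KEY}" | head
```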
