This Validated Pattern implements an enterprise-ready question-and-answer chatbot utilizing retrieval-augmented generation (RAG) and reasoning capabilities via a large language model (LLM). The application is based on the publicly available
https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA[OPEA Chat QnA] example application created by the https://opea.dev/[Open Platform for Enterprise AI (OPEA)] community.

OPEA provides a framework that enables the creation and evaluation of open, multi-provider, robust, and composable generative AI (GenAI) solutions. It harnesses the best innovations across the ecosystem while keeping enterprise-level needs front and center. It simplifies the implementation of enterprise-grade composite GenAI solutions, starting with a focus on Retrieval Augmented Generative AI (RAG). The platform is designed to facilitate efficient integration of secure, performant, and cost-effective GenAI workflows into business systems and to manage their deployments, leading to quicker GenAI adoption and business value.

This pattern aims to leverage the strengths of OPEA's framework in combination with other OpenShift-centric tooling in order to deploy a modern, LLM-backed reasoning application stack capable of leveraging https://www.amd.com/en/products/accelerators/instinct.html[AMD's Instinct] hardware acceleration in an enterprise-ready and distributed manner. The pattern utilizes https://www.redhat.com/en/technologies/cloud-computing/openshift/gitops[Red Hat OpenShift GitOps] to bring a continuous delivery approach to the application's development and usage, based on declarative Git-driven workflows backed by a centralized, single-source-of-truth Git repository.

Components::

* AI/ML Services
** Text Embeddings Inference (TEI)
** Document Retriever
** Retriever Service
** LLM-TGI (Text Generation Inference) from OPEA
** vLLM accelerated by AMD Instinct GPUs
** Redis Vector Database

{empty} +

.Overview of the solution
image::/images/qna-chat-amd/amd-rag-chat-qna-overview.png[alt=OPEA Chat QnA accelerated with AMD Instinct Validated Pattern architecture,65%,65%]

{empty} +

.Overview of application flow
image::/images/qna-chat-amd/amd-rag-chat-qna-flow.png[OPEA Chat QnA accelerated with AMD Instinct application flow]
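
The application flow above can be sketched in a few lines: the user's question is embedded, the closest document is retrieved from a vector store, and the retrieved context is folded into the LLM prompt. In the actual pattern, TEI produces the embeddings, Redis is the vector database, and vLLM/TGI generates the answer; the toy word-overlap scorer below merely stands in for dense-vector similarity so the flow is runnable anywhere.

```python
# Minimal, self-contained sketch of the RAG flow. The documents,
# scorer, and prompt template are illustrative stand-ins, not part
# of the deployed pattern.

documents = [
    "AMD Instinct accelerators provide GPU compute for LLM inference.",
    "Redis serves as the vector database for similarity search.",
]

def score(question: str, doc: str) -> float:
    # Stand-in for embedding cosine similarity: Jaccard word overlap.
    q, d = set(question.lower().split()), set(doc.lower().split())
    return len(q & d) / len(q | d)

def retrieve(question: str) -> str:
    # Return the best-matching document (the "retriever" step).
    return max(documents, key=lambda doc: score(question, doc))

def build_prompt(question: str) -> str:
    # Fold the retrieved context into the prompt sent to the LLM.
    context = retrieve(question)
    return (
        "Answer the question using only this context:\n"
        f"{context}\n\nQuestion: {question}"
    )

print(build_prompt("Which vector database does the pattern use?"))
```

In the deployed stack each of these functions corresponds to a separate service (embedding, retriever, LLM), wired together by the ChatQnA gateway.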
// Source: content/patterns/amd-rag-chat-qna/amd-rag-chat-qna-getting-started.adoc
To upload the model that is needed for this article, you need to create a workbench:
** Container size: `Medium`
** Accelerator: `AMD`
** Cluster storage: Make sure the storage is at least 50 GB
** Connection: Click the `Attach existing connections` button and attach the connection named `model-store` created in the previous step. This passes the connection values to the workbench when it starts, and they are used to upload the model.
* Create the workbench by clicking the `Create workbench` button. The workbench starts and moves to `Running` status shortly.
+
image::/images/qna-chat-amd/rhoai-create-workbench.png[Red Hat OpenShift AI - Create Workbench - Attach existing connections]

Open Workbench::

Open the workbench named `chatqna` by following these steps:

* Once the `chatqna` workbench is in `Running` status, open it by clicking its name in the `Workbenches` tab
* The workbench opens in a new tab
** When the workbench is opened for the first time, you are shown an _Authorize Access_ page
** Click the `Allow selected permissions` button on this page

Clone Code Repository:: [[clone_pattern_code]]

Now that the workbench is created and running, follow these steps to set up the project:

* In the open workbench, click on the `Terminal` icon in the `Launcher` tab.
* Clone the following repository in the Terminal by running the following command:

Once all the prerequisites are met, install the ChatQnA application:

* Configure secrets for Hugging Face and the inference endpoint
* Modify the `value` fields in the `~/values-secret-qna-chat-amd.yaml` file:
+
[source,yaml]
----
  - name: huggingface
    fields:
    - name: token
      value: null  # <- CHANGE THIS TO YOUR HUGGING_FACE TOKEN
      vaultPolicy: validatePatternDefaultPolicy

  - name: rhoai_model
    fields:
    - name: inference_endpoint
      value: null  # <- CHANGE THIS TO YOUR MODEL'S INFERENCE ENDPOINT
----
+
* Deploy the application
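
Before deploying, it can help to verify that no placeholder values remain in the secrets file. The small helper below is an optional sketch, not part of the pattern; it simply flags lines whose `value:` field is still `null`, following the layout of the snippet above.

```python
# Optional pre-deploy check: report lines in a values-secret file
# where a 'value:' field has not yet been filled in.

def find_unset_values(yaml_text: str) -> list[int]:
    """Return 1-based line numbers where a 'value:' is still null/empty."""
    unset = []
    for lineno, line in enumerate(yaml_text.splitlines(), start=1):
        # Ignore trailing comments, then look for an unset value field.
        stripped = line.split("#", 1)[0].strip()
        if stripped in ("value: null", "value:"):
            unset.append(lineno)
    return unset

sample = """\
  - name: huggingface
    fields:
    - name: token
      value: null  # <- CHANGE THIS TO YOUR HUGGING_FACE TOKEN
      vaultPolicy: validatePatternDefaultPolicy
"""
print(find_unset_values(sample))  # prints [4]
```

Run it against `~/values-secret-qna-chat-amd.yaml`; an empty list means both secrets have been set.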

Query ChatQnA without RAG::
+
Since we have not yet provided the application with any external knowledge base relevant to the above query, it does not know the correct answer and returns a generic response:
+
image::/images/qna-chat-amd/chatqna-ui-no-rag.png[ChatQnA UI - response without RAG,40%,40%]

Query ChatQnA with RAG - add external knowledge base::