
Commit dee835c

revisions after peer review
1 parent 5a9c79e commit dee835c

4 files changed (+16, -13 lines)


content/patterns/amd-rag-chat-qna/_index.adoc

Lines changed: 8 additions & 7 deletions
@@ -30,11 +30,12 @@ include::modules/comm-attributes.adoc[]
 
 Background::
 This Validated Pattern implements an enterprise-ready question-and-answer chatbot utilizing retrieval-augmented generation (RAG) & capability reasoning via large language model (LLM). The application is based on the publicly-available
-https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA[OPEA Chat QnA] example application created by the https://opea.dev/[Open Platform for Enterprise AI (OPEA)] community. OPEA defines itself as "a framework that enables the
-creation and evaluation of open, multi-provider, robust, and composable generative AI (GenAI) solutions. It harnesses the best innovations across the ecosystem while keeping enterprise-level needs front and center. It simplifies the
+https://github.com/opea-project/GenAIExamples/tree/main/ChatQnA[OPEA Chat QnA] example application created by the https://opea.dev/[Open Platform for Enterprise AI (OPEA)] community.
++
+OPEA provides a framework that enables the creation and evaluation of open, multi-provider, robust, and composable generative AI (GenAI) solutions. It harnesses the best innovations across the ecosystem while keeping enterprise-level needs front and center. It simplifies the
 implementation of enterprise-grade composite GenAI solutions, starting with a focus on Retrieval Augmented Generative AI (RAG). The platform is designed to facilitate efficient integration of secure,
-performant, and cost-effective GenAI workflows into business systems and manage its deployments, leading to quicker GenAI adoption and business value."
-
+performant, and cost-effective GenAI workflows into business systems and manage its deployments, leading to quicker GenAI adoption and business value.
++
 This pattern aims to leverage the strengths of OPEA's framework in combination with other OpenShift-centric toolings in order to deploy a modern, LLM-backed reasoning application stack capable of leveraging https://www.amd.com/en/products/accelerators/instinct.html[AMD's Instinct]
 hardware acceleration in an enterprise-ready & distributed manner. The pattern utilizes https://www.redhat.com/en/technologies/cloud-computing/openshift/gitops[Red Hat OpenShift GitOps] to bring a continuous delivery approach to the application's development & usage based on declarative
 Git-driven workflows, backed by a centralized, single-source-of-truth git repository.
@@ -56,7 +57,7 @@ Components::
 * AI/ML Services
 ** Text Embeddings Inference (TEI)
 ** Document Retriever
-** Reranking Service
+** Retriever Service
 ** LLM-TGI (Text Generation Inference) from OPEA
 ** vLLM accelerated by AMD Instinct GPUs
 ** Redis Vector Database
@@ -73,12 +74,12 @@ Components::
 {empty} +
 
 .Overview of the solution
-image::/images/amd-rag-chat-qna/amd-rag-chat-qna-overview.png[alt=OPEA Chat QnA accelerated with AMD Instinct Validated Pattern architecture,65%,65%]
+image::/images/qna-chat-amd/amd-rag-chat-qna-overview.png[alt=OPEA Chat QnA accelerated with AMD Instinct Validated Pattern architecture,65%,65%]
 
 {empty} +
 
 .Overview of application flow
-image::/images/amd-rag-chat-qna/amd-rag-chat-qna-flow.png[OPEA Chat QnA accelerated with AMD Instinct application flow]
+image::/images/qna-chat-amd/amd-rag-chat-qna-flow.png[OPEA Chat QnA accelerated with AMD Instinct application flow]
 
 {empty} +
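The components above form the standard OPEA ChatQnA chain: TEI produces embeddings, the retriever queries the Redis vector database, and vLLM/TGI on AMD Instinct GPUs generates the answer. A quick end-to-end check of a deployed chain is to call the backend gateway directly; the namespace, service name, port, and endpoint in the sketch below are assumptions taken from OPEA's upstream defaults and may differ in this pattern's manifests.

[source,bash]
----
# Sketch: exercise the full RAG chain (embed -> retrieve -> generate) directly.
# Service name, namespace, port, and path assume OPEA upstream defaults.
oc -n chatqna port-forward svc/chatqna-backend-server-svc 8888:8888 &

curl -s http://localhost:8888/v1/chatqna \
  -H 'Content-Type: application/json' \
  -d '{"messages": "What hardware accelerates inference in this pattern?"}'
----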

content/patterns/amd-rag-chat-qna/amd-rag-chat-qna-getting-started.adoc

Lines changed: 8 additions & 6 deletions
@@ -87,7 +87,7 @@ To upload the model that is needed for this article, you need to create a workbench
 ** Container size: `Medium`
 ** Accelerator: `AMD`
 ** Cluster storage: Make sure the storage is at least 50GB
-** Connection: Click on Attach existing connections button and attach the connection named model-store created in the previous step (Figure 4). This will pass on the connection values to the workbench when it is started, which will be used to upload the model.
+** Connection: Click on Attach existing connections button and attach the connection named model-store created in the previous step. This will pass on the connection values to the workbench when it is started, which will be used to upload the model.
 * Create the workbench by clicking on the `Create workbench` button. This workbench will be started and will move to `Running` status soon.
 +
 image::/images/qna-chat-amd/rhoai-create-workbench.png[Red Hat OpenShift AI - Create Workbench - Attach existing connections]
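The attached `model-store` connection surfaces its object-storage credentials as environment variables inside the workbench, which is what the model upload relies on. A minimal sketch of that upload from the workbench terminal follows; the `AWS_*` variable names are the usual OpenShift AI data-connection convention, and the model ID is a placeholder for whichever model this pattern serves.

[source,bash]
----
# Sketch: run inside the workbench terminal. Tool versions are not pinned.
pip install -q "huggingface_hub[cli]" awscli

# Placeholder: replace with the model this pattern actually serves.
MODEL_ID="<huggingface-org/model-name>"

# Download the model, then sync it into the model-store bucket using the
# credentials injected by the attached data connection.
huggingface-cli download "$MODEL_ID" --local-dir ./model
aws s3 sync ./model "s3://${AWS_S3_BUCKET}/${MODEL_ID}" --endpoint-url "$AWS_S3_ENDPOINT"
----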
@@ -101,12 +101,13 @@ Open Workbench::
 Open workbench named `chatqna` by following these steps:
 * Once `chatqna` workbench is in `Running` status, open the workbench by clicking on its name, in `Workbenches` tab
 * The workbench will open up in a new tab
-** When the workbench is opened for the first time, you will be shown an _Authorize Access page_*
+** When the workbench is opened for the first time, you will be shown an _Authorize Access page_
 ** Click `Allow selected permissions` button in this page
 
 Clone Code Repository:: [[clone_pattern_code]]
 
 Now that the workbench is created and running, follow these steps to setup the project:
+
 * In the open workbench, click on the `Terminal` icon in the `Launcher` tab.
 * Clone the following repository in the Terminal by running the following command:
 
@@ -174,7 +175,7 @@ Once all the prerequisites are met, install the ChatQnA application
 +
 * Configure secrets for Hugging Face and inference endpoint
 +
-cp values-secret.yaml.template `~/values-secret-qna-chat-amd.yaml`
+cp values-secret.yaml.template ~/values-secret-qna-chat-amd.yaml
 +
 * Modify the `value` field in `~/values-secret-qna-chat-amd.yaml` file
 +
@@ -183,13 +184,13 @@ Once all the prerequisites are met, install the ChatQnA application
 - name: huggingface
 fields:
 - name: token
-value: null <- CHANGE THIS TO YOUR HUGGING_FACE TOKEN
-vaultPolicy: validatePatternDefaultPolicy
+value: null <- CHANGE THIS TO YOUR HUGGING_FACE TOKEN
+vaultPolicy: validatePatternDefaultPolicy
 
 - name: rhoai_model
 fields:
 - name: inference_endpoint
-value: null <- CHANGE THIS TO YOUR MODEL'S INFERENCE ENDPOINT
+value: null <- CHANGE THIS TO YOUR MODEL'S INFERENCE ENDPOINT
 ----
 +
 * Deploy the application
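For the `inference_endpoint` value, one option is to read the URL from the model's InferenceService once it is serving; the pattern itself is normally installed with the repository's `pattern.sh` wrapper. The namespace below is an assumption (use the data science project created earlier), and the install command should be adjusted to whatever this repository documents.

[source,bash]
----
# Sketch: fetch the served model's URL to paste into ~/values-secret-qna-chat-amd.yaml.
# The namespace is an assumption; use the project that hosts the model.
oc get inferenceservice -n chatqna -o jsonpath='{.items[0].status.url}{"\n"}'

# Install the pattern from the cloned repository root
# (pattern.sh wrapping `make install` is the usual Validated Patterns entry point).
./pattern.sh make install
----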
@@ -243,6 +244,7 @@ Query ChatQnA without RAG::
 +
 Since we have not yet provided any external knowledge base regarding the above query to the application, it does not have the correct answer to this query and returns a generic response:
 +
+
 image::/images/qna-chat-amd/chatqna-ui-no-rag.png[ChatQnA UI - response without RAG,40%,40%]
 
 Query ChatQnA with RAG - add external knowledge base::
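Besides the UI upload control, the external knowledge base can usually be populated by posting a document to the OPEA dataprep service, which chunks and embeds it into the Redis vector store so the next query runs with RAG. The service name, port, and path in the sketch below follow OPEA's upstream defaults and are assumptions for this pattern.

[source,bash]
----
# Sketch: ingest a local document into the vector store for RAG.
# Service name, port, and path assume OPEA upstream defaults.
oc -n chatqna port-forward svc/chatqna-dataprep-svc 6007:6007 &

curl -s -X POST http://localhost:6007/v1/dataprep \
  -F 'files=@./company-overview.pdf'
----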
