This Validated Pattern implements an enterprise-ready Question & Answer chat application based on the Open Platform for Enterprise AI (OPEA) framework, accelerated by AMD Instinct GPUs. It provides a production-grade deployment pattern that combines OPEA's open-source AI capabilities with AMD's hardware acceleration, all orchestrated through OpenShift's enterprise platform.
- 🚀 AMD Instinct GPU acceleration for high-performance AI inference
- 🔒 Enterprise-grade security with HashiCorp Vault integration
- 🤖 OPEA-based AI/ML pipeline with specialized services for document processing
- 📊 Vector database support for efficient similarity search and retrieval
- 🔄 GitOps-based deployment and management through Red Hat Validated Patterns

The solution consists of several key components:

- AI/ML Services
  - Text Embeddings Inference (TEI)
  - Document Retriever
  - Reranking Service
  - LLM-TGI (Text Generation Inference) from OPEA
  - vLLM accelerated by AMD Instinct GPUs
  - Redis Vector Database
- Infrastructure
  - Red Hat OpenShift AI (RHOAI)
  - AMD GPU Operator
  - OpenShift Data Foundation (ODF)
  - Kernel Module Management (KMM)
  - Node Feature Discovery (NFD)
- Security
  - HashiCorp Vault
  - External Secrets Operator
  - Secure secret management across deployments

Prerequisites:

- OpenShift cluster, version 4.16 through 4.18
- One or more AMD Instinct GPUs
- Llama-3.1-8B-Instruct model (or another compatible model)
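
The OpenShift version requirement above (4.16 through 4.18) can be checked before installing. The following is a minimal sketch, not part of this repository: `version_supported` is a hypothetical helper name, and the commented usage line assumes the `oc` CLI is installed and logged in to the target cluster.

```shell
#!/usr/bin/env bash
# Sketch: succeed only if an OpenShift version string is in the 4.16 to 4.18 range.
# version_supported is a hypothetical helper, not something this pattern ships.
version_supported() {
  local version="$1"
  local major minor
  major="${version%%.*}"   # text before the first dot, e.g. "4"
  minor="${version#*.}"    # drop the leading "4."
  minor="${minor%%.*}"     # keep only the minor number, e.g. "17"

  # Reject empty or non-numeric parts before comparing them as integers.
  case "$major" in ''|*[!0-9]*) return 1 ;; esac
  case "$minor" in ''|*[!0-9]*) return 1 ;; esac

  [ "$major" -eq 4 ] && [ "$minor" -ge 16 ] && [ "$minor" -le 18 ]
}

# Usage against a live cluster (assumes a logged-in oc session):
#   version_supported "$(oc get clusterversion version -o jsonpath='{.status.desired.version}')" \
#     || echo "unsupported OpenShift version" >&2
```

Keeping the comparison in a function that takes a plain string makes it easy to try against sample versions without cluster access.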

Getting started:

- Clone the repository:

      git clone https://github.com/your-org/qna-chat-amd.git
      cd qna-chat-amd

- Configure your environment:

      cp values-secret.yaml.template values-secret.yaml
      # Edit values-secret.yaml with your configuration (already added to .gitignore)

- Deploy the pattern:

      ./pattern.sh make install
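
Because the install step depends on an edited `values-secret.yaml`, a small pre-flight check can catch a missing or untouched secrets file before deployment. This is a sketch under stated assumptions: `preflight_secrets` is a hypothetical helper that is not part of this repository, and it treats a byte-for-byte copy of the template as "not yet edited", which is a heuristic and not a validation of the file's contents.

```shell
#!/usr/bin/env bash
# Sketch: pre-flight check to run before the install step.
# preflight_secrets is a hypothetical helper, not something this pattern ships.
preflight_secrets() {
  local secrets_file="${1:-values-secret.yaml}"
  local template_file="${2:-values-secret.yaml.template}"

  # The secrets file must exist and be non-empty.
  if [ ! -s "$secrets_file" ]; then
    echo "missing or empty: $secrets_file (copy it from $template_file first)" >&2
    return 1
  fi

  # An unedited copy of the template most likely still holds placeholder values.
  if [ -f "$template_file" ] && cmp -s "$secrets_file" "$template_file"; then
    echo "$secrets_file is identical to $template_file; edit it before installing" >&2
    return 1
  fi

  return 0
}

# Usage from the repository root:
#   preflight_secrets && ./pattern.sh make install
```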

Run the test suite:

    make test

Run the linter:

    make lint

This project is licensed under the Apache License 2.0. See the LICENSE file for details.
