|
3 | 3 | ## Table of Contents |
4 | 4 |
|
5 | 5 | - [Overview](#overview) |
6 | | -- [Important Setup Instructions](#️-important-setup-instructions) |
7 | 6 | - [Architecture](#architecture) |
8 | 7 | - [Architecture Steps](#architecture-steps) |
9 | 8 | - [Plan Your Deployment](#plan-your-deployment) |
10 | 9 | - [Cost](#cost) |
11 | 10 | - [Sample Cost Table](#sample-cost-table) |
12 | 11 | - [Third party dependencies disclaimer](#third-party-dependencies-disclaimer) |
13 | 12 | - [Quick Start Guide](#quick-start-guide) |
| 13 | + - [Important Setup Instructions](#️-important-setup-instructions) |
14 | 14 | - [Option 1: Automated Setup with Makefile (Recommended)](#option-1-automated-setup-with-makefile-recommended) |
15 | 15 | - [Prerequisites](#prerequisites) |
16 | 16 | - [How to Create a Hugging Face Token](#how-to-create-a-hugging-face-token) |
|
40 | 40 | ## Overview |
41 | 41 | This solution implements a comprehensive, scalable ML inference architecture using Amazon EKS, leveraging both Graviton processors for cost-effective CPU-based inference and GPU instances for accelerated inference. The system provides a complete end-to-end platform for deploying large language models with agentic AI capabilities, including RAG (Retrieval Augmented Generation) and intelligent document processing. |
42 | 42 |
|
43 | | -## ⚠️ Important Setup Instructions |
44 | | - |
45 | | -**Before proceeding with this solution, ensure you have:** |
46 | | - |
47 | | -1. **AWS CLI configured** with appropriate permissions for EKS, ECR, CloudFormation, and other AWS services |
48 | | -2. **kubectl installed** and configured to access your target AWS region |
49 | | -3. **Docker installed** and running (required for building and pushing container images) |
50 | | -4. **Sufficient AWS service quotas** - This solution requires multiple EC2 instances, EKS clusters, and other AWS resources |
51 | | -5. **Valid Hugging Face token** - Required for accessing models (see instructions below) |
52 | | -6. **Tavily API key** - Required for web search functionality in agentic applications |
53 | | - |
54 | | -**Recommended Setup Verification:** |
55 | | -```bash |
56 | | -# Verify AWS CLI access |
57 | | -aws sts get-caller-identity |
58 | | - |
59 | | -# Verify kubectl installation |
60 | | -kubectl version --client |
61 | | - |
62 | | -# Verify Docker is running |
63 | | -docker ps |
64 | | - |
65 | | -# Check available AWS regions and quotas |
66 | | -aws ec2 describe-regions |
67 | | -aws service-quotas get-service-quota --service-code ec2 --quota-code L-1216C47A |
68 | | -``` |
69 | | - |
70 | | -**Cost Awareness:** This solution will incur AWS charges. Review the cost breakdown section below and set up billing alerts before deployment. |
71 | 43 |
|
72 | 44 | ## Architecture |
73 | 45 |
|
@@ -182,6 +154,35 @@ Please review and comply with all relevant licenses and terms of service for eac |
182 | 154 |
|
183 | 155 | ## Quick Start Guide |
184 | 156 |
|
| 157 | +### ⚠️ Important Setup Instructions |
| 158 | + |
| 159 | +**Before proceeding with this solution, ensure you have:** |
| 160 | + |
| 161 | +1. **AWS CLI configured** with appropriate permissions for EKS, ECR, CloudFormation, and other AWS services |
| 162 | +2. **kubectl installed** and configured to access your target AWS region |
| 163 | +3. **Docker installed** and running (required for building and pushing container images) |
| 164 | +4. **Sufficient AWS service quotas** - This solution requires multiple EC2 instances, EKS clusters, and other AWS resources |
| 165 | +5. **Valid Hugging Face token** - Required for accessing models (see instructions below) |
| 166 | +6. **Tavily API key** - Required for web search functionality in agentic applications |
| 167 | + |
| 168 | +**Recommended Setup Verification:** |
| 169 | +```bash |
| 170 | +# Verify AWS CLI access |
| 171 | +aws sts get-caller-identity |
| 172 | + |
| 173 | +# Verify kubectl installation |
| 174 | +kubectl version --client |
| 175 | + |
| 176 | +# Verify Docker is running |
| 177 | +docker ps |
| 178 | + |
| 179 | +# Check available AWS regions and quotas |
| 180 | +aws ec2 describe-regions |
| 181 | +aws service-quotas get-service-quota --service-code ec2 --quota-code L-1216C47A |
| 182 | +``` |
| 183 | + |
| 184 | +**Cost Awareness:** This solution will incur AWS charges. Review the cost breakdown in the [Cost](#cost) section above and set up billing alerts before deployment. |
| 185 | + |
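The prerequisite checklist can also be checked mechanically before any deployment step runs. The following is a minimal sketch, not part of this solution: the function names `check_tool`, `check_env`, and `preflight` are hypothetical, and `HF_TOKEN` and `TAVILY_API_KEY` are assumed environment variable names, so substitute whatever names your setup actually uses. It only confirms that the CLIs are on `PATH` and that the two credentials are non-empty; it does not validate permissions or quotas.

```bash
#!/usr/bin/env bash
# Pre-flight sketch (illustrative only). HF_TOKEN and TAVILY_API_KEY are
# assumed variable names, not names defined by this solution.

# Report whether a command is available on PATH; return 1 if it is missing.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "ok: $1"
  else
    echo "missing: $1"
    return 1
  fi
}

# Report whether the named environment variable is set and non-empty.
check_env() {
  eval "value=\${$1:-}"
  if [ -n "$value" ]; then
    echo "ok: $1 is set"
  else
    echo "missing: $1 is not set"
    return 1
  fi
}

# Run every check, count the failures, and return non-zero if any failed.
preflight() {
  failures=0
  for tool in aws kubectl docker; do
    check_tool "$tool" || failures=$((failures + 1))
  done
  for var in HF_TOKEN TAVILY_API_KEY; do
    check_env "$var" || failures=$((failures + 1))
  done
  echo "preflight finished with $failures problem(s)"
  [ "$failures" -eq 0 ]
}
```

To use it, save the block to a file, source it, and run `preflight`. A non-zero exit status means at least one prerequisite still needs attention.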
185 | 186 | The solution has two parts: the Agentic AI platform and the Agentic AI application. This guide covers the Agentic AI platform first. |
186 | 187 |
|
187 | 188 | We provide two approaches to set up the Agentic AI platform: |
|