This solution implements a comprehensive, scalable ML inference architecture using Amazon EKS, leveraging both Graviton processors for cost-effective CPU-based inference and GPU instances for accelerated inference. The system provides a complete end-to-end platform for deploying large language models with agentic AI capabilities, including RAG (Retrieval Augmented Generation) and intelligent document processing.
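From a client's point of view, consuming a model deployed on this platform is an HTTP call to the solution's model gateway. The sketch below assumes the gateway exposes an OpenAI-compatible API; the endpoint URL, API key, and model name are placeholders that will differ in your deployment.

```python
# Client-side sketch against the model gateway; all names below are placeholders.
from openai import OpenAI

client = OpenAI(
    base_url="http://<model-gateway-endpoint>/v1",  # assumed OpenAI-compatible gateway URL
    api_key="<gateway-api-key>",                    # credential issued by your gateway
)

response = client.chat.completions.create(
    model="<deployed-model-name>",  # a model served on Graviton (CPU) or GPU node groups
    messages=[{"role": "user", "content": "Summarize why EKS is a good fit for LLM inference."}],
)
print(response.choices[0].message.content)
```

Whether the request is ultimately served by a Graviton-backed or GPU-backed model server is an infrastructure decision; the client-facing call stays the same.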
**Cost Awareness:** This solution will incur AWS charges. Review the cost breakdown section below and set up billing alerts before deployment.
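As one way to set up such an alert, the sketch below creates a CloudWatch alarm on the account's estimated charges using boto3. It assumes billing alerts are enabled for the account (billing metrics are published only in us-east-1) and that an SNS topic for notifications already exists; the alarm name, threshold, and topic ARN are placeholders.

```python
import boto3

# Billing metrics live in us-east-1 regardless of where the solution is deployed.
cloudwatch = boto3.client("cloudwatch", region_name="us-east-1")

cloudwatch.put_metric_alarm(
    AlarmName="eks-inference-estimated-charges",  # placeholder alarm name
    Namespace="AWS/Billing",
    MetricName="EstimatedCharges",
    Dimensions=[{"Name": "Currency", "Value": "USD"}],
    Statistic="Maximum",
    Period=21600,          # billing metric updates roughly every 6 hours
    EvaluationPeriods=1,
    Threshold=200.0,       # USD; pick a value that fits your budget
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:us-east-1:123456789012:billing-alerts"],  # placeholder SNS topic
)
```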
>NOTE: For detailed instructions on deployment options for this guidance, running model inference and Agentic AI workflows, and uninstallation, please see the [Detailed Installation Guide](https://aws-solutions-library-samples.github.io/compute/scalabale-model-inference-and-agentic-ai-on-amazon-eks.html).

<!--
The solution consists of two parts, the Agentic AI platform and the Agentic AI application. Let's go through the Agentic AI platform first.

We provide two approaches to set up the Agentic AI platform:
4. Go to the "Tracing" menu and set up tracing
5. Record the Public Key (PK) and Secret Key (SK) - you'll need these for the agentic applications

#### Step 5: Deploy Model Gateway
Set up the unified API gateway:
```python
query = "Analyze the medical documents and create a summary report saved to a file"
result = supervisor_agent(query)
# System will retrieve relevant docs, analyze them, and save results using MCP tools
```
-->
## Important Notes
### 🔍 Architecture Benefits
1. **Modularity**: Each agent has specific responsibilities
2. **Scalability**: Agents can be scaled independently
5. **Observability**: Comprehensive monitoring and tracing via Strands SDK
6. **Standards Compliance**: Uses MCP for tool integration and OpenTelemetry for tracing (a minimal agent sketch follows this list)
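As a concrete, purely illustrative example of this modularity, the sketch below builds a single Strands agent with one tool. The tool, prompt, and query are hypothetical, and the SDK's default model provider is used; in this solution the agents would instead be configured against the self-hosted model gateway, with their traces exported through OpenTelemetry.

```python
# Illustrative sketch using the strands-agents package; names below are hypothetical.
from strands import Agent, tool


@tool
def word_count(text: str) -> int:
    """Count the number of words in a piece of text."""
    return len(text.split())


# Each agent owns a narrow set of tools and a focused prompt, which is what makes
# the agents independently deployable and scalable.
analysis_agent = Agent(
    system_prompt="You answer questions about documents using the tools provided.",
    tools=[word_count],
)

result = analysis_agent("How many words are in: 'scalable inference on Amazon EKS'?")
print(result)
```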
### 🔧 Key Improvements
#### Unified Architecture

- **Single Codebase**: No separate "enhanced" versions - all functionality is built into the standard agents
- **Built-in Tracing**: OpenTelemetry tracing is automatically enabled through Strands SDK (see the tracing sketch after this list)
- **Simplified Deployment**: One main application with all features included
- **Consistent API**: All agents use the same tracing and configuration patterns
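The built-in tracing typically amounts to a one-time setup at application start. The sketch below follows the pattern described in the Strands SDK observability docs, but the helper class and method names should be verified against the installed strands-agents version.

```python
# Sketch: enable tracing once at start-up; verify class/method names against your
# installed strands-agents version.
from strands import Agent
from strands.telemetry import StrandsTelemetry

telemetry = StrandsTelemetry()
telemetry.setup_otlp_exporter()     # export spans to the OTLP endpoint configured in the environment
telemetry.setup_console_exporter()  # also print spans locally while debugging

agent = Agent(system_prompt="You are a helpful assistant.")
agent("Ping")  # this invocation is traced automatically; no manual span management needed
```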
#### Enhanced Developer Experience
- **Automatic Instrumentation**: No manual trace management required
- **Multiple Export Options**: Console, OTLP, Jaeger, Langfuse support out of the box
- **Environment-based Configuration**: Easy setup through environment variables (see the sketch below)
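For example, pointing the built-in OpenTelemetry export at Langfuse (using the Public Key and Secret Key recorded during setup) usually comes down to a few environment variables. The variable names and host below are placeholders - the OTLP path and Basic-auth scheme follow Langfuse's standard OpenTelemetry setup; check the deployment manifests for the exact values this guidance uses.

```python
import base64
import os

# Placeholder values; use the Langfuse host and the PK/SK recorded during setup.
langfuse_host = os.getenv("LANGFUSE_HOST", "http://langfuse.observability.svc.cluster.local:3000")
public_key = os.environ["LANGFUSE_PUBLIC_KEY"]
secret_key = os.environ["LANGFUSE_SECRET_KEY"]

# Langfuse ingests OpenTelemetry traces over OTLP, authenticated with Basic auth
# built from the public/secret key pair.
auth = base64.b64encode(f"{public_key}:{secret_key}".encode()).decode()
os.environ["OTEL_EXPORTER_OTLP_ENDPOINT"] = f"{langfuse_host}/api/public/otel"
os.environ["OTEL_EXPORTER_OTLP_HEADERS"] = f"Authorization=Basic {auth}"
os.environ["OTEL_SERVICE_NAME"] = "agentic-rag-app"  # hypothetical service name
```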