🎯 Purpose
This design document outlines the key architectural decisions for implementing the AI Middleware Integration / Plugin Framework (#319). We're seeking community input on these decisions before implementation begins to ensure we build the right foundation for extensible gateway capabilities.
📋 Related Issues
- Primary Epic: #319 - AI Middleware Integration / Plugin Framework for extensible gateway capabilities
- Enables: #221 (Gateway-Level Input Validation & Output Sanitization), #229 (Guardrails - Input/Output Sanitization & PII Masking), #271 (Policy-as-Code Engine - Rego Prototype), #257 (Gateway-Level Rate Limiting, DDoS Protection & Abuse Detection)
- Timeline: Target v0.6.0 (August 19, 2025) for core framework
🏗️ Current Architecture Context
The MCP Gateway currently has:
- FastAPI application with middleware pipeline
- SQLAlchemy 2.x async for persistence
- Service-based architecture (tool_service, resource_service, etc.)
- HTMX-based Admin UI for management
- Authentication middleware (JWT + Basic Auth)
- Configuration-driven approach with Pydantic settings
🔥 Key Architectural Decisions for Discussion
1. Plugin Architecture Pattern
2. Plugin Execution Models
3. Configuration & Discovery Strategy
4. Pipeline Integration Approach
5. Security & Isolation Model
ADR-014: Plugin Architecture and AI Middleware Support
- Status: Proposed (Under Discussion)
- Date: 2025-07-08
- Deciders: Community Discussion Required
Context
The MCP Gateway needs a robust plugin framework to support:
- AI Safety Middleware (LlamaGuard, OpenAI Moderation, custom filters)
- Input/Output Processing (PII masking, content validation, sanitization)
- Policy Enforcement (Rego-based rules, business logic, compliance)
- Custom Authentication (Enterprise SSO, role-based access)
- Observability Extensions (custom metrics, audit logging)
Current middleware pipeline is limited to FastAPI middleware and doesn't support:
- Dynamic plugin registration
- External service integration
- Request/response transformation
- Conditional execution based on context
- Plugin-specific configuration management
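To ground the decision points below, here is one minimal sketch of what the core plugin contract might look like. Every name in it (`BasePlugin`, `PluginResult`, `priority`) is an assumption for discussion, not a settled API:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class PluginResult:
    """Outcome of one plugin invocation (hypothetical shape)."""
    continue_processing: bool = True        # False short-circuits the pipeline
    modified_payload: Optional[Any] = None  # None means "payload unchanged"
    metadata: Dict[str, Any] = field(default_factory=dict)

class BasePlugin(ABC):
    """Contract every plugin would implement, whichever option wins."""
    name: str = "base"
    priority: int = 100  # lower runs earlier

    @abstractmethod
    async def process(self, payload: Any, context: Dict[str, Any]) -> PluginResult:
        ...
```

Whatever shapes the final ADR settles on, having an explicit result object (rather than mutating the payload in place) keeps short-circuiting and auditing tractable.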
Decision Points Requiring Community Input
🎯 Decision 1: Plugin Architecture Pattern
Options:
A) Self-Contained Plugins Only

```python
class BasePlugin(ABC):
    async def process(self, payload: Any, context: Dict) -> PluginResult:
        # All logic runs in-process
        pass
```

B) Hybrid: Self-Contained + External Service Integration

```python
class BasePlugin(ABC):
    execution_mode: PluginExecutionMode  # SELF_CONTAINED | EXTERNAL_SERVICE

class ExternalServicePlugin(BasePlugin):
    async def call_external_service(self, payload: Any) -> Any:
        # HTTP calls to microservices
        pass
```

C) Microservice-Only Architecture

```python
# All plugins are external services
class PluginConfig:
    service_url: str
    auth_config: Dict[str, Any]
```
```mermaid
flowchart TD
    A[Request] --> PM[Plugin Manager]

    subgraph "Option A: Self-Contained"
        PM --> P1[Plugin 1<br/>In-Process]
        P1 --> P2[Plugin 2<br/>In-Process]
    end

    subgraph "Option B: Hybrid"
        PM --> P3[Self-Contained<br/>Plugin]
        PM --> P4[External Service<br/>via HTTP]
        P4 --> EXT1[LlamaGuard API]
        P4 --> EXT2[OpenAI Moderation]
    end

    subgraph "Option C: Microservice-Only"
        PM --> MS1[Service 1]
        PM --> MS2[Service 2]
        MS1 --> EXT3[External API]
    end
```
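As a concrete sketch of the external-service half of Option B: the plugin delegates its verdict to a remote moderation service. The HTTP transport is injected as a plain async callable here so the example stays dependency-free and testable; a real implementation would presumably use an async HTTP client such as httpx or aiohttp. All names and the fail-closed policy are illustrative assumptions:

```python
import asyncio
from enum import Enum
from typing import Any, Awaitable, Callable, Dict

class PluginExecutionMode(str, Enum):
    SELF_CONTAINED = "self_contained"
    EXTERNAL_SERVICE = "external_service"

class ExternalServicePlugin:
    """Hypothetical Option B plugin that calls out to a remote service."""
    execution_mode = PluginExecutionMode.EXTERNAL_SERVICE

    def __init__(
        self,
        service_url: str,
        http_post: Callable[[str, Dict[str, Any]], Awaitable[Dict[str, Any]]],
        timeout: float = 5.0,
    ):
        self.service_url = service_url
        self._http_post = http_post  # injected transport, trivially fakeable in tests
        self._timeout = timeout

    async def process(self, payload: Any, context: Dict[str, Any]) -> Dict[str, Any]:
        # Fail closed: treat an unreachable safety service as a block,
        # which is the conservative default for moderation-style plugins.
        try:
            return await asyncio.wait_for(
                self._http_post(self.service_url, {"input": payload}),
                timeout=self._timeout,
            )
        except asyncio.TimeoutError:
            return {"allowed": False, "reason": "external service timeout"}
```

Injecting the transport also sidesteps the "which HTTP library" question until the ADR is settled.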
🗳️ Community Question: Which approach best balances flexibility, performance, and operational complexity?
🎯 Decision 2: Plugin Execution Models
Options:
A) Sequential Execution

```python
# Plugins execute one after another
for plugin in sorted_plugins:
    result = await plugin.process(payload, context)
    if not result.continue_processing:
        break
    payload = result.modified_payload or payload
```

B) Parallel Execution with Dependency Resolution

```python
# Independent plugins run concurrently
async with asyncio.TaskGroup() as tg:
    tasks = [tg.create_task(plugin.process(payload, context))
             for plugin in independent_plugins]
```

C) Pipeline with Branching Logic

```python
# Conditional execution based on context
if context.get("content_type") == "sensitive":
    await pii_scanner.process(payload, context)
if context.get("requires_moderation"):
    await moderation_plugin.process(payload, context)
```
```mermaid
flowchart LR
    subgraph "Sequential (A)"
        A1[Plugin A] --> A2[Plugin B] --> A3[Plugin C]
    end

    subgraph "Parallel (B)"
        B1[Plugin A]
        B2[Plugin B]
        B3[Plugin C]
        B1 --> B4[Merge Results]
        B2 --> B4
        B3 --> B4
    end

    subgraph "Conditional (C)"
        C1{Content Type?}
        C1 -->|Sensitive| C2[PII Scanner]
        C1 -->|Public| C3[Basic Validation]
        C2 --> C4[Moderation Check]
        C3 --> C4
    end
```
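To make Option A tangible, here is a runnable toy version of the sequential loop, including priority ordering and short-circuiting. The `PluginResult` shape and the two toy plugins are assumptions for illustration only:

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class PluginResult:
    continue_processing: bool = True
    modified_payload: Optional[Any] = None  # None means "payload unchanged"

class Redact:
    priority = 10  # runs first
    async def process(self, payload, context):
        return PluginResult(modified_payload=payload.replace("secret", "[MASKED]"))

class Block:
    priority = 20  # runs second
    async def process(self, payload, context):
        # Short-circuit: refuse anything still containing "attack"
        if "attack" in payload:
            return PluginResult(continue_processing=False)
        return PluginResult()

async def run_pipeline(plugins, payload, context):
    for plugin in sorted(plugins, key=lambda p: p.priority):
        result = await plugin.process(payload, context)
        if not result.continue_processing:
            return None  # request rejected
        payload = result.modified_payload or payload
    return payload

masked = asyncio.run(run_pipeline([Block(), Redact()], "my secret plan", {}))
# masked == "my [MASKED] plan"
```

Sequential execution keeps debugging simple precisely because each plugin sees the payload as modified by everything before it; that property is what the parallel model gives up.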
🗳️ Community Question: Should we support all three models, or focus on one initially?
🎯 Decision 3: Configuration & Discovery Strategy
Options:
A) Database-Driven Configuration

```python
# Store plugin configs in SQLAlchemy models
class PluginConfiguration(Base):
    __tablename__ = "plugin_configurations"
    name: str
    config: JSON
    enabled: bool
```

B) File-Based Configuration with Hot Reload

```yaml
# plugins.yaml
plugins:
  - name: "llama-guard"
    type: "ai_middleware"
    service_url: "http://llama-guard:8080"
    enabled: true
```

C) Hybrid: Database + File Overrides

```python
# File-based defaults, database overrides
config = load_file_config()
config.update(load_database_config())
```

D) Discovery via Environment/Registry

```python
# Auto-discovery via service registry (Kubernetes, Consul, etc.)
plugins = discover_plugins_from_environment()
```
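One subtlety with Option C worth flagging: a top-level `dict.update()` drops every file-default key for any plugin that has even a single database override row. A sketch of a per-plugin merge that avoids this (function and field names are assumptions):

```python
from typing import Any, Dict

def merge_plugin_configs(
    file_defaults: Dict[str, Dict[str, Any]],
    db_overrides: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Option C sketch: file-based defaults, database values win per key.

    Merging one level deeper than dict.update() preserves defaults
    (e.g. service_url) when the database only overrides `enabled`.
    """
    merged = {name: dict(cfg) for name, cfg in file_defaults.items()}  # copy, don't mutate
    for name, override in db_overrides.items():
        merged.setdefault(name, {}).update(override)
    return merged
```

For example, a database row that merely disables `llama-guard` should not erase its file-configured `service_url`.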
🗳️ Community Question: How should plugin configuration be managed for different deployment scenarios?
🎯 Decision 4: Pipeline Integration Approach
Options:
A) FastAPI Middleware Integration

```python
# Extend existing middleware pipeline
class PluginMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Run input plugins
        # Call next middleware
        # Run output plugins
        ...
```

B) Service Layer Integration

```python
# Integrate at service level
class ToolService:
    async def execute_tool(self, tool_call):
        # Run input plugins
        result = await self._execute_core_logic(tool_call)
        # Run output plugins
        return result
```

C) Dedicated Plugin Pipeline

```python
# Separate pipeline that FastAPI calls
class PluginPipeline:
    async def process_request(self, request) -> ProcessedRequest:
        pass

    async def process_response(self, response) -> ProcessedResponse:
        pass
```
```mermaid
sequenceDiagram
    participant Client
    participant Gateway
    participant Pipeline
    participant Core
    participant Plugin

    Client->>Gateway: Request
    rect rgb(240, 248, 255)
        note over Gateway,Pipeline: Option A: Middleware Integration
        Gateway->>Pipeline: Request
        Pipeline->>Plugin: Input Processing
        Plugin-->>Pipeline: Modified Request
        Pipeline->>Core: Core Logic
        Core-->>Pipeline: Response
        Pipeline->>Plugin: Output Processing
        Plugin-->>Pipeline: Modified Response
        Pipeline-->>Gateway: Final Response
    end
    Gateway-->>Client: Response
```
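A runnable sketch of Option C, the dedicated pipeline: the gateway calls `process_request` before its core logic and `process_response` after. Hooks are modeled as plain async callables to keep the example minimal; a real version would take plugin instances. All names are illustrative:

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Hook = Callable[[Any, Dict[str, Any]], Awaitable[Any]]

class PluginPipeline:
    """Option C sketch: a pipeline the gateway invokes explicitly
    around its core logic, rather than hiding plugins in middleware."""

    def __init__(self, input_hooks: List[Hook], output_hooks: List[Hook]):
        self._input_hooks = input_hooks
        self._output_hooks = output_hooks

    async def process_request(self, request: Any, context: Dict[str, Any]) -> Any:
        for hook in self._input_hooks:
            request = await hook(request, context)
        return request

    async def process_response(self, response: Any, context: Dict[str, Any]) -> Any:
        for hook in self._output_hooks:
            response = await hook(response, context)
        return response

async def handle(pipeline: PluginPipeline, core: Hook, request: Any) -> Any:
    # A single context dict lets input plugins pass hints to output plugins.
    ctx: Dict[str, Any] = {}
    request = await pipeline.process_request(request, ctx)
    response = await core(request, ctx)
    return await pipeline.process_response(response, ctx)
```

The appeal of this option is testability: the pipeline can be exercised without spinning up FastAPI at all, as the test hooks below do.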
🗳️ Community Question: Where in the request/response cycle should plugins be integrated?
🎯 Decision 5: Security & Isolation Model
Options:
A) Process Isolation (Containers/Sandboxing)

```python
# Each plugin runs in isolated container
class IsolatedPlugin:
    container_image: str
    resource_limits: Dict[str, Any]
```

B) In-Process with Resource Limits

```python
# Plugins run in same process with limits
class PluginExecutor:
    async def execute_with_limits(self, plugin, timeout=30):
        # Memory/CPU/time limits
        pass
```

C) External Service Model (Network Isolation)

```python
# All plugins are external services
# Security handled by network policies
class ExternalPlugin:
    endpoint: str
    auth_method: str
    tls_verify: bool
```
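A caveat on Option B worth surfacing before the vote: in a single asyncio process, only wall-clock time can be enforced cooperatively (via `asyncio.wait_for`); memory and CPU caps require OS-level mechanisms (rlimits, cgroups, containers), which pushes you toward Options A or C for untrusted plugins. A minimal sketch of the time-limit half (names are assumptions):

```python
import asyncio
from typing import Any, Dict

class PluginTimeoutError(Exception):
    """Raised when a plugin exceeds its wall-clock budget."""

class PluginExecutor:
    """Option B sketch. Note: this bounds *time* only; a misbehaving
    plugin can still exhaust memory or block the event loop with
    CPU-bound work, which is exactly the trade-off Decision 5 is about."""

    async def execute_with_limits(self, plugin, payload: Any,
                                  context: Dict[str, Any],
                                  timeout: float = 30.0):
        try:
            return await asyncio.wait_for(plugin.process(payload, context), timeout)
        except asyncio.TimeoutError as exc:
            raise PluginTimeoutError(f"plugin exceeded {timeout}s budget") from exc
```

Cancellation semantics matter here too: `wait_for` cancels the plugin task on timeout, so plugins holding external resources would need cleanup handlers.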
🗳️ Community Question: What level of isolation is appropriate for different plugin types?
📅 Implementation Phases for Discussion
Phase 1: Core Framework (v0.6.0)
- Plugin interface definitions
- Basic plugin manager
- Configuration schema
- Simple pipeline integration
Phase 2: Advanced Features (v0.7.0)
- External service integration
- Admin UI for plugin management
- Health monitoring
- Performance metrics
Phase 3: AI Middleware (v0.8.0)
- LlamaGuard integration
- OpenAI Moderation plugin
- PII detection/masking
- Policy-as-Code engine
📊 Trade-off Analysis

| Decision | Pros | Cons | Community Impact |
|---|---|---|---|
| Hybrid Architecture | ✅ Flexibility ✅ Performance options ✅ Enterprise-ready | ❌ Complexity ❌ More testing needed | 🟢 Supports both simple and enterprise use cases |
| Sequential Execution | ✅ Simple ✅ Predictable ✅ Easy debugging | ❌ Slower ❌ Limited parallelism | 🟡 Good starting point, can evolve |
| Database Configuration | ✅ Dynamic updates ✅ Multi-tenant ready ✅ Audit trail | ❌ Migration complexity ❌ Runtime dependencies | 🟡 Aligns with current architecture |
🤔 Open Questions for Community
- Plugin Marketplace: Should we design for a future plugin marketplace/registry?
- Multi-Tenancy: How should plugins be scoped? Per-server? Per-user? Global?
- Plugin Dependencies: Should plugins be able to depend on other plugins?
- Versioning: How do we handle plugin versioning and compatibility?
- Testing: What testing framework should we provide for plugin developers?
- Documentation: Should we auto-generate plugin documentation from schemas?
🎯 Success Criteria
A successful plugin framework should:
- ✅ Enable #229 (Guardrails - Input/Output Sanitization & PII Masking), #271 (Policy-as-Code Engine - Rego Prototype), and #221 (Gateway-Level Input Validation & Output Sanitization)
- ✅ Support both self-contained and external service plugins
- ✅ Integrate seamlessly with existing FastAPI/SQLAlchemy architecture
- ✅ Provide excellent developer experience for plugin authors
- ✅ Scale from simple regex filters to enterprise AI safety systems
- ✅ Maintain backward compatibility with existing gateway features
💬 How to Participate
Please comment on this issue with:
- 🗳️ Your preferred options for each decision point
- 🤔 Additional considerations we might have missed
- 💡 Use cases that would influence the design
- 🔧 Implementation suggestions or concerns
- 📚 Examples from other systems you've worked with
Timeline: We need community input by July 22, 2025 to start implementation for v0.6.0.
🚀 Next Steps
- Community discussion (July 8-22, 2025)
- Finalize ADR-014 based on feedback
- Create detailed implementation plan
- Begin development for v0.6.0 milestone
This design document will be updated based on community feedback and finalized as ADR-014 once consensus is reached.