🎯 Purpose
This design document outlines the key architectural decisions for implementing the AI Middleware Integration / Plugin Framework (#319). We're seeking community input on these decisions before implementation begins to ensure we build the right foundation for extensible gateway capabilities.
📋 Related Issues
- Primary Epic: #319 - AI Middleware Integration / Plugin Framework for extensible gateway capabilities
- Enables: #221 (Gateway-Level Input Validation & Output Sanitization), #229 (Guardrails - Input/Output Sanitization & PII Masking), #271 (Policy-as-Code Engine - Rego Prototype), #257 (Gateway-Level Rate Limiting, DDoS Protection & Abuse Detection)
- Timeline: Target v0.6.0 (August 19, 2025) for core framework
🏗️ Current Architecture Context
The MCP Gateway currently has:
- FastAPI application with middleware pipeline
- SQLAlchemy 2.x async for persistence
- Service-based architecture (tool_service, resource_service, etc.)
- HTMX-based Admin UI for management
- Authentication middleware (JWT + Basic Auth)
- Configuration-driven approach with Pydantic settings
🔥 Key Architectural Decisions for Discussion
1. Plugin Architecture Pattern
2. Plugin Execution Models
3. Configuration & Discovery Strategy
4. Pipeline Integration Approach
5. Security & Isolation Model
ADR-014: Plugin Architecture and AI Middleware Support
- Status: Proposed (Under Discussion)
- Date: 2025-07-08
- Deciders: Community Discussion Required
Context
The MCP Gateway needs a robust plugin framework to support:
- AI Safety Middleware (LlamaGuard, OpenAI Moderation, custom filters)
- Input/Output Processing (PII masking, content validation, sanitization)
- Policy Enforcement (Rego-based rules, business logic, compliance)
- Custom Authentication (Enterprise SSO, role-based access)
- Observability Extensions (custom metrics, audit logging)
Current middleware pipeline is limited to FastAPI middleware and doesn't support:
- Dynamic plugin registration
- External service integration
- Request/response transformation
- Conditional execution based on context
- Plugin-specific configuration management
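To ground the decision points below, here is one minimal sketch of what the core plugin contract might look like. Every name in it (`BasePlugin`, `PluginResult`, `priority`) is an assumption for discussion, not a settled API:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Any, Dict, Optional

@dataclass
class PluginResult:
    """Outcome of one plugin invocation (hypothetical shape)."""
    continue_processing: bool = True        # False short-circuits the pipeline
    modified_payload: Optional[Any] = None  # None means "payload unchanged"
    metadata: Dict[str, Any] = field(default_factory=dict)

class BasePlugin(ABC):
    """Contract every plugin would implement, whichever option wins."""
    name: str = "base"
    priority: int = 100  # lower runs earlier

    @abstractmethod
    async def process(self, payload: Any, context: Dict[str, Any]) -> PluginResult:
        ...
```

Whatever shapes the final ADR settles on, having an explicit result object (rather than mutating the payload in place) keeps short-circuiting and auditing tractable.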
Decision Points Requiring Community Input
🎯 Decision 1: Plugin Architecture Pattern
Options:
A) Self-Contained Plugins Only

```python
class BasePlugin(ABC):
    async def process(self, payload: Any, context: Dict) -> PluginResult:
        # All logic runs in-process
        pass
```

B) Hybrid: Self-Contained + External Service Integration

```python
class BasePlugin(ABC):
    execution_mode: PluginExecutionMode  # SELF_CONTAINED | EXTERNAL_SERVICE

class ExternalServicePlugin(BasePlugin):
    async def call_external_service(self, payload: Any) -> Any:
        # HTTP calls to microservices
        pass
```

C) Microservice-Only Architecture

```python
# All plugins are external services
class PluginConfig:
    service_url: str
    auth_config: Dict[str, Any]
```
```mermaid
flowchart TD
    A[Request] --> PM[Plugin Manager]

    subgraph "Option A: Self-Contained"
        PM --> P1[Plugin 1<br/>In-Process]
        P1 --> P2[Plugin 2<br/>In-Process]
    end

    subgraph "Option B: Hybrid"
        PM --> P3[Self-Contained<br/>Plugin]
        PM --> P4[External Service<br/>via HTTP]
        P4 --> EXT1[LlamaGuard API]
        P4 --> EXT2[OpenAI Moderation]
    end

    subgraph "Option C: Microservice-Only"
        PM --> MS1[Service 1]
        PM --> MS2[Service 2]
        MS1 --> EXT3[External API]
    end
```
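As a concrete sketch of the external-service half of Option B: the plugin delegates its verdict to a remote moderation service. The HTTP transport is injected as a plain async callable here so the example stays dependency-free and testable; a real implementation would presumably use an async HTTP client such as httpx or aiohttp. All names and the fail-closed policy are illustrative assumptions:

```python
import asyncio
from enum import Enum
from typing import Any, Awaitable, Callable, Dict

class PluginExecutionMode(str, Enum):
    SELF_CONTAINED = "self_contained"
    EXTERNAL_SERVICE = "external_service"

class ExternalServicePlugin:
    """Hypothetical Option B plugin that calls out to a remote service."""
    execution_mode = PluginExecutionMode.EXTERNAL_SERVICE

    def __init__(
        self,
        service_url: str,
        http_post: Callable[[str, Dict[str, Any]], Awaitable[Dict[str, Any]]],
        timeout: float = 5.0,
    ):
        self.service_url = service_url
        self._http_post = http_post  # injected transport, trivially fakeable in tests
        self._timeout = timeout

    async def process(self, payload: Any, context: Dict[str, Any]) -> Dict[str, Any]:
        # Fail closed: treat an unreachable safety service as a block,
        # which is the conservative default for moderation-style plugins.
        try:
            return await asyncio.wait_for(
                self._http_post(self.service_url, {"input": payload}),
                timeout=self._timeout,
            )
        except asyncio.TimeoutError:
            return {"allowed": False, "reason": "external service timeout"}
```

Injecting the transport also sidesteps the "which HTTP library" question until the ADR is settled.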
🗳️ Community Question: Which approach best balances flexibility, performance, and operational complexity?
🎯 Decision 2: Plugin Execution Models
Options:
A) Sequential Execution

```python
# Plugins execute one after another
for plugin in sorted_plugins:
    result = await plugin.process(payload, context)
    if not result.continue_processing:
        break
    payload = result.modified_payload or payload
```

B) Parallel Execution with Dependency Resolution

```python
# Independent plugins run concurrently
async with asyncio.TaskGroup() as tg:
    tasks = [tg.create_task(plugin.process(payload, context))
             for plugin in independent_plugins]
```

C) Pipeline with Branching Logic

```python
# Conditional execution based on context
if context.get("content_type") == "sensitive":
    await pii_scanner.process(payload, context)
if context.get("requires_moderation"):
    await moderation_plugin.process(payload, context)
```
```mermaid
flowchart LR
    subgraph "Sequential (A)"
        A1[Plugin A] --> A2[Plugin B] --> A3[Plugin C]
    end

    subgraph "Parallel (B)"
        B1[Plugin A]
        B2[Plugin B]
        B3[Plugin C]
        B1 --> B4[Merge Results]
        B2 --> B4
        B3 --> B4
    end

    subgraph "Conditional (C)"
        C1{Content Type?}
        C1 -->|Sensitive| C2[PII Scanner]
        C1 -->|Public| C3[Basic Validation]
        C2 --> C4[Moderation Check]
        C3 --> C4
    end
```
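To make Option A tangible, here is a runnable toy version of the sequential loop, including priority ordering and short-circuiting. The `PluginResult` shape and the two toy plugins are assumptions for illustration only:

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class PluginResult:
    continue_processing: bool = True
    modified_payload: Optional[Any] = None  # None means "payload unchanged"

class Redact:
    priority = 10  # runs first
    async def process(self, payload, context):
        return PluginResult(modified_payload=payload.replace("secret", "[MASKED]"))

class Block:
    priority = 20  # runs second
    async def process(self, payload, context):
        # Short-circuit: refuse anything still containing "attack"
        if "attack" in payload:
            return PluginResult(continue_processing=False)
        return PluginResult()

async def run_pipeline(plugins, payload, context):
    for plugin in sorted(plugins, key=lambda p: p.priority):
        result = await plugin.process(payload, context)
        if not result.continue_processing:
            return None  # request rejected
        payload = result.modified_payload or payload
    return payload

masked = asyncio.run(run_pipeline([Block(), Redact()], "my secret plan", {}))
# masked == "my [MASKED] plan"
```

Sequential execution keeps debugging simple precisely because each plugin sees the payload as modified by everything before it; that property is what the parallel model gives up.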
🗳️ Community Question: Should we support all three models, or focus on one initially?
🎯 Decision 3: Configuration & Discovery Strategy
Options:
A) Database-Driven Configuration

```python
# Store plugin configs in SQLAlchemy models
class PluginConfiguration(Base):
    __tablename__ = "plugin_configurations"
    name: str
    config: JSON
    enabled: bool
```

B) File-Based Configuration with Hot Reload

```yaml
# plugins.yaml
plugins:
  - name: "llama-guard"
    type: "ai_middleware"
    service_url: "http://llama-guard:8080"
    enabled: true
```

C) Hybrid: Database + File Overrides

```python
# File-based defaults, database overrides
config = load_file_config()
config.update(load_database_config())
```

D) Discovery via Environment/Registry

```python
# Auto-discovery via service registry (Kubernetes, Consul, etc.)
plugins = discover_plugins_from_environment()
```
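One subtlety with Option C worth flagging: a top-level `dict.update()` drops every file-default key for any plugin that has even a single database override row. A sketch of a per-plugin merge that avoids this (function and field names are assumptions):

```python
from typing import Any, Dict

def merge_plugin_configs(
    file_defaults: Dict[str, Dict[str, Any]],
    db_overrides: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Option C sketch: file-based defaults, database values win per key.

    Merging one level deeper than dict.update() preserves defaults
    (e.g. service_url) when the database only overrides `enabled`.
    """
    merged = {name: dict(cfg) for name, cfg in file_defaults.items()}  # copy, don't mutate
    for name, override in db_overrides.items():
        merged.setdefault(name, {}).update(override)
    return merged
```

For example, a database row that merely disables `llama-guard` should not erase its file-configured `service_url`.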
🗳️ Community Question: How should plugin configuration be managed for different deployment scenarios?
🎯 Decision 4: Pipeline Integration Approach
Options:
A) FastAPI Middleware Integration

```python
# Extend existing middleware pipeline
class PluginMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Run input plugins
        # Call next middleware
        # Run output plugins
        ...
```

B) Service Layer Integration

```python
# Integrate at service level
class ToolService:
    async def execute_tool(self, tool_call):
        # Run input plugins
        result = await self._execute_core_logic(tool_call)
        # Run output plugins
        return result
```

C) Dedicated Plugin Pipeline

```python
# Separate pipeline that FastAPI calls
class PluginPipeline:
    async def process_request(self, request) -> ProcessedRequest:
        pass

    async def process_response(self, response) -> ProcessedResponse:
        pass
```
```mermaid
sequenceDiagram
    participant Client
    participant Gateway
    participant Pipeline
    participant Core
    participant Plugin

    Client->>Gateway: Request
    rect rgb(240, 248, 255)
        note over Gateway,Pipeline: Option A: Middleware Integration
        Gateway->>Pipeline: Request
        Pipeline->>Plugin: Input Processing
        Plugin-->>Pipeline: Modified Request
        Pipeline->>Core: Core Logic
        Core-->>Pipeline: Response
        Pipeline->>Plugin: Output Processing
        Plugin-->>Pipeline: Modified Response
        Pipeline-->>Gateway: Final Response
    end
    Gateway-->>Client: Response
```
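A runnable sketch of Option C, the dedicated pipeline: the gateway calls `process_request` before its core logic and `process_response` after. Hooks are modeled as plain async callables to keep the example minimal; a real version would take plugin instances. All names are illustrative:

```python
import asyncio
from typing import Any, Awaitable, Callable, Dict, List

Hook = Callable[[Any, Dict[str, Any]], Awaitable[Any]]

class PluginPipeline:
    """Option C sketch: a pipeline the gateway invokes explicitly
    around its core logic, rather than hiding plugins in middleware."""

    def __init__(self, input_hooks: List[Hook], output_hooks: List[Hook]):
        self._input_hooks = input_hooks
        self._output_hooks = output_hooks

    async def process_request(self, request: Any, context: Dict[str, Any]) -> Any:
        for hook in self._input_hooks:
            request = await hook(request, context)
        return request

    async def process_response(self, response: Any, context: Dict[str, Any]) -> Any:
        for hook in self._output_hooks:
            response = await hook(response, context)
        return response

async def handle(pipeline: PluginPipeline, core: Hook, request: Any) -> Any:
    # A single context dict lets input plugins pass hints to output plugins.
    ctx: Dict[str, Any] = {}
    request = await pipeline.process_request(request, ctx)
    response = await core(request, ctx)
    return await pipeline.process_response(response, ctx)
```

The appeal of this option is testability: the pipeline can be exercised without spinning up FastAPI at all, as the test hooks below do.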
🗳️ Community Question: Where in the request/response cycle should plugins be integrated?
🎯 Decision 5: Security & Isolation Model
Options:
A) Process Isolation (Containers/Sandboxing)

```python
# Each plugin runs in isolated container
class IsolatedPlugin:
    container_image: str
    resource_limits: Dict[str, Any]
```

B) In-Process with Resource Limits

```python
# Plugins run in same process with limits
class PluginExecutor:
    async def execute_with_limits(self, plugin, timeout=30):
        # Memory/CPU/time limits
        pass
```

C) External Service Model (Network Isolation)

```python
# All plugins are external services
# Security handled by network policies
class ExternalPlugin:
    endpoint: str
    auth_method: str
    tls_verify: bool
```
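A caveat on Option B worth surfacing before the vote: in a single asyncio process, only wall-clock time can be enforced cooperatively (via `asyncio.wait_for`); memory and CPU caps require OS-level mechanisms (rlimits, cgroups, containers), which pushes you toward Options A or C for untrusted plugins. A minimal sketch of the time-limit half (names are assumptions):

```python
import asyncio
from typing import Any, Dict

class PluginTimeoutError(Exception):
    """Raised when a plugin exceeds its wall-clock budget."""

class PluginExecutor:
    """Option B sketch. Note: this bounds *time* only; a misbehaving
    plugin can still exhaust memory or block the event loop with
    CPU-bound work, which is exactly the trade-off Decision 5 is about."""

    async def execute_with_limits(self, plugin, payload: Any,
                                  context: Dict[str, Any],
                                  timeout: float = 30.0):
        try:
            return await asyncio.wait_for(plugin.process(payload, context), timeout)
        except asyncio.TimeoutError as exc:
            raise PluginTimeoutError(f"plugin exceeded {timeout}s budget") from exc
```

Cancellation semantics matter here too: `wait_for` cancels the plugin task on timeout, so plugins holding external resources would need cleanup handlers.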
🗳️ Community Question: What level of isolation is appropriate for different plugin types?
📅 Implementation Phases for Discussion
Phase 1: Core Framework (v0.6.0)
- Plugin interface definitions
- Basic plugin manager
- Configuration schema
- Simple pipeline integration
Phase 2: Advanced Features (v0.7.0)
- External service integration
- Admin UI for plugin management
- Health monitoring
- Performance metrics
Phase 3: AI Middleware (v0.8.0)
- LlamaGuard integration
- OpenAI Moderation plugin
- PII detection/masking
- Policy-as-Code engine
📊 Trade-off Analysis

| Decision | Pros | Cons | Community Impact |
|---|---|---|---|
| Hybrid Architecture | ✅ Flexibility ✅ Performance options ✅ Enterprise-ready | ❌ Complexity ❌ More testing needed | 🟢 Supports both simple and enterprise use cases |
| Sequential Execution | ✅ Simple ✅ Predictable ✅ Easy debugging | ❌ Slower ❌ Limited parallelism | 🟡 Good starting point, can evolve |
| Database Configuration | ✅ Dynamic updates ✅ Multi-tenant ready ✅ Audit trail | ❌ Migration complexity ❌ Runtime dependencies | 🟡 Aligns with current architecture |
🤔 Open Questions for Community
- Plugin Marketplace: Should we design for a future plugin marketplace/registry?
- Multi-Tenancy: How should plugins be scoped? Per-server? Per-user? Global?
- Plugin Dependencies: Should plugins be able to depend on other plugins?
- Versioning: How do we handle plugin versioning and compatibility?
- Testing: What testing framework should we provide for plugin developers?
- Documentation: Should we auto-generate plugin documentation from schemas?
🎯 Success Criteria
A successful plugin framework should:
- ✅ Enable #229 (Guardrails - Input/Output Sanitization & PII Masking), #271 (Policy-as-Code Engine - Rego Prototype), and #221 (Gateway-Level Input Validation & Output Sanitization)
- ✅ Support both self-contained and external service plugins
- ✅ Integrate seamlessly with existing FastAPI/SQLAlchemy architecture
- ✅ Provide excellent developer experience for plugin authors
- ✅ Scale from simple regex filters to enterprise AI safety systems
- ✅ Maintain backward compatibility with existing gateway features
💬 How to Participate
Please comment on this issue with:
- 🗳️ Your preferred options for each decision point
- 🤔 Additional considerations we might have missed
- 💡 Use cases that would influence the design
- 🔧 Implementation suggestions or concerns
- 📚 Examples from other systems you've worked with
Timeline: We need community input by July 22, 2025 to start implementation for v0.6.0.
🚀 Next Steps
- Community discussion (July 8-22, 2025)
- Finalize ADR-014 based on feedback
- Create detailed implementation plan
- Begin development for v0.6.0 milestone
This design document will be updated based on community feedback and finalized as ADR-014 once consensus is reached.