
[DESIGN]: Architecture Decisions and Discussions for AI Middleware and Plugin Framework (Enables #319) #313

@crivetimihai


🎯 Purpose

This design document outlines the key architectural decisions for implementing the AI Middleware Integration / Plugin Framework (#319). We're seeking community input on these decisions before implementation begins to ensure we build the right foundation for extensible gateway capabilities.

📋 Related Issues

πŸ—οΈ Current Architecture Context

The MCP Gateway currently has:

  • FastAPI application with middleware pipeline
  • SQLAlchemy 2.x async for persistence
  • Service-based architecture (tool_service, resource_service, etc.)
  • HTMX-based Admin UI for management
  • Authentication middleware (JWT + Basic Auth)
  • Configuration-driven approach with Pydantic settings

🔥 Key Architectural Decisions for Discussion

1. Plugin Architecture Pattern

2. Plugin Execution Models

3. Configuration & Discovery Strategy

4. Pipeline Integration Approach

5. Security & Isolation Model


ADR-014: Plugin Architecture and AI Middleware Support

  • Status: Proposed (Under Discussion)
  • Date: 2025-07-08
  • Deciders: Community Discussion Required

Context

The MCP Gateway needs a robust plugin framework to support:

  • AI Safety Middleware (LlamaGuard, OpenAI Moderation, custom filters)
  • Input/Output Processing (PII masking, content validation, sanitization)
  • Policy Enforcement (Rego-based rules, business logic, compliance)
  • Custom Authentication (Enterprise SSO, role-based access)
  • Observability Extensions (custom metrics, audit logging)

The current middleware pipeline is limited to FastAPI middleware and does not support:

  • Dynamic plugin registration
  • External service integration
  • Request/response transformation
  • Conditional execution based on context
  • Plugin-specific configuration management

Decision Points Requiring Community Input

🎯 Decision 1: Plugin Architecture Pattern

Options:

A) Self-Contained Plugins Only

```python
class BasePlugin(ABC):
    async def process(self, payload: Any, context: Dict) -> PluginResult:
        # All logic runs in-process
        pass
```

B) Hybrid: Self-Contained + External Service Integration

```python
class BasePlugin(ABC):
    execution_mode: PluginExecutionMode  # SELF_CONTAINED | EXTERNAL_SERVICE

class ExternalServicePlugin(BasePlugin):
    async def call_external_service(self, payload: Any) -> Any:
        # HTTP calls to microservices
        pass
```

C) Microservice-Only Architecture

```python
# All plugins are external services
class PluginConfig:
    service_url: str
    auth_config: Dict[str, Any]
```

```mermaid
flowchart TD
    A[Request] --> PM[Plugin Manager]

    subgraph "Option A: Self-Contained"
        PM --> P1[Plugin 1<br/>In-Process]
        P1 --> P2[Plugin 2<br/>In-Process]
    end

    subgraph "Option B: Hybrid"
        PM --> P3[Self-Contained<br/>Plugin]
        PM --> P4[External Service<br/>via HTTP]
        P4 --> EXT1[LlamaGuard API]
        P4 --> EXT2[OpenAI Moderation]
    end

    subgraph "Option C: Microservice-Only"
        PM --> MS1[Service 1]
        PM --> MS2[Service 2]
        MS1 --> EXT3[External API]
    end
```

🗳️ Community Question: Which approach best balances flexibility, performance, and operational complexity?

🎯 Decision 2: Plugin Execution Models

Options:

A) Sequential Execution

```python
# Plugins execute one after another
for plugin in sorted_plugins:
    result = await plugin.process(payload, context)
    if not result.continue_processing:
        break
    payload = result.modified_payload or payload
```
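To make the sequential loop above concrete, here is a minimal, runnable sketch with a `PluginResult` shape and two toy plugins. All names are assumptions carried over from the snippets in this document, not a finalized API:

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Optional

@dataclass
class PluginResult:
    continue_processing: bool = True
    modified_payload: Optional[Any] = None

class BlockEmptyPlugin:
    async def process(self, payload, context):
        # Halt the pipeline when the payload is empty
        return PluginResult(continue_processing=bool(payload))

class UpperCasePlugin:
    async def process(self, payload, context):
        # Transform the payload and pass it along
        return PluginResult(modified_payload=payload.upper())

async def run_pipeline(plugins, payload, context):
    for plugin in plugins:
        result = await plugin.process(payload, context)
        if not result.continue_processing:
            break
        payload = result.modified_payload or payload
    return payload

print(asyncio.run(run_pipeline([BlockEmptyPlugin(), UpperCasePlugin()], "hello", {})))  # → HELLO
```

A short-circuiting plugin simply returns `continue_processing=False`, leaving the payload as-is.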

B) Parallel Execution with Dependency Resolution

```python
# Independent plugins run concurrently
# (asyncio.TaskGroup requires Python 3.11+)
async with asyncio.TaskGroup() as tg:
    tasks = [tg.create_task(plugin.process(payload, context))
             for plugin in independent_plugins]
```
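Since `asyncio.TaskGroup` is only available on Python 3.11+, the same fan-out can be sketched with `asyncio.gather` plus an explicit merge step for the independent results. The plugin functions here are purely illustrative:

```python
import asyncio

async def pii_check(payload, context):
    # Illustrative independent check: flag anything that looks like an email
    return {"pii_found": "@" in payload}

async def length_check(payload, context):
    # Illustrative independent check: flag oversized payloads
    return {"too_long": len(payload) > 100}

async def run_parallel(plugins, payload, context):
    # Independent plugins run concurrently; results are merged afterwards
    results = await asyncio.gather(*(p(payload, context) for p in plugins))
    merged = {}
    for r in results:
        merged.update(r)
    return merged

merged = asyncio.run(run_parallel([pii_check, length_check], "user@example.com", {}))
# merged == {"pii_found": True, "too_long": False}
```

The merge policy (last-writer-wins here) is itself a design decision when two plugins report overlapping keys.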

C) Pipeline with Branching Logic

```python
# Conditional execution based on context
if context.get("content_type") == "sensitive":
    await pii_scanner.process(payload, context)
if context.get("requires_moderation"):
    await moderation_plugin.process(payload, context)
```

```mermaid
flowchart LR
    subgraph "Sequential (A)"
        A1[Plugin A] --> A2[Plugin B] --> A3[Plugin C]
    end

    subgraph "Parallel (B)"
        B1[Plugin A]
        B2[Plugin B]
        B3[Plugin C]
        B1 --> B4[Merge Results]
        B2 --> B4
        B3 --> B4
    end

    subgraph "Conditional (C)"
        C1{Content Type?}
        C1 -->|Sensitive| C2[PII Scanner]
        C1 -->|Public| C3[Basic Validation]
        C2 --> C4[Moderation Check]
        C3 --> C4
    end
```

🗳️ Community Question: Should we support all three models, or focus on one initially?

🎯 Decision 3: Configuration & Discovery Strategy

Options:

A) Database-Driven Configuration

```python
# Store plugin configs in SQLAlchemy 2.x mapped models
class PluginConfiguration(Base):
    __tablename__ = "plugin_configurations"

    name: Mapped[str]
    config: Mapped[dict] = mapped_column(JSON)
    enabled: Mapped[bool]
```

B) File-Based Configuration with Hot Reload

```yaml
# plugins.yaml
plugins:
  - name: "llama-guard"
    type: "ai_middleware"
    service_url: "http://llama-guard:8080"
    enabled: true
```

C) Hybrid: Database + File Overrides

```python
# File-based defaults, database overrides
config = load_file_config()
config.update(load_database_config())
```
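One caveat with `dict.update` is that it is shallow: a database override for one nested key would wipe out the sibling keys from the file defaults. A recursive merge is likely what the hybrid option needs; this is a hypothetical sketch, not a committed API:

```python
def deep_merge(defaults: dict, overrides: dict) -> dict:
    # Recursively apply overrides on top of file-based defaults,
    # replacing only the keys the override actually sets
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged

file_config = {"llama-guard": {"enabled": True, "service_url": "http://llama-guard:8080"}}
db_config = {"llama-guard": {"enabled": False}}  # e.g. an admin disabled it at runtime

config = deep_merge(file_config, db_config)
# config["llama-guard"] keeps service_url from the file but enabled=False from the database
```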

D) Discovery via Environment/Registry

```python
# Auto-discovery via service registry (Kubernetes, Consul, etc.)
plugins = discover_plugins_from_environment()
```
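For the environment-variable flavor of discovery, one possible convention (purely an assumption, not a decided scheme) is that each plugin announces itself via a `PLUGIN_<NAME>_URL` variable:

```python
import os

def discover_plugins_from_environment(environ=None):
    # Hypothetical convention: PLUGIN_<NAME>_URL=http://host:port
    environ = os.environ if environ is None else environ
    plugins = {}
    for key, value in environ.items():
        if key.startswith("PLUGIN_") and key.endswith("_URL"):
            name = key[len("PLUGIN_"):-len("_URL")].lower().replace("_", "-")
            plugins[name] = {"service_url": value}
    return plugins

env = {"PLUGIN_LLAMA_GUARD_URL": "http://llama-guard:8080", "PATH": "/usr/bin"}
# discover_plugins_from_environment(env) == {"llama-guard": {"service_url": "http://llama-guard:8080"}}
```

Registry-backed discovery (Kubernetes labels, Consul services) would follow the same shape but query the registry API instead of the environment.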

🗳️ Community Question: How should plugin configuration be managed for different deployment scenarios?

🎯 Decision 4: Pipeline Integration Approach

Options:

A) FastAPI Middleware Integration

```python
# Extend the existing middleware pipeline
class PluginMiddleware(BaseHTTPMiddleware):
    async def dispatch(self, request, call_next):
        # Run input plugins
        # Call next middleware
        # Run output plugins
        ...
```

B) Service Layer Integration

```python
# Integrate at the service level
class ToolService:
    async def execute_tool(self, tool_call):
        # Run input plugins
        result = await self._execute_core_logic(tool_call)
        # Run output plugins
        return result
```

C) Dedicated Plugin Pipeline

```python
# Separate pipeline that FastAPI calls
class PluginPipeline:
    async def process_request(self, request) -> ProcessedRequest:
        pass
    async def process_response(self, response) -> ProcessedResponse:
        pass
```

```mermaid
sequenceDiagram
    participant Client
    participant Gateway
    participant Pipeline
    participant Core
    participant Plugin

    Client->>Gateway: Request

    rect rgb(240, 248, 255)
        note over Gateway,Pipeline: Option A: Middleware Integration
        Gateway->>Pipeline: Request
        Pipeline->>Plugin: Input Processing
        Plugin-->>Pipeline: Modified Request
        Pipeline->>Core: Core Logic
        Core-->>Pipeline: Response
        Pipeline->>Plugin: Output Processing
        Plugin-->>Pipeline: Modified Response
        Pipeline-->>Gateway: Final Response
    end

    Gateway-->>Client: Response
```

🗳️ Community Question: Where in the request/response cycle should plugins be integrated?

🎯 Decision 5: Security & Isolation Model

Options:

A) Process Isolation (Containers/Sandboxing)

```python
# Each plugin runs in an isolated container
class IsolatedPlugin:
    container_image: str
    resource_limits: Dict[str, Any]
```

B) In-Process with Resource Limits

```python
# Plugins run in the same process with limits
class PluginExecutor:
    async def execute_with_limits(self, plugin, timeout=30):
        # Memory/CPU/time limits
        pass
```
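The time-limit part of Option B can be sketched with `asyncio.wait_for`; memory and CPU caps need OS-level mechanisms (e.g. cgroups or rlimits) and are not shown. Whether a timed-out plugin fails open (request continues) or fails closed (request rejected) is itself a policy decision, modeled here as a flag:

```python
import asyncio

async def execute_with_limits(plugin_coro, timeout=30, fail_open=True):
    # Enforce a wall-clock limit on a single plugin invocation
    try:
        return await asyncio.wait_for(plugin_coro, timeout=timeout)
    except asyncio.TimeoutError:
        # fail_open: drop the plugin's verdict and let the request continue;
        # fail_closed: propagate the timeout so the gateway can reject
        if fail_open:
            return None
        raise

async def slow_plugin():
    await asyncio.sleep(0.2)  # stand-in for a slow check
    return "done"

result = asyncio.run(execute_with_limits(slow_plugin(), timeout=0.01))
# result is None: the slow plugin timed out and the policy here is fail-open
```

For safety-critical plugins (e.g. content moderation) fail-closed is likely the safer default, at the cost of availability.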

C) External Service Model (Network Isolation)

```python
# All plugins are external services;
# security is handled by network policies
class ExternalPlugin:
    endpoint: str
    auth_method: str
    tls_verify: bool
```
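The network call itself would go through an async HTTP client (e.g. `httpx`, not shown here); the part worth sketching is how `auth_method` maps to request headers. The method names (`"bearer"`, `"api_key"`) are illustrative assumptions, not a defined enum:

```python
from dataclasses import dataclass

@dataclass
class ExternalPlugin:
    endpoint: str
    auth_method: str  # "bearer" | "api_key" | "none" (illustrative values)
    tls_verify: bool = True

def build_headers(plugin: ExternalPlugin, secret: str) -> dict:
    # Map the configured auth method to HTTP headers
    if plugin.auth_method == "bearer":
        return {"Authorization": f"Bearer {secret}"}
    if plugin.auth_method == "api_key":
        return {"X-API-Key": secret}
    return {}

guard = ExternalPlugin(endpoint="http://llama-guard:8080/check", auth_method="bearer")
# build_headers(guard, "s3cret") == {"Authorization": "Bearer s3cret"}
```

Secrets would come from the plugin configuration store rather than being hard-coded, and `tls_verify=False` should probably be rejected outside development environments.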

🗳️ Community Question: What level of isolation is appropriate for different plugin types?


🔄 Implementation Phases for Discussion

Phase 1: Core Framework (v0.6.0)

  • Plugin interface definitions
  • Basic plugin manager
  • Configuration schema
  • Simple pipeline integration

Phase 2: Advanced Features (v0.7.0)

  • External service integration
  • Admin UI for plugin management
  • Health monitoring
  • Performance metrics

Phase 3: AI Middleware (v0.8.0)

  • LlamaGuard integration
  • OpenAI Moderation plugin
  • PII detection/masking
  • Policy-as-Code engine

📊 Trade-off Analysis

| Decision | Pros | Cons | Community Impact |
|---|---|---|---|
| Hybrid Architecture | ✅ Flexibility<br>✅ Performance options<br>✅ Enterprise-ready | ❌ Complexity<br>❌ More testing needed | 🏢 Supports both simple and enterprise use cases |
| Sequential Execution | ✅ Simple<br>✅ Predictable<br>✅ Easy debugging | ❌ Slower<br>❌ Limited parallelism | 🚀 Good starting point, can evolve |
| Database Configuration | ✅ Dynamic updates<br>✅ Multi-tenant ready<br>✅ Audit trail | ❌ Migration complexity<br>❌ Runtime dependencies | 🔄 Aligns with current architecture |

🤔 Open Questions for Community

  1. Plugin Marketplace: Should we design for a future plugin marketplace/registry?

  2. Multi-Tenancy: How should plugins be scoped? Per-server? Per-user? Global?

  3. Plugin Dependencies: Should plugins be able to depend on other plugins?

  4. Versioning: How do we handle plugin versioning and compatibility?

  5. Testing: What testing framework should we provide for plugin developers?

  6. Documentation: Should we auto-generate plugin documentation from schemas?

🎯 Success Criteria

A successful plugin framework should:

💬 How to Participate

Please comment on this issue with:

  • 🗳️ Your preferred options for each decision point
  • 🤔 Additional considerations we might have missed
  • 📋 Use cases that would influence the design
  • 🔧 Implementation suggestions or concerns
  • 📚 Examples from other systems you've worked with

Timeline: We need community input by July 22, 2025 to start implementation for v0.6.0.

🔄 Next Steps

  1. Community discussion (July 8-22, 2025)
  2. Finalize ADR-014 based on feedback
  3. Create detailed implementation plan
  4. Begin development for v0.6.0 milestone

This design document will be updated based on community feedback and finalized as ADR-014 once consensus is reached.
