[Design Proposal] LLM Provider management #286
menakaj started this conversation in Design Proposals
LLM Provider Management Implementation Proposal
Problem Statement
Organizations deploying AI agents need centralized control over LLM (Large Language Model) providers to ensure secure, compliant, and cost-effective AI operations. Currently, the system needs:
Organization-Level LLM Provider Registry - A centralized catalog where organizations can register approved LLM providers (OpenAI, Anthropic, Azure OpenAI, etc.) with standardized configurations, making them available for use across all projects.
Template-Based Provider Creation - Standardized templates for common LLM providers (OpenAI, Anthropic, Azure OpenAI) that enforce best practices for authentication, security, and API configuration.
Policy Enforcement at the Provider Level - The ability to attach guardrails (PII redaction, prompt injection detection), rate limits, and security policies to providers centrally, ensuring consistent policy enforcement across all usages.
Multi-Gateway Deployment - Deploy LLM providers to multiple gateways simultaneously (dev, staging, production) with real-time configuration updates via WebSocket.
Model Catalog Management - Track which AI models are available from each provider, enabling model-specific routing and governance.
Version Control and Configuration Management - Track configuration changes over time, supporting rollback and audit requirements.
User Stories
AI Compliance Lead
Provider Registration & Governance:
As an AI Compliance Lead, I want to register approved LLM providers at the organization level with authentication credentials and network policies so that only compliant providers can be used across all projects.
As an AI Compliance Lead, I want to use pre-built templates (OpenAI, Anthropic, Azure OpenAI) to quickly register providers with security best practices built-in.
As an AI Compliance Lead, I want to view all registered providers across my organization with their security configurations so that I can audit compliance.
Policy Management:
As an AI Compliance Lead, I want to attach guardrail policies (PII detection, prompt injection detection, content filtering) to providers so that all agent interactions are automatically protected.
As an AI Compliance Lead, I want to configure rate limits per provider to prevent cost overruns and ensure fair resource allocation across projects.
As an AI Compliance Lead, I want policy updates to automatically redeploy to all gateways so that security changes take effect immediately.
Model Governance:
As an AI Compliance Lead, I want to specify which AI models are available from each provider so that I can restrict access to specific models (e.g., only GPT-4, not GPT-4-turbo).
As an AI Compliance Lead, I want to see which models are deployed where so that I can track model usage across the organization.
Multi-Gateway Deployment:
As an AI Compliance Lead, I want to deploy provider configurations to multiple gateways (dev, staging, production) so that security policies are consistent across environments.
As an AI Compliance Lead, I want deployment status visibility so that I can verify all gateways are running the latest configurations.
Project Developer
Provider Discovery:
As a project developer, I want to browse the organization's catalog of approved LLM providers so that I can choose the right provider for my AI agent.
As a project developer, I want to see which models are available from each provider so that I can select the appropriate model for my use case.
Provider Usage:
As a project developer, I want to reference organization-level providers in my agent configurations so that I inherit all security policies automatically.
As a project developer, I want to deploy my agent knowing that the LLM provider is already configured with authentication and security policies.
Gateway Operator
Real-Time Configuration:
As a gateway operator, I want my gateway to receive LLM provider configurations via WebSocket so that deployments happen in real-time without manual intervention.
As a gateway operator, I want to fetch full provider configurations (including OpenAPI specs) when deployment events arrive so that routes are generated dynamically.
Architecture Overview
The proposed solution uses a centralized provider registry with WebSocket-based deployment orchestration, where the Agent Manager serves as the control plane for LLM provider lifecycle management.
Key Components
Design Principles
1. Template-Driven Configuration
LLM providers are created from templates that encode provider-specific knowledge (authentication methods, API endpoints, supported models). Templates ensure consistent, secure configurations.
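As a sketch of how template-driven creation could work, the following illustrates rendering a template into a concrete configuration via placeholder substitution. The `${...}` placeholder syntax, the `render_template` helper, and the template shape are assumptions for illustration, not the actual implementation:

```python
# Illustrative sketch: render a provider template into a concrete configuration
# by substituting ${placeholder} markers with user-supplied values.
# Template shape and placeholder syntax are assumptions, not the real schema.
import re

def render_template(template: dict, values: dict) -> dict:
    """Recursively replace ${name} markers in string fields with values[name]."""
    def substitute(node):
        if isinstance(node, dict):
            return {k: substitute(v) for k, v in node.items()}
        if isinstance(node, list):
            return [substitute(v) for v in node]
        if isinstance(node, str):
            return re.sub(r"\$\{(\w+)\}", lambda m: str(values[m.group(1)]), node)
        return node
    return substitute(template)

openai_template = {
    "template": "openai",
    "upstream": {
        "url": "https://api.openai.com/v1",
        "auth": {"type": "api-key", "header": "Authorization", "secretRef": "${secret_ref}"},
    },
}

config = render_template(openai_template, {"secret_ref": "openai-api-key-prod"})
print(config["upstream"]["auth"]["secretRef"])  # -> openai-api-key-prod
```

Because the template encodes the endpoint and auth mechanism, the caller only supplies the credential reference; everything security-sensitive stays standardized.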
2. Organization-Scoped Provider Registry
Providers are registered at the organization level, creating a shared catalog that all projects can reference. This enables centralized governance and policy enforcement.
3. JSONB Configuration Storage
Provider configurations (including policies, security settings, upstream details) are stored as JSONB in PostgreSQL, allowing flexible schema evolution without database migrations.
4. Event-Driven Deployment
Deployments trigger WebSocket events to connected gateways. Gateways fetch full configurations on-demand and apply them to Envoy via xDS.
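The event-driven loop above can be sketched end to end with in-memory stand-ins for the WebSocket channel and the internal REST API. The function names, the in-memory stores, and the status values are illustrative assumptions; the real system uses a WebSocket connection and the `/internal/*` endpoints described later:

```python
# Sketch of the event-driven deployment loop. A Queue stands in for the
# WebSocket channel; dicts stand in for the database and internal REST API.
from queue import Queue

provider_store = {"provider-uuid": {"name": "openai-gpt4", "context": "/openai"}}
deployment_status = {}

def emit_deploy_event(channel: Queue, provider_id: str, deployment_id: str, gateway_id: str):
    """Control plane: record the deployment as pending and notify the gateway."""
    deployment_status[deployment_id] = "pending"
    channel.put({"type": "llm.deployed",
                 "payload": {"llmProviderId": provider_id,
                             "deploymentId": deployment_id,
                             "gatewayId": gateway_id}})

def gateway_loop(channel: Queue):
    """Gateway: receive the event, fetch the full config, apply it, report back."""
    event = channel.get()
    payload = event["payload"]
    config = provider_store[payload["llmProviderId"]]  # stands in for GET /internal/llm-providers/{id}
    # ... generate routes from config and push to Envoy via xDS ...
    deployment_status[payload["deploymentId"]] = "deployed"  # stands in for POST .../status

ws = Queue()
emit_deploy_event(ws, "provider-uuid", "deployment-uuid", "gateway-uuid")
gateway_loop(ws)
print(deployment_status["deployment-uuid"])  # -> deployed
```

The key design point the sketch preserves is that the event carries only identifiers; the gateway pulls the full configuration on demand, so large payloads never travel over the WebSocket.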
5. Policy-as-Configuration
Guardrails, rate limits, and security policies are stored in the provider's configuration.policies[] array. The UI provides structured APIs, but everything is ultimately stored as policies.
6. Artifact-Based Model
LLM providers are artifacts (like APIs) with shared lifecycle properties (handle, name, version, organization). This enables consistent CRUD operations and deployment tracking.
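The policy-as-configuration principle (Principle 5 above) reduces the "add policy" operation to an array append. A minimal sketch, where `add_policy` is an illustrative helper rather than the actual service method:

```python
# Sketch of policy-as-configuration: the structured "add policy" API is
# just an append to configuration.policies[]. Names are illustrative.
def add_policy(configuration: dict, policy: dict) -> dict:
    configuration.setdefault("policies", []).append(policy)
    return configuration

config = {"name": "openai-gpt4", "policies": []}
add_policy(config, {"name": "pii-masking-regex", "version": "v1.0.0"})
print([p["name"] for p in config["policies"]])  # -> ['pii-masking-regex']
```

Storing policies inside the provider's JSONB document means adding a new policy type requires no schema migration, only a new entry shape.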
Data Model
Core Tables
1. llm_provider_templates
Templates define provider-specific configuration patterns (OpenAI, Anthropic, Azure OpenAI, etc.).
Key Fields:
handle - Template identifier (e.g., "openai", "anthropic", "azure-openai")
configuration - Template configuration with placeholders for credentials
organization_uuid - Templates can be organization-specific or global
Example Template (OpenAI):
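The template example did not survive in the rendered discussion. A hypothetical OpenAI template, mirroring the provider configuration shown below, might look like the following; the `${secret_ref}` placeholder, the `models` list, and the exact field names are assumptions:

```json
{
  "handle": "openai",
  "configuration": {
    "template": "openai",
    "upstream": {
      "url": "https://api.openai.com/v1",
      "auth": {
        "type": "api-key",
        "header": "Authorization",
        "secretRef": "${secret_ref}"
      }
    },
    "models": ["gpt-4", "gpt-4-turbo"]
  }
}
```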
2. llm_providers
LLM providers are organization-level resources created from templates.
Key Fields:
uuid - Shared with artifacts table (LLM providers ARE artifacts)
template_uuid - Template this provider was created from
configuration - Full provider configuration (JSONB)
openapi_spec - Generated OpenAPI spec for this provider
model_list - JSON array of available models
status - CREATED, PENDING, DEPLOYED, FAILED
Configuration JSONB Structure:
```json
{
  "name": "openai-gpt4",
  "version": "v1",
  "context": "/openai",
  "vhost": "openai.acme.local",
  "template": "openai",
  "upstream": {
    "url": "https://api.openai.com/v1",
    "auth": {
      "type": "api-key",
      "header": "Authorization",
      "secretRef": "openai-api-key-prod"
    },
    "tls": { "minVersion": "1.3" }
  },
  "accessControl": { "mode": "ALLOW_ALL", "exceptions": [] },
  "rateLimiting": {
    "providerLevel": {
      "global": {
        "request": { "enabled": true, "count": 1000, "reset": { "duration": 1, "unit": "MINUTE" } },
        "token": { "enabled": true, "count": 100000, "reset": { "duration": 1, "unit": "HOUR" } }
      }
    }
  },
  "policies": [
    {
      "name": "pii-masking-regex",
      "version": "v1.0.0",
      "paths": [
        {
          "path": "/chat/completions",
          "methods": ["POST"],
          "params": {
            "rules": [
              {
                "name": "detect-email",
                "pattern": "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Z|a-z]{2,}\\b",
                "action": "redact",
                "replacement": "[EMAIL]"
              }
            ]
          }
        }
      ]
    }
  ],
  "security": {
    "enabled": true,
    "apiKey": { "enabled": true, "in": "header", "key": "X-API-Key" }
  }
}
```
3. artifacts
LLM providers are a type of artifact, sharing the artifacts table.
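The pii-masking-regex policy embedded in the configuration JSONB above can be exercised in isolation. The `apply_rules` helper below is illustrative, not gateway code; only the rule itself (pattern, action, replacement) comes from the configuration:

```python
# Applying the pii-masking-regex rule from the provider configuration.
# apply_rules() is an illustrative helper, not the gateway's implementation.
import re

rules = [{"name": "detect-email",
          "pattern": r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b",
          "action": "redact",
          "replacement": "[EMAIL]"}]

def apply_rules(text: str, rules: list) -> str:
    """Run each redact rule's regex over the text, substituting its replacement."""
    for rule in rules:
        if rule["action"] == "redact":
            text = re.sub(rule["pattern"], rule["replacement"], text)
    return text

print(apply_rules("Contact alice@example.com for access", rules))
# -> Contact [EMAIL] for access
```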
For LLM Providers:
kind = 'LLM_PROVIDER'
handle = Provider identifier (e.g., "openai-gpt4")
name = Human-readable name
version = Configuration version
4. llm_provider_deployments
Tracks which providers are deployed to which gateways.
Status Lifecycle:
pending - Deployment event sent, awaiting gateway confirmation
deployed - Gateway successfully applied configuration
failed - Gateway reported deployment failure
undeployed - Removed from gateway
API Design
All LLM provider endpoints are scoped to organizations:
/orgs/{orgName}/llm-providers/*
Template Management
1. List Templates
2. Get Template Details
3. Create Custom Template
Provider Management
1. Create LLM Provider
Behavior:
Creates an artifact record with kind = LLM_PROVIDER
Stores model_list as a JSON array
Sets the initial status to CREATED
2. List LLM Providers
3. Get Provider Details
4. Update Provider
Behavior:
When redeployToAll = true, broadcasts update events to all deployed gateways
5. Delete Provider
Behavior:
Deployment Management
1. Deploy to Gateways
Deployment Flow:
Creates a deployment record with status = pending
Validates that the target gateway is active (is_active = true)
Sends the llm.deployed WebSocket event
The deployment remains pending until the gateway confirms
WebSocket Event Payload:
```json
{
  "type": "llm.deployed",
  "payload": {
    "llmProviderId": "provider-uuid",
    "deploymentId": "deployment-uuid",
    "gatewayId": "gateway-uuid"
  },
  "timestamp": "2026-02-13T12:00:00Z",
  "correlationId": "correlation-uuid"
}
```
Gateway Action:
Receives the llm.deployed event via WebSocket
Calls GET /internal/llm-providers/{providerId} to fetch the full config
Reports the result via POST /internal/deployments/{deploymentId}/status
2. Undeploy from Gateway
Behavior:
Sends the llm.undeployed event to the gateway
Marks the deployment as undeployed
3. List Deployments
Policy Management
Policies are stored in the configuration.policies[] array. These APIs provide a structured interface for managing policies.
1. Add Policy to Provider
Backend Logic:
Appends the policy to the configuration.policies[] array
Gateway Internal API
These endpoints are called by gateways using token authentication.
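A minimal sketch of token authentication on the internal endpoints follows. The header format, per-gateway token table, and `authenticate` helper are assumptions for illustration; the proposal only states that gateways authenticate with tokens:

```python
# Sketch of token authentication for the internal gateway API.
# Token storage and header format are illustrative assumptions.
import hmac

GATEWAY_TOKENS = {"gateway-uuid": "s3cret-token"}  # one token issued per gateway

def authenticate(headers: dict):
    """Return the gateway id if the bearer token is valid, else None."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return None
    token = auth[len("Bearer "):]
    for gateway_id, expected in GATEWAY_TOKENS.items():
        # compare_digest avoids leaking token length/content via timing
        if hmac.compare_digest(token, expected):
            return gateway_id
    return None

print(authenticate({"Authorization": "Bearer s3cret-token"}))  # -> gateway-uuid
print(authenticate({"Authorization": "Bearer wrong"}))         # -> None
```

Resolving the token to a gateway identity also lets the control plane scope responses, so a gateway can only fetch configurations deployed to it.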
1. Get Provider Configuration
Purpose:
When a gateway receives the llm.deployed event, it calls this endpoint to fetch the complete provider configuration and OpenAPI spec for route generation.
Implementation Details
Technology Stack
Control Plane:
Key Services
1. LLM Provider Template Service
Responsibilities:
Methods:
2. LLM Provider Service
Responsibilities:
Methods:
3. LLM Deployment Service
Responsibilities:
Deployment Flow:
Database Indexes
Performance-Critical Indexes:
Configuration Management
Provider Configuration Structure
The provider configuration is stored as JSONB and contains all deployment-related settings:
```json
{
  "name": "openai-gpt4",
  "version": "v1",
  "context": "/openai",
  "vhost": "openai.acme.local",
  "template": "openai",
  "upstream": {
    "url": "https://api.openai.com/v1",
    "auth": {
      "type": "api-key",
      "header": "Authorization",
      "secretRef": "openai-api-key-prod"
    },
    "tls": { "minVersion": "1.3" }
  },
  "accessControl": { "mode": "ALLOW_ALL", "exceptions": [] },
  "rateLimiting": {
    "providerLevel": {
      "global": {
        "request": { "enabled": true, "count": 1000, "reset": { "duration": 1, "unit": "MINUTE" } },
        "token": { "enabled": true, "count": 100000, "reset": { "duration": 1, "unit": "HOUR" } }
      }
    },
    "consumerLevel": {
      "global": {
        "request": { "enabled": true, "count": 100, "reset": { "duration": 1, "unit": "MINUTE" } }
      }
    }
  },
  "policies": [
    {
      "name": "pii-masking-regex",
      "version": "v1.0.0",
      "paths": [
        {
          "path": "/chat/completions",
          "methods": ["POST"],
          "params": {
            "rules": [
              {
                "name": "detect-email",
                "pattern": "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Z|a-z]{2,}\\b",
                "action": "redact",
                "replacement": "[EMAIL]"
              }
            ]
          }
        }
      ]
    }
  ],
  "security": {
    "enabled": true,
    "apiKey": { "enabled": true, "in": "header", "key": "X-API-Key" }
  }
}
```
Out of Scope
Not Included in This Implementation
Cost Tracking - Real-time cost calculation based on token usage is not included. Gateways collect metrics, but cost analysis is out of scope.
Advanced Deployment Strategies - Blue/green deployments, canary releases, and phased rollouts are not supported.
Configuration Rollback - Automatic rollback on failed deployments is not implemented; manual intervention required.
Provider Approval Workflow - Multi-stage approval (pending → approved → active) is not enforced; providers are immediately available.
Provider Versioning - Historical configuration versions are not tracked; only current configuration is stored.
Gateway-Specific Overrides - Environment-specific configuration overrides (different API keys per environment) are not supported.
Drift Detection - Detecting when gateway runtime configuration diverges from control plane state is not implemented.
Policy Validation Against Gateway Schema - While gateways expose a /policies endpoint, automatic validation of policy parameters against the gateway schema is not implemented.
Model Usage Analytics - Tracking which models are used, and how often, is out of scope.
Provider Health Checks - Periodic health checks to upstream LLM APIs (OpenAI, Anthropic) are not implemented.