[Design Proposal] LLM Provider management #286
menakaj started this conversation in Design Proposals
LLM Provider Management Implementation Proposal
Problem Statement
Organizations deploying AI agents need centralized control over LLM (Large Language Model) providers to ensure secure, compliant, and cost-effective AI operations. Currently, the system needs:
Organization-Level LLM Provider Registry - A centralized catalog where organizations can register approved LLM providers (OpenAI, Anthropic, Azure OpenAI, etc.) with standardized configurations, making them available for use across all projects.
Template-Based Provider Creation - Standardized templates for common LLM providers (OpenAI, Anthropic, Azure OpenAI) that enforce best practices for authentication, security, and API configuration.
Policy Enforcement at the Provider Level - The ability to attach guardrails (PII redaction, prompt injection detection), rate limits, and security policies to providers centrally, ensuring consistent policy enforcement across all usages.
Multi-Gateway Deployment - Deploy LLM providers to multiple gateways simultaneously (dev, staging, production) with real-time configuration updates via WebSocket.
Model Catalog Management - Track which AI models are available from each provider, enabling model-specific routing and governance.
Version Control and Configuration Management - Track configuration changes over time, supporting rollback and audit requirements.
User Stories
AI Compliance Lead
Provider Registration & Governance:
As an AI Compliance Lead, I want to register approved LLM providers at the organization level with authentication credentials and network policies so that only compliant providers can be used across all projects.
As an AI Compliance Lead, I want to use pre-built templates (OpenAI, Anthropic, Azure OpenAI) to quickly register providers with security best practices built-in.
As an AI Compliance Lead, I want to view all registered providers across my organization with their security configurations so that I can audit compliance.
Policy Management:
As an AI Compliance Lead, I want to attach guardrail policies (PII detection, prompt injection detection, content filtering) to providers so that all agent interactions are automatically protected.
As an AI Compliance Lead, I want to configure rate limits per provider to prevent cost overruns and ensure fair resource allocation across projects.
As an AI Compliance Lead, I want policy updates to automatically redeploy to all gateways so that security changes take effect immediately.
Model Governance:
As an AI Compliance Lead, I want to specify which AI models are available from each provider so that I can restrict access to specific models (e.g., only GPT-4, not GPT-4-turbo).
As an AI Compliance Lead, I want to see which models are deployed where so that I can track model usage across the organization.
Multi-Gateway Deployment:
As an AI Compliance Lead, I want to deploy provider configurations to multiple gateways (dev, staging, production) so that security policies are consistent across environments.
As an AI Compliance Lead, I want deployment status visibility so that I can verify all gateways are running the latest configurations.
Project Developer
Provider Discovery:
As a project developer, I want to browse the organization's catalog of approved LLM providers so that I can choose the right provider for my AI agent.
As a project developer, I want to see which models are available from each provider so that I can select the appropriate model for my use case.
Provider Usage:
As a project developer, I want to reference organization-level providers in my agent configurations so that I inherit all security policies automatically.
As a project developer, I want to deploy my agent knowing that the LLM provider is already configured with authentication and security policies.
Gateway Operator
Real-Time Configuration:
As a gateway operator, I want my gateway to receive LLM provider configurations via WebSocket so that deployments happen in real-time without manual intervention.
As a gateway operator, I want to fetch full provider configurations (including OpenAPI specs) when deployment events arrive so that routes are generated dynamically.
Architecture Overview
The proposed solution uses a centralized provider registry with WebSocket-based deployment orchestration, where the Agent Manager serves as the control plane for LLM provider lifecycle management.
Key Components
Design Principles
1. Template-Driven Configuration
LLM providers are created from templates that encode provider-specific knowledge (authentication methods, API endpoints, supported models). Templates ensure consistent, secure configurations.
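As a sketch of how template-driven creation could work, the following illustrates rendering a template into a concrete configuration via placeholder substitution. The `${...}` placeholder syntax, the `render_template` helper, and the template shape are assumptions for illustration, not the actual implementation:

```python
# Illustrative sketch: render a provider template into a concrete configuration
# by substituting ${placeholder} markers with user-supplied values.
# Template shape and placeholder syntax are assumptions, not the real schema.
import re

def render_template(template: dict, values: dict) -> dict:
    """Recursively replace ${name} markers in string fields with values[name]."""
    def substitute(node):
        if isinstance(node, dict):
            return {k: substitute(v) for k, v in node.items()}
        if isinstance(node, list):
            return [substitute(v) for v in node]
        if isinstance(node, str):
            return re.sub(r"\$\{(\w+)\}", lambda m: str(values[m.group(1)]), node)
        return node
    return substitute(template)

openai_template = {
    "template": "openai",
    "upstream": {
        "url": "https://api.openai.com/v1",
        "auth": {"type": "api-key", "header": "Authorization", "secretRef": "${secret_ref}"},
    },
}

config = render_template(openai_template, {"secret_ref": "openai-api-key-prod"})
print(config["upstream"]["auth"]["secretRef"])  # -> openai-api-key-prod
```

Because the template encodes the endpoint and auth mechanism, the caller only supplies the credential reference; everything security-sensitive stays standardized.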
2. Organization-Scoped Provider Registry
Providers are registered at the organization level, creating a shared catalog that all projects can reference. This enables centralized governance and policy enforcement.
3. JSONB Configuration Storage
Provider configurations (including policies, security settings, upstream details) are stored as JSONB in PostgreSQL, allowing flexible schema evolution without database migrations.
4. Event-Driven Deployment
Deployments trigger WebSocket events to connected gateways. Gateways fetch full configurations on-demand and apply them to Envoy via xDS.
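The event-driven loop above can be sketched end to end with in-memory stand-ins for the WebSocket channel and the internal REST API. The function names, the in-memory stores, and the status values are illustrative assumptions; the real system uses a WebSocket connection and the `/internal/*` endpoints described later:

```python
# Sketch of the event-driven deployment loop. A Queue stands in for the
# WebSocket channel; dicts stand in for the database and internal REST API.
from queue import Queue

provider_store = {"provider-uuid": {"name": "openai-gpt4", "context": "/openai"}}
deployment_status = {}

def emit_deploy_event(channel: Queue, provider_id: str, deployment_id: str, gateway_id: str):
    """Control plane: record the deployment as pending and notify the gateway."""
    deployment_status[deployment_id] = "pending"
    channel.put({"type": "llm.deployed",
                 "payload": {"llmProviderId": provider_id,
                             "deploymentId": deployment_id,
                             "gatewayId": gateway_id}})

def gateway_loop(channel: Queue):
    """Gateway: receive the event, fetch the full config, apply it, report back."""
    event = channel.get()
    payload = event["payload"]
    config = provider_store[payload["llmProviderId"]]  # stands in for GET /internal/llm-providers/{id}
    # ... generate routes from config and push to Envoy via xDS ...
    deployment_status[payload["deploymentId"]] = "deployed"  # stands in for POST .../status

ws = Queue()
emit_deploy_event(ws, "provider-uuid", "deployment-uuid", "gateway-uuid")
gateway_loop(ws)
print(deployment_status["deployment-uuid"])  # -> deployed
```

The key design point the sketch preserves is that the event carries only identifiers; the gateway pulls the full configuration on demand, so large payloads never travel over the WebSocket.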
5. Policy-as-Configuration
Guardrails, rate limits, and security policies are stored in the provider's configuration.policies[] array. The UI provides structured APIs, but everything is ultimately stored as policies.
6. Artifact-Based Model
LLM providers are artifacts (like APIs) with shared lifecycle properties (handle, name, version, organization). This enables consistent CRUD operations and deployment tracking.
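The policy-as-configuration principle (Principle 5 above) reduces the "add policy" operation to an array append. A minimal sketch, where `add_policy` is an illustrative helper rather than the actual service method:

```python
# Sketch of policy-as-configuration: the structured "add policy" API is
# just an append to configuration.policies[]. Names are illustrative.
def add_policy(configuration: dict, policy: dict) -> dict:
    configuration.setdefault("policies", []).append(policy)
    return configuration

config = {"name": "openai-gpt4", "policies": []}
add_policy(config, {"name": "pii-masking-regex", "version": "v1.0.0"})
print([p["name"] for p in config["policies"]])  # -> ['pii-masking-regex']
```

Storing policies inside the provider's JSONB document means adding a new policy type requires no schema migration, only a new entry shape.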
Data Model
Core Tables
1. llm_provider_templates
Templates define provider-specific configuration patterns (OpenAI, Anthropic, Azure OpenAI, etc.).
Key Fields:
handle - Template identifier (e.g., "openai", "anthropic", "azure-openai")
configuration - Template configuration with placeholders for credentials
organization_uuid - Templates can be organization-specific or global
Example Template (OpenAI):
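The template example did not survive in the rendered discussion. A hypothetical OpenAI template, mirroring the provider configuration shown below, might look like the following; the `${secret_ref}` placeholder, the `models` list, and the exact field names are assumptions:

```json
{
  "handle": "openai",
  "configuration": {
    "template": "openai",
    "upstream": {
      "url": "https://api.openai.com/v1",
      "auth": {
        "type": "api-key",
        "header": "Authorization",
        "secretRef": "${secret_ref}"
      }
    },
    "models": ["gpt-4", "gpt-4-turbo"]
  }
}
```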
2. llm_providers
LLM providers are organization-level resources created from templates.
Key Fields:
uuid - Shared with artifacts table (LLM providers ARE artifacts)
template_uuid - Template this provider was created from
configuration - Full provider configuration (JSONB)
openapi_spec - Generated OpenAPI spec for this provider
model_list - JSON array of available models
status - CREATED, PENDING, DEPLOYED, FAILED
Configuration JSONB Structure:
```json
{
  "name": "openai-gpt4",
  "version": "v1",
  "context": "/openai",
  "vhost": "openai.acme.local",
  "template": "openai",
  "upstream": {
    "url": "https://api.openai.com/v1",
    "auth": {
      "type": "api-key",
      "header": "Authorization",
      "secretRef": "openai-api-key-prod"
    },
    "tls": { "minVersion": "1.3" }
  },
  "accessControl": { "mode": "ALLOW_ALL", "exceptions": [] },
  "rateLimiting": {
    "providerLevel": {
      "global": {
        "request": { "enabled": true, "count": 1000, "reset": { "duration": 1, "unit": "MINUTE" } },
        "token": { "enabled": true, "count": 100000, "reset": { "duration": 1, "unit": "HOUR" } }
      }
    }
  },
  "policies": [
    {
      "name": "pii-masking-regex",
      "version": "v1.0.0",
      "paths": [
        {
          "path": "/chat/completions",
          "methods": ["POST"],
          "params": {
            "rules": [
              {
                "name": "detect-email",
                "pattern": "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Z|a-z]{2,}\\b",
                "action": "redact",
                "replacement": "[EMAIL]"
              }
            ]
          }
        }
      ]
    }
  ],
  "security": {
    "enabled": true,
    "apiKey": { "enabled": true, "in": "header", "key": "X-API-Key" }
  }
}
```
3. artifacts
LLM providers are a type of artifact, sharing the artifacts table.
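The pii-masking-regex policy embedded in the configuration JSONB above can be exercised in isolation. The `apply_rules` helper below is illustrative, not gateway code; only the rule itself (pattern, action, replacement) comes from the configuration:

```python
# Applying the pii-masking-regex rule from the provider configuration.
# apply_rules() is an illustrative helper, not the gateway's implementation.
import re

rules = [{"name": "detect-email",
          "pattern": r"\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Z|a-z]{2,}\b",
          "action": "redact",
          "replacement": "[EMAIL]"}]

def apply_rules(text: str, rules: list) -> str:
    """Run each redact rule's regex over the text, substituting its replacement."""
    for rule in rules:
        if rule["action"] == "redact":
            text = re.sub(rule["pattern"], rule["replacement"], text)
    return text

print(apply_rules("Contact alice@example.com for access", rules))
# -> Contact [EMAIL] for access
```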
For LLM Providers:
kind = 'LLM_PROVIDER'
handle = Provider identifier (e.g., "openai-gpt4")
name = Human-readable name
version = Configuration version
4. llm_provider_deployments
Tracks which providers are deployed to which gateways.
Status Lifecycle:
pending - Deployment event sent, awaiting gateway confirmation
deployed - Gateway successfully applied configuration
failed - Gateway reported deployment failure
undeployed - Removed from gateway
API Design
All LLM provider endpoints are scoped to organizations:
/orgs/{orgName}/llm-providers/*
Template Management
1. List Templates
2. Get Template Details
3. Create Custom Template
Provider Management
1. Create LLM Provider
Behavior:
Creates an artifact record with kind = LLM_PROVIDER
Stores model_list as a JSON array
Sets the initial status to CREATED
2. List LLM Providers
3. Get Provider Details
4. Update Provider
Behavior:
When redeployToAll = true, broadcasts update events to all deployed gateways
5. Delete Provider
Behavior:
Deployment Management
1. Deploy to Gateways
Deployment Flow:
Creates a deployment record with status = pending
Validates that the target gateway is active (is_active = true)
Sends the llm.deployed WebSocket event
The deployment remains pending until the gateway confirms
WebSocket Event Payload:
```json
{
  "type": "llm.deployed",
  "payload": {
    "llmProviderId": "provider-uuid",
    "deploymentId": "deployment-uuid",
    "gatewayId": "gateway-uuid"
  },
  "timestamp": "2026-02-13T12:00:00Z",
  "correlationId": "correlation-uuid"
}
```
Gateway Action:
Receives the llm.deployed event via WebSocket
Calls GET /internal/llm-providers/{providerId} to fetch the full config
Reports the result via POST /internal/deployments/{deploymentId}/status
2. Undeploy from Gateway
Behavior:
Sends the llm.undeployed event to the gateway
Marks the deployment as undeployed
3. List Deployments
Policy Management
Policies are stored in the configuration.policies[] array. These APIs provide a structured interface for managing policies.
1. Add Policy to Provider
Backend Logic:
Appends the policy to the configuration.policies[] array
Gateway Internal API
These endpoints are called by gateways using token authentication.
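A minimal sketch of token authentication on the internal endpoints follows. The header format, per-gateway token table, and `authenticate` helper are assumptions for illustration; the proposal only states that gateways authenticate with tokens:

```python
# Sketch of token authentication for the internal gateway API.
# Token storage and header format are illustrative assumptions.
import hmac

GATEWAY_TOKENS = {"gateway-uuid": "s3cret-token"}  # one token issued per gateway

def authenticate(headers: dict):
    """Return the gateway id if the bearer token is valid, else None."""
    auth = headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return None
    token = auth[len("Bearer "):]
    for gateway_id, expected in GATEWAY_TOKENS.items():
        # compare_digest avoids leaking token length/content via timing
        if hmac.compare_digest(token, expected):
            return gateway_id
    return None

print(authenticate({"Authorization": "Bearer s3cret-token"}))  # -> gateway-uuid
print(authenticate({"Authorization": "Bearer wrong"}))         # -> None
```

Resolving the token to a gateway identity also lets the control plane scope responses, so a gateway can only fetch configurations deployed to it.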
1. Get Provider Configuration
Purpose:
When a gateway receives the llm.deployed event, it calls this endpoint to fetch the complete provider configuration and OpenAPI spec for route generation.
Implementation Details
Technology Stack
Control Plane:
Key Services
1. LLM Provider Template Service
Responsibilities:
Methods:
2. LLM Provider Service
Responsibilities:
Methods:
3. LLM Deployment Service
Responsibilities:
Deployment Flow:
Database Indexes
Performance-Critical Indexes:
Configuration Management
Provider Configuration Structure
The provider configuration is stored as JSONB and contains all deployment-related settings:
```json
{
  "name": "openai-gpt4",
  "version": "v1",
  "context": "/openai",
  "vhost": "openai.acme.local",
  "template": "openai",
  "upstream": {
    "url": "https://api.openai.com/v1",
    "auth": {
      "type": "api-key",
      "header": "Authorization",
      "secretRef": "openai-api-key-prod"
    },
    "tls": { "minVersion": "1.3" }
  },
  "accessControl": { "mode": "ALLOW_ALL", "exceptions": [] },
  "rateLimiting": {
    "providerLevel": {
      "global": {
        "request": { "enabled": true, "count": 1000, "reset": { "duration": 1, "unit": "MINUTE" } },
        "token": { "enabled": true, "count": 100000, "reset": { "duration": 1, "unit": "HOUR" } }
      }
    },
    "consumerLevel": {
      "global": {
        "request": { "enabled": true, "count": 100, "reset": { "duration": 1, "unit": "MINUTE" } }
      }
    }
  },
  "policies": [
    {
      "name": "pii-masking-regex",
      "version": "v1.0.0",
      "paths": [
        {
          "path": "/chat/completions",
          "methods": ["POST"],
          "params": {
            "rules": [
              {
                "name": "detect-email",
                "pattern": "\\b[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\\.[A-Z|a-z]{2,}\\b",
                "action": "redact",
                "replacement": "[EMAIL]"
              }
            ]
          }
        }
      ]
    }
  ],
  "security": {
    "enabled": true,
    "apiKey": { "enabled": true, "in": "header", "key": "X-API-Key" }
  }
}
```
Out of Scope
Not Included in This Implementation
Cost Tracking - Real-time cost calculation based on token usage is not included. Gateways collect metrics, but cost analysis is out of scope.
Advanced Deployment Strategies - Blue/green deployments, canary releases, and phased rollouts are not supported.
Configuration Rollback - Automatic rollback on failed deployments is not implemented; manual intervention required.
Provider Approval Workflow - Multi-stage approval (pending → approved → active) is not enforced; providers are immediately available.
Provider Versioning - Historical configuration versions are not tracked; only current configuration is stored.
Gateway-Specific Overrides - Environment-specific configuration overrides (different API keys per environment) are not supported.
Drift Detection - Detecting when gateway runtime configuration diverges from control plane state is not implemented.
Policy Validation Against Gateway Schema - While gateways expose a /policies endpoint, automatic validation of policy parameters against the gateway schema is not implemented.
Model Usage Analytics - Tracking which models are used, and how often, is out of scope.
Provider Health Checks - Periodic health checks to upstream LLM APIs (OpenAI, Anthropic) are not implemented.