This document describes the architecture of the Red Hat Lightspeed Agent for Google Cloud.
The Red Hat Lightspeed Agent for Google Cloud is an A2A-ready (Agent-to-Agent) service that provides AI-powered access to Red Hat Insights. It is built using Google's Agent Development Kit (ADK) and integrates with Red Hat's MCP (Model Context Protocol) server for Insights data access.
The system consists of two separate services:
- Marketplace Handler - Always running service that handles provisioning and client registration
- Lightspeed Agent - The AI agent that handles user interactions (deployed after provisioning)
┌─────────────────────────────────────────────────────────────────────────────────┐
│ Google Cloud Marketplace │
│ (Gemini Enterprise / Procurement Events) │
└─────────────────────────────────────────────────────────────────────────────────┘
│ │
│ Pub/Sub Events │ DCR Request
│ (Account/Entitlement) │ (software_statement)
▼ ▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│ Marketplace Handler Service │
│ (Cloud Run - Always Running) │
│ ┌───────────────────────────────────────────────────────────────────────────┐ │
│ │ FastAPI Application │ │
│ │ ┌──────────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Hybrid /dcr Endpoint │ │ │
│ │ │ - Pub/Sub Events → Approve accounts and entitlements │ │ │
│ │ │ - DCR Requests → Create OAuth clients via GMA SSO API │ │ │
│ │ └──────────────────────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────────┘
│ │
│ Store │ Create Client
▼ ▼
┌─────────────────┐ ┌─────────────────────────┐
│ PostgreSQL │ │ Red Hat SSO │
│ Database │◀──────────────────────────────▶│ │
│ - Accounts │ │ - GMA SSO API │
│ - Entitlements │ │ - OIDC/OAuth │
│ - DCR Clients │ └─────────────────────────┘
└─────────────────┘
▲
│ Read/Write
▼
┌─────────────────────────────────────────────────────────────────────────────────┐
│ Lightspeed Agent Service │
│ (Cloud Run - Deployed After Provisioning) │
│ ┌───────────────────────────────────────────────────────────────────────────┐ │
│ │ FastAPI Application │ │
│ │ ┌─────────────┐ ┌─────────────┐ ┌─────────────────┐ │ │
│ │ │ A2A API │ │ Agent Card │ │ Health/Ready │ │ │
│ │ │ / │ │ /.well- │ │ /health :8002 │ │ │
│ │ │ (JSON-RPC) │ │ known/ │ │ /ready :8002 │ │ │
│ │ └──────┬──────┘ │ agent.json │ └─────────────────┘ │ │
│ │ │ └─────────────┘ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Authentication Layer │ │ │
│ │ │ (JWT Validation via Red Hat SSO) │ │ │
│ │ └─────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────────┐ │ │
│ │ │ Agent Core │ │ │
│ │ │ (Google ADK + Gemini) │ │ │
│ │ └─────────────────────────────────────────────────────────────────┘ │ │
│ │ │ │ │
│ │ ▼ │ │
│ │ ┌─────────────────────────────────────────────────────────────────┐ │ │
│ │ │ MCP Sidecar │ │ │
│ │ │ (Red Hat Lightspeed MCP Server) │ │ │
│ │ └─────────────────────────────────────────────────────────────────┘ │ │
│ └───────────────────────────────────────────────────────────────────────────┘ │
└─────────────────────────────────────────────────────────────────────────────────┘
│ │
▼ ▼
┌─────────────┐ ┌─────────────────────────┐
│ Gemini │ │ Red Hat Insights APIs │
│ API │ │ (via MCP Server) │
│ (Vertex) │ │ - Advisor │
└─────────────┘ │ - Vulnerability │
│ - Patch │
│ - Content │
└─────────────────────────┘
The system is split into two services for important operational reasons:
| Service | Purpose | Lifecycle |
|---|---|---|
| Marketplace Handler | Handles provisioning and DCR | Always running (minScale=1) |
| Lightspeed Agent | AI agent for user queries | Deployed after provisioning |
- Marketplace Handler must be always running to receive Pub/Sub events from Google Cloud Marketplace for account and entitlement approvals
- Agent can be deployed on-demand after a customer has been provisioned
- Separation of concerns: Provisioning logic is isolated from agent logic
- Independent scaling: Handler scales for provisioning traffic, Agent scales for user traffic
A separate FastAPI application for provisioning, providing:
- Hybrid /dcr Endpoint: Single endpoint handling both:
- Pub/Sub events (account and entitlement approvals, filtered by product)
- DCR requests (OAuth client creation)
- Health Endpoints: Kubernetes-compatible health checks on separate probe port (8003, configurable via
HANDLER_PROBE_PORT) - Database Access: PostgreSQL for persistent storage
The main AI agent FastAPI application, providing:
- A2A Endpoints: Agent-to-Agent protocol implementation (JSON-RPC)
- Agent Card:
/.well-known/agent.jsonwith capabilities and DCR extension - Health Endpoints: Kubernetes-compatible health and readiness checks on separate probe port (8002, configurable via
AGENT_PROBE_PORT)
Handles all authentication and authorization:
- Token Introspection: Validates tokens via Red Hat SSO introspection endpoint (RFC 7662)
- Scope Checking: Checks for required
api.consoleandapi.ocmscopes; rejects tokens carrying scopes outside the configured allowlist - Bypass for Discovery:
/.well-known/agent.jsonis public per A2A spec
The AI agent built with Google ADK:
- Gemini Model: Uses Gemini 2.5 Flash for natural language understanding
- Tool Orchestration: Manages tool calls to MCP server
- Session Management: Maintains conversation context
Runs as a sidecar container connecting to Red Hat Insights:
- Tool Discovery: Discovers available Insights tools
- Tool Execution: Executes tools and returns results
- Authentication: Handles service account authentication to Red Hat APIs
This flow happens when a customer purchases from Google Cloud Marketplace:
1. Customer purchases from Google Cloud Marketplace
2. Marketplace sends Pub/Sub event to Marketplace Handler
3. Handler receives POST /dcr with Pub/Sub message wrapper
4. Handler filters by product (SERVICE_CONTROL_SERVICE_NAME) — account events pass through
5. Handler extracts event type (ACCOUNT_CREATION_REQUESTED, ENTITLEMENT_CREATION_REQUESTED, etc.)
6. Handler calls Google Procurement API to approve account, then entitlement
7. Handler stores entitlement in PostgreSQL
8. Customer is now provisioned for the service
┌─────────────┐ ┌───────────────┐ ┌────────────────┐ ┌────────────┐
│ Customer │────▶│ Marketplace │────▶│ Pub/Sub │────▶│ Handler │
│ Purchases │ │ (Purchase) │ │ (Event Push) │ │ /dcr │
└─────────────┘ └───────────────┘ └────────────────┘ └─────┬──────┘
│
┌─────────────────┐ │
│ PostgreSQL │◀────────────┤
│ (Store) │ │
└─────────────────┘ │
▼
┌─────────────────────────────────────┐
│ Google Procurement API │
│ (Approve Entitlement) │
└─────────────────────────────────────┘
This flow happens when an admin configures the agent in Gemini Enterprise:
1. Admin configures agent in Gemini Enterprise
2. Gemini sends POST /dcr with software_statement JWT
3. Handler validates Google's JWT signature
4. Handler verifies order_id matches a provisioned entitlement
5. Handler calls GMA SSO API to create OAuth tenant client
6. Handler stores client mapping in PostgreSQL
7. Handler returns client_id, client_secret to Gemini
8. Gemini stores credentials for future OAuth flows
┌─────────────┐ ┌──────────────┐ ┌─────────────────┐ ┌────────────┐
│ Admin │────▶│ Gemini │────▶│ POST /dcr │────▶│ Handler │
│ Configures │ │ Enterprise │ │ software_stmt │ │ /dcr │
└─────────────┘ └──────────────┘ └─────────────────┘ └─────┬──────┘
│
┌──────────────────────────────────────────────┤
│ │
▼ ▼
┌─────────────────┐ ┌─────────────────┐
│ PostgreSQL │ │ Red Hat SSO │
│ (Check Order) │ │ (Create OAuth │
│ (Store Client)│ │ Client) │
└─────────────────┘ └─────────────────┘
Clients obtain access tokens directly from Red Hat SSO using their DCR-issued credentials. The agent does not participate in token issuance — it acts purely as a Resource Server.
1. Client authenticates directly with Red Hat SSO (e.g., client_credentials grant)
2. Red Hat SSO issues access token with api.console and api.ocm scopes
3. Client uses the token for A2A requests to the agent
This flow handles actual user interactions with the agent:
1. User sends query to / endpoint (A2A JSON-RPC)
2. JWT token validated against Red Hat SSO
3. Query passed to Agent Core
4. Agent processes query with Gemini
5. Agent calls MCP tools as needed
6. MCP sidecar queries Red Hat Insights APIs
7. Results aggregated and returned to user
src/lightspeed_agent/
├── api/ # Agent API layer
│ ├── app.py # FastAPI application factory (Agent)
│ └── a2a/ # A2A protocol
│ ├── router.py # A2A JSON-RPC endpoints
│ └── agent_card.py # AgentCard builder
├── auth/ # Authentication (shared)
│ ├── introspection.py # Token introspection (RFC 7662)
│ ├── middleware.py # Auth middleware
│ ├── dependencies.py # FastAPI dependencies
│ └── models.py # Auth data models
├── config/ # Configuration (shared)
│ └── settings.py # Pydantic settings
├── core/ # Agent core
│ └── agent.py # ADK agent definition
├── db/ # Database (shared)
│ ├── base.py # SQLAlchemy engine and Base
│ └── models.py # ORM models (accounts, entitlements, DCR clients, usage)
├── dcr/ # Dynamic Client Registration
│ ├── google_jwt.py # Google JWT validation
│ ├── gma_client.py # GMA SSO API client
│ ├── models.py # DCR Pydantic models
│ ├── repository.py # PostgreSQL repository
│ └── service.py # DCR business logic
├── marketplace/ # Marketplace Handler service
│ ├── app.py # Handler FastAPI app factory (port 8001)
│ ├── router.py # Hybrid /dcr endpoint (Pub/Sub + DCR)
│ ├── models.py # Marketplace Pydantic models
│ ├── repository.py # PostgreSQL repositories
│ ├── service.py # Procurement API integration
│ └── __main__.py # Entry point: python -m lightspeed_agent.marketplace
└── tools/ # MCP integration
├── mcp_config.py # MCP server configuration
├── mcp_headers.py # MCP auth headers
├── insights_tools.py # Insights tool wrappers
└── skills.py # Agent skills definition
| Image | Service | Port | Purpose |
|---|---|---|---|
lightspeed-agent |
Agent | 8000 (app), 8002 (probes) | A2A protocol, user queries |
marketplace-handler |
Handler | 8001 (app), 8003 (probes) | Pub/Sub events, DCR |
insights-mcp |
MCP Sidecar | 8081 | Red Hat Lightspeed tools |
| Service | Used By | Purpose | Required |
|---|---|---|---|
| Google Gemini | Agent | AI model for queries | Yes |
| Red Hat SSO | Both | User authentication, DCR | Yes |
| Red Hat Lightspeed MCP | Agent | Data access | Yes |
| PostgreSQL | Both | Data persistence | Yes (Production) |
| Google Cloud Pub/Sub | Handler | Marketplace events | Production |
| Google Procurement API | Handler | Entitlement approval, account validation | Production |
| Google Service Control | Agent | Usage reporting | Production |
- Both services are stateless and can scale horizontally
- State stored in PostgreSQL (shared by both services)
- Rate limits enforced via Redis (shared across replicas)
| Service | Min Instances | Max Instances | Notes |
|---|---|---|---|
| Marketplace Handler | 1 | 5 | Always running for Pub/Sub |
| Lightspeed Agent | 0 | 10 | Scale to zero when idle |
| Service | CPU | Memory | Notes |
|---|---|---|---|
| Marketplace Handler | 1 | 512Mi | Lightweight, event-driven |
| Lightspeed Agent | 2 | 2Gi | AI processing, MCP calls |
| MCP Sidecar | 0.5 | 256Mi | Red Hat Insights queries |
- Database connections pooled via SQLAlchemy
- HTTP connections to external services pooled via httpx
- Both services share the same PostgreSQL database
- A2A query endpoints require valid Bearer token from Red Hat SSO
- Tokens validated via Red Hat SSO introspection endpoint (RFC 7662)
- Required
api.consoleandapi.ocmscopes checked; returns 403 if missing or if token carries disallowed scopes
Certain endpoints must be publicly accessible per A2A protocol:
| Service | Endpoint | Port | Reason |
|---|---|---|---|
| Agent | /.well-known/agent.json |
8000 | A2A discovery (no auth per spec) |
| Handler | /dcr |
8001 | Pub/Sub push and DCR requests |
| Agent | /health, /ready |
8002 | Health probes (separate server, no auth) |
| Handler | /health, /ready |
8003 | Health probes (separate server, no auth) |
Both services are deployed with --allow-unauthenticated on Cloud Run.
Authentication is enforced at the application layer via OAuth middleware.
- Scope-based access control for authenticated endpoints
- Client ID extracted for usage tracking
- Organization ID used for multi-tenancy
- DCR requests validated via Google JWT signature
- Secrets stored in environment variables
- Production uses Google Secret Manager
- No secrets in code or configuration files
- DCR encryption key protects stored client secrets
- HTTPS enforced in production
- CORS disabled in production by default (server-to-server); configurable via
CORS_ALLOWED_ORIGINS - Rate limiting prevents abuse (runs before authentication to throttle unauthenticated floods)
- Request body size limits enforced via ASGI middleware (10 MB agent, 1 MB marketplace handler) to mitigate CWE-400 uncontrolled resource consumption
- Security headers on all responses (HSTS, X-Content-Type-Options, X-Frame-Options)
- AgentCard responses cached at the application level to reduce CPU cost under load
- Pub/Sub verification via message signature
The system uses PostgreSQL for persistence. For production deployments, the marketplace database (shared by both services) is separate from the session database (agent only).
┌─────────────────────────────────────────────────────────────────────────────┐
│ Marketplace Database (Shared) │
│ │
│ ┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐ │
│ │ marketplace_ │ │ marketplace_ │ │ dcr_clients │ │
│ │ accounts │ │ entitlements │ │ │ │
│ │ - id │ │ - id (order_id) │ │ - client_id │ │
│ │ - state │ │ - account_id │ │ - client_secret │ │
│ │ - provider_id │ │ - state │ │ - order_id │ │
│ └────────────────────┘ └────────────────────┘ └────────────────────┘ │
│ │
│ ┌────────────────────┐ │
│ │ usage_records │ │
│ │ - order_id │ │
│ │ - tokens │ │
│ │ - reported │ │
│ └────────────────────┘ │
│ │
│ Access: Marketplace Handler (read/write), Agent (read-only for validation) │
└─────────────────────────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────────────────────────┐
│ Session Database (Agent Only) │
│ │
│ ┌────────────────────┐ ┌────────────────────┐ ┌────────────────────┐ │
│ │ sessions │ │ events │ │ artifacts │ │
│ │ - session_id │ │ - event_id │ │ - artifact_id │ │
│ │ - user_id │ │ - session_id │ │ - session_id │ │
│ │ - state │ │ - content │ │ - content │ │
│ └────────────────────┘ └────────────────────┘ └────────────────────┘ │
│ │
│ Access: Agent only (read/write) │
└─────────────────────────────────────────────────────────────────────────────┘
| Variable | Service | Description |
|---|---|---|
DATABASE_URL |
Both | Marketplace database (accounts, orders, DCR clients) |
SESSION_DATABASE_URL |
Agent | Session database (ADK sessions). If empty, uses DATABASE_URL |
Status: Accepted
Context: Google Cloud Marketplace requires agents to implement DCR (RFC 7591) to create OAuth client credentials for each marketplace order. Options considered: (1) return tracking credentials without creating real OAuth clients, or (2) create actual OAuth clients in Red Hat SSO via the GMA SSO API.
Decision: Implement real DCR with Red Hat SSO via the GMA SSO API. Each order gets a real, functioning OAuth client with proper OAuth 2.0 flow and per-order isolation.
Consequences: Requires GMA API credentials (GMA_CLIENT_ID / GMA_CLIENT_SECRET) with api.iam.clients.gma scope. More complex setup but more robust architecture.
Status: Accepted
Context: Marketplace accounts, entitlements, DCR clients, and usage records need durable storage that survives container restarts and supports horizontal scaling.
Decision: Use PostgreSQL with SQLAlchemy async for all persistence.
Consequences: Adds SQLAlchemy and asyncpg dependencies. Enables horizontal scaling (multiple instances share state) and provides durability and auditability.
Status: Accepted
Context: Not all deployments have DCR enabled on Red Hat SSO, and development/testing environments may not need real DCR.
Decision: Make DCR mode configurable via DCR_ENABLED. When true (default), real OAuth clients are created via the GMA SSO API. When false, static credentials from environment variables are returned.
Consequences: Two code paths to maintain. Clear documentation needed for each mode. See Authentication for details.