Technical Guide for Production Deployment
This document describes how to configure Red Hat Build of Keycloak (RHBK) to provide JWT authentication for the Cost Management Metrics Operator with proper org_id claim support required by the ROS (Resource Optimization Service) backend.
graph TB
Operator["<b>Cost Management Operator</b><br/>Uploads metrics with JWT"]
Keycloak["<b>Red Hat Build of Keycloak (RHBK)</b><br/><br/>• Realm: kubernetes<br/>• Client: cost-management-operator<br/>• org_id claim mapper"]
Gateway["<b>Centralized API Gateway</b><br/>(Port 9080)<br/><br/>• JWT signature validation<br/>• Inject X-Rh-Identity header<br/>• Route to backend services"]
Ingress["<b>Ingress Service</b><br/>(Port 8081)<br/><br/>• Receive pre-authenticated requests<br/>• Extract org_id/account from X-Rh-Identity<br/>• Process upload<br/>• Publish to Kafka"]
Kafka["<b>Kafka</b><br/><br/>• Topic: platform.upload.ros<br/>• Message includes org_id"]
Backend["<b>ROS Backend Processor</b><br/><br/>• Consumes from Kafka<br/>• Creates XRHID header<br/>• Calls API with org_id"]
Operator -->|"① Get JWT<br/>(client_credentials)"| Keycloak
Operator -->|"② Upload<br/>Authorization: Bearer <JWT>"| Gateway
Gateway -->|"③ Validate JWT<br/>X-Rh-Identity: <base64>"| Ingress
Ingress -->|"④ Parse identity<br/>Publish message"| Kafka
Kafka -->|"⑤ Message with<br/>org_id metadata"| Backend
style Operator fill:#e1f5ff,stroke:#01579b,stroke-width:2px,color:#000
style Keycloak fill:#fff9c4,stroke:#f57f17,stroke-width:2px,color:#000
style Gateway fill:#fff59d,stroke:#333,stroke-width:2px,color:#000
style Ingress fill:#e8f5e9,stroke:#1b5e20,stroke-width:2px,color:#000
style Kafka fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000
style Backend fill:#fce4ec,stroke:#880e4f,stroke-width:2px,color:#000
Authentication Flow:
- Operator → Gateway: Authorization: Bearer <JWT> (Standard OAuth 2.0)
- Gateway → Backend Services: X-Rh-Identity: <base64-XRHID> (Pre-authenticated)
- Ingress → Kafka: Message with org_id extracted from X-Rh-Identity
- Backend Processor → Cost Management On-Premise API: X-Rh-Identity: <base64-XRHID> (XRHID-based auth)

Key Points:
- Centralized Gateway validates JWT signature and injects X-Rh-Identity header for all backend services
- Backend Services (Ingress, Koku API, ROS API) receive pre-authenticated requests with X-Rh-Identity
- Sources API is now integrated into Koku API at /api/cost-management/v1/sources/
- XRHID Format: {"org_id":"...","identity":{"org_id":"...","account_number":"...","type":"User"}} (base64-encoded)
- The org_id claim from JWT is required and used throughout the system
⚠️ Important: Dual org_id Placement

The X-Rh-Identity header includes org_id in two locations:
- Top-level: {"org_id": "org1234567", ...} - Workaround for Koku dev_middleware bug
- Inside identity: {"identity": {"org_id": "org1234567", ...}} - Correct location per Red Hat schema

This dual placement is required because Koku's dev_middleware.py (line 63) incorrectly reads identity_header.get("org_id") instead of identity_header.get("identity", {}).get("org_id"). The Envoy Lua filter places org_id in both locations for compatibility.
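The dual placement can be reproduced by hand, for example to test a backend service without going through the gateway. This is a minimal shell sketch under stated assumptions, not the gateway's implementation: the helper name build_xrhid and the sample values are hypothetical, and the JSON layout follows the XRHID format described above.

```shell
# Hypothetical helper: build an X-Rh-Identity value with org_id in both locations.
# Arguments: org_id, account_number. No JSON escaping is performed, so the
# values must not contain quotes or backslashes.
build_xrhid() {
  printf '{"org_id":"%s","identity":{"org_id":"%s","account_number":"%s","type":"User"}}' \
    "$1" "$1" "$2" | base64 | tr -d '\n'
}

XRHID=$(build_xrhid "org1234567" "7890123")

# Round-trip check: decode the header value back to JSON
printf '%s' "$XRHID" | base64 -d
```

The resulting value could then be sent as the X-Rh-Identity header in a manual request to a backend service.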
1. JWT Token must contain:
   - Standard OIDC claims (sub, iat, exp, iss, aud)
   - org_id claim (String) - REQUIRED by ROS backend for organization identification
   - account_number claim (String) - RECOMMENDED for account-level data isolation and tenant identification (the gateway falls back to the org_id value when it is missing)
   - email claim (String) - REQUIRED by Koku for user creation (if not provided, defaults to username@example.com)
   - preferred_username claim (String) - Recommended for user display name

2. Supported org_id Claim Names (Envoy Lua filter supports multiple alternatives):
   - org_id (preferred)
   - organization_id (fallback)
   - tenant_id (second fallback)

3. Supported account_number Claim Names (Gateway Lua filter supports multiple alternatives):
   - account_number (preferred)
   - account_id (fallback)
   - account (second fallback)

   Implementation Reference: See the Lua filter section of cost-onprem/templates/gateway/configmap-envoy.yaml

4. Keycloak Configuration:
   - Service account client (client_credentials grant type)
   - Hardcoded claim mapper for org_id (REQUIRED)
   - Hardcoded claim mapper for account_number (RECOMMENDED)
   - Proper audience and scope configuration

5. Operator Configuration:
   - Secret with client_id and client_secret
   - Token URL pointing to Keycloak realm

6. UI/Interactive User Configuration (for Cost Management UI access):
   - Uses ENHANCED_ORG_ADMIN mode - all authenticated users are org admins
   - No access claim required in JWT
   - Simplified setup with full access within each user's org
The Cost Management on-premise deployment uses ENHANCED_ORG_ADMIN mode for simplified authentication:
| Setting | Value | Purpose |
|---|---|---|
| DEVELOPMENT | false | Use production middleware (fixes Mock user bugs) |
| ENHANCED_ORG_ADMIN | true | Bypass external RBAC service for admin users |
| is_org_admin | true (in X-Rh-Identity) | Grant full access within user's org |
How it works:
- Envoy validates JWT and extracts org_id, account_number, and username
- Envoy constructs X-Rh-Identity header with is_org_admin: true
- Koku receives the request and sees is_org_admin=true
- With ENHANCED_ORG_ADMIN=true, Koku skips RBAC service calls
- User gets full access to all resources within their org
Benefits:
- ✅ No external RBAC service required
- ✅ No access claim needed in JWT or Keycloak user attributes
- ✅ Simpler Keycloak configuration
- ✅ Production middleware (no Mock user bugs)

Limitations:
- ⚠️ All authenticated users are org admins (no granular RBAC within org)
- ⚠️ Cannot restrict "User A sees only Cluster X" within same org
- ✅ Multi-tenancy IS preserved: users only see their own org's data
When to use:
- Single-tenant deployments
- Trusted-user environments
- Deployments where all users should have full access
For deployments requiring granular permissions within an org, a future enhancement could:
- Deploy a custom RBAC service
- Set ENHANCED_ORG_ADMIN=false
- Add access claim to JWT with resource-level permissions
- Configure Koku to call the RBAC service
Resource Types (for future RBAC implementation):
| Resource Type | Description |
|---|---|
| openshift.cluster | OpenShift cluster access |
| openshift.project | OpenShift project/namespace access |
| openshift.node | OpenShift node access |
| cost_model | Cost model read/write access |
Note: AWS, Azure, and GCP providers are not currently supported in on-prem deployments.
To create a user that can access Cost Management UI, you must configure the following user attributes in Keycloak:
Required User Attributes:
| Attribute | Type | Description | Example |
|---|---|---|---|
| org_id | String | Tenant identifier (maps to database schema) | org1234567 |
| account_number | String | Customer account identifier | 7890123 |

Note: The access attribute is NOT required when using ENHANCED_ORG_ADMIN mode. All authenticated users are treated as org admins with full access within their org.
Creating a User via Keycloak Admin Console:
1. Navigate to Users → Add User
2. Fill in username, email, first/last name
3. Go to Attributes tab
4. Add the following attributes:

   | Key | Value |
   |---|---|
   | org_id | org1234567 |
   | account_number | 7890123 |

5. Go to Credentials tab and set a password
6. Ensure user is enabled
Creating a User via Keycloak Admin API:
curl -X POST "$KEYCLOAK_URL/admin/realms/kubernetes/users" \
-H "Authorization: Bearer $ADMIN_TOKEN" \
-H "Content-Type: application/json" \
-d '{
"username": "cost-user",
"email": "cost-user@example.com",
"emailVerified": true,
"enabled": true,
"firstName": "Cost",
"lastName": "User",
"attributes": {
"org_id": ["org1234567"],
"account_number": ["7890123"]
}
}'

Note: The org_id currently requires the org prefix (e.g., org1234567 instead of 1234567) as a workaround for a Koku schema naming bug. This will be addressed in a future release.
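When user attributes are set from a script, the prefix rule can be applied before the value is sent to Keycloak. The helper below is a small sketch, not part of the product: the name normalize_org_id is hypothetical, and it assumes that any value not already starting with org simply needs the prefix added.

```shell
# Hypothetical helper: add the "org" prefix only when it is missing.
normalize_org_id() {
  case "$1" in
    org*) printf '%s\n' "$1" ;;
    *)    printf 'org%s\n' "$1" ;;
  esac
}

# Both calls print the same prefixed identifier
normalize_org_id "1234567"
normalize_org_id "org1234567"
```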
The authentication flow uses a centralized API gateway that handles JWT validation for all external traffic:
Service: cost-onprem-gateway (Port 9080)
Location: cost-onprem/templates/gateway/configmap-envoy.yaml
Purpose: Single point of authentication for all external API traffic. Validates JWT tokens, extracts claims, and injects X-Rh-Identity header for all backend services.
Authentication Flow (Sequence Diagram):
sequenceDiagram
participant Operator as Cost Management<br/>Operator
participant Gateway as API Gateway<br/>(Port 9080)
participant Keycloak as Keycloak<br/>JWKS Endpoint
participant Lua as Lua Filter
participant Backend as Backend Services<br/>(Ingress, Koku, ROS, Sources)
Note over Operator,Backend: Step 1: Request with JWT
Operator->>Gateway: POST /api/ingress/v1/upload<br/>Authorization: Bearer <JWT>
Note over Gateway,Keycloak: Step 2: JWT Validation
Gateway->>Keycloak: GET /auth/realms/kubernetes/protocol/openid-connect/certs
Keycloak-->>Gateway: JWKS (public keys)
Note over Gateway: jwt_authn filter:<br/>- Validates JWT signature<br/>- Verifies issuer<br/>- Verifies audience<br/>- Extracts payload
alt JWT Invalid
Gateway-->>Operator: 401 Unauthorized<br/>"Jwt verification fails"
else JWT Valid
Note over Gateway: Store JWT payload in metadata<br/>Key: "keycloak"
Note over Gateway,Lua: Step 3: Transform JWT to XRHID
Gateway->>Lua: envoy_on_request()
Note over Lua: Extract claims:<br/>- org_id (or fallbacks)<br/>- account_number (or fallbacks)<br/>- user_id (sub)
Note over Lua: Build XRHID JSON:<br/>{"identity":{"org_id":"...","account_number":"...","type":"User"}}
Note over Lua: Base64 encode XRHID
Lua-->>Gateway: Modified request with X-Rh-Identity header
Note over Gateway,Backend: Step 4: Route to Backend
Gateway->>Backend: POST /api/ingress/v1/upload<br/>X-Rh-Identity: <base64-XRHID>
Backend-->>Gateway: 202 Accepted
Gateway-->>Operator: 202 Accepted
end
Gateway JWT Validation Steps:
- Receives request with Authorization: Bearer <JWT> header
- Validates JWT signature against Keycloak JWKS endpoint (cached for 5 minutes)
- Verifies issuer matches Keycloak realm URL
- Verifies audience contains expected client ID (e.g., cost-management-operator)
- Stores validated JWT payload in Envoy metadata under key keycloak for Lua filter access
Gateway Lua Filter: Transforms JWT to XRHID format
1. Retrieves JWT payload from Envoy metadata:

   local metadata = request_handle:streamInfo():dynamicMetadata()
   local jwt_data = metadata:get("envoy.filters.http.jwt_authn")
   local payload = jwt_data["keycloak"]

2. Extracts org_id with fallback logic:

   -- Tries: org_id → organization_id → tenant_id
   local org_id = get_claim(payload, "org_id", "organization_id", "tenant_id")
   -- Default: "1" if missing (with warning)

3. Extracts account_number with fallback logic:

   -- Tries: account_number → account_id → account
   local account_number = get_claim(payload, "account_number", "account_id", "account")
   -- Default: org_id value if missing

4. Builds XRHID JSON structure (Red Hat Identity format with dual org_id):

   -- NOTE: org_id appears in TWO places as a workaround for Koku middleware compatibility
   -- 1. Top-level "org_id" - for middleware compatibility
   -- 2. identity.org_id - correct location per Red Hat identity schema
   -- is_org_admin=true triggers ENHANCED_ORG_ADMIN bypass (no RBAC service calls)
   local xrhid = string.format(
     '{"org_id":"%s","identity":{"org_id":"%s","account_number":"%s","type":"User","user":{"username":"%s","is_org_admin":true}},"entitlements":{"cost_management":{"is_entitled":true}}}',
     org_id, org_id, account_number, username
   )

5. Base64 encodes the XRHID JSON and injects headers:

   local b64_xrhid = request_handle:base64Escape(xrhid)
   request_handle:headers():add("X-Rh-Identity", b64_xrhid)
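The fallback order and defaults from the extraction steps above can be traced outside Envoy with a few lines of shell. This is an illustrative sketch only: first_nonempty and the claim_* variables are hypothetical names, the sample claim values are made up, and the actual logic lives in the Lua filter.

```shell
# Hypothetical helper: print the first non-empty argument, mirroring the
# order in which the claim names are tried.
first_nonempty() {
  for value in "$@"; do
    if [ -n "$value" ]; then
      printf '%s\n' "$value"
      return 0
    fi
  done
  return 1
}

# Sample claim values (assumptions): org_id is absent, organization_id is set,
# and no account claim is present at all.
claim_org_id=""
claim_organization_id="org1234567"
claim_tenant_id=""

# org_id -> organization_id -> tenant_id, then the documented default "1"
org_id=$(first_nonempty "$claim_org_id" "$claim_organization_id" "$claim_tenant_id" "1")

# account_number -> account_id -> account, then fall back to the org_id value
account_number=$(first_nonempty "" "" "" "$org_id")

echo "org_id=$org_id account_number=$account_number"
```

With these sample values the second candidate wins for org_id, and account_number inherits that value because none of its own claims are set.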
Header Injection:
- X-Rh-Identity (REQUIRED): Base64-encoded XRHID JSON used by all backend services for:
  - Authentication and authorization
  - Multi-tenancy (org_id and account_number extraction)
  - Database query filtering
  - Audit logging
Gateway Routing: The gateway routes requests to backend services based on URL path:
| Path | Backend Service | Port |
|---|---|---|
| /api/ingress/* | Ingress | 8081 |
| /api/cost-management/v1/recommendations/openshift | ROS API | 8000 |
| /api/cost-management/* (GET, HEAD) | Koku API Reads | 8000 |
| /api/cost-management/* (POST, PUT, DELETE, PATCH) | Koku API Writes | 8000 |
Note: Sources API is now integrated into Koku at /api/cost-management/v1/sources/
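The routing table can be read as a first-match lookup on path and then method. The sketch below is one way to express that reading in shell; route_backend and the labels it prints are hypothetical, and it assumes the more specific recommendations path is evaluated before the general /api/cost-management/* prefix.

```shell
# Hypothetical helper: map a request path and method to a backend from the table.
# The labels printed here are illustrative, not actual service names.
route_backend() {
  path=$1
  method=$2
  case "$path" in
    /api/ingress/*)
      echo "ingress:8081" ;;
    /api/cost-management/v1/recommendations/openshift*)
      echo "ros-api:8000" ;;
    /api/cost-management/*)
      case "$method" in
        GET|HEAD) echo "koku-api-reads:8000" ;;
        POST|PUT|DELETE|PATCH) echo "koku-api-writes:8000" ;;
        *) echo "unrouted"; return 1 ;;
      esac ;;
    *)
      echo "unrouted"; return 1 ;;
  esac
}

route_backend "/api/ingress/v1/upload" "POST"
route_backend "/api/cost-management/v1/sources/" "GET"
```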
Request Flow:
Cost Management Operator
↓ Authorization: Bearer <JWT>
Centralized Gateway (port 9080)
↓ Validates JWT, Transforms to XRHID
↓ Routes based on path
↓ X-Rh-Identity: <base64-encoded-JSON>
Backend Service (Ingress, Koku, ROS, Sources)
↓ Decodes XRHID, extracts org_id
↓ Processes request with tenant isolation
All backend services receive pre-authenticated requests from the gateway with the X-Rh-Identity header.
Services:
- Ingress (port 8081): File upload processing
- Koku API (port 8000): Cost management read/write operations
- ROS API (port 8000): Resource optimization recommendations
- Sources API (port 8000): Provider and source management
Authentication:
- All services use X-Rh-Identity header from gateway
- Services decode base64 XRHID and extract org_id, account_number
- Multi-tenancy enforced via database query filtering
Benefits of Centralized Gateway:
- ✅ Single authentication point: All JWT validation in one place
- ✅ Simplified architecture: No per-service Envoy sidecars needed
- ✅ Easy debugging: All authentication logs in gateway
- ✅ Consistent security: Same authentication for all APIs
- ✅ Performance: JWT validation cached at gateway level
Complete End-to-End Flow:
1. Cost Management Operator → Keycloak
Request: client_credentials grant
Response: JWT token with org_id and account_number claims
2. Operator → Gateway (port 9080)
Request: Authorization: Bearer <JWT>
Gateway: Validates JWT signature, extracts payload
3. Gateway → Backend Service
Headers: X-Rh-Identity: <base64-XRHID>
Service: Decodes XRHID, extracts org_id
Processes request with tenant isolation
4. For Ingress uploads:
Ingress → Kafka: Message with org_id metadata
Kafka → Processors: Message consumed
Processors → APIs: Internal calls with X-Rh-Identity
5. For API requests:
Gateway → Koku/ROS/Sources: Pre-authenticated request
Service → Database: Query with org_id filter
Summary:
- Gateway validates all JWT tokens from external traffic
- Gateway injects X-Rh-Identity header for all backend services
- Backend services use X-Rh-Identity for multi-tenant database queries
- No per-service Envoy sidecars - centralized authentication at gateway
- OpenShift cluster with admin access (version 4.14 or later)
- Cluster admin permissions
- oc CLI installed and logged in
Follow the official Red Hat documentation to install RHBK on OpenShift:
📖 Official Documentation:
Quick Installation Steps:
- Install the Red Hat Build of Keycloak Operator from OperatorHub
- Create a namespace for RHBK (e.g., keycloak)
- Deploy a Keycloak instance
- Create a Keycloak realm
For testing or development environments, use the provided automation script:
cd scripts/
./deploy-rhbk.sh

This script automates the operator installation and basic configuration.
Verify that RHBK is running:
# Check Keycloak instance status
oc get keycloak -n keycloak
# Get Keycloak URL
oc get keycloak keycloak -n keycloak -o jsonpath='{.status.hostname}'
# Get admin credentials (auto-generated by RHBK operator)
oc get secret keycloak-initial-admin -n keycloak \
-o jsonpath='{.data.username}' | base64 -d
oc get secret keycloak-initial-admin -n keycloak \
-o jsonpath='{.data.password}' | base64 -d

This section shows how to configure an existing Red Hat Build of Keycloak (RHBK) instance to work with the Cost Management Operator.
The Cost Management Operator requires:
- Realm: A Keycloak realm (e.g., kubernetes)
- Client: A service account client with specific configuration
- org_id Claim Mapper: Critical for ROS backend compatibility
If you don't already have a realm, create one using a KeycloakRealm CR:
apiVersion: k8s.keycloak.org/v2alpha1
kind: KeycloakRealm
metadata:
  name: kubernetes-realm
  namespace: keycloak
  labels:
    app: keycloak
spec:
  realm:
    id: kubernetes
    realm: kubernetes
    enabled: true
    displayName: "Kubernetes Realm"
    accessTokenLifespan: 300
    bruteForceProtected: true
    failureFactor: 30
    maxFailureWaitSeconds: 900
    maxDeltaTimeSeconds: 43200
    registrationAllowed: false
    rememberMe: true
    resetPasswordAllowed: true
    verifyEmail: false
    clientScopes:
      - name: api.console
        description: "API Console access scope for cost management"
        protocol: openid-connect
        attributes:
          include.in.token.scope: "true"
          display.on.consent.screen: "false"
    defaultDefaultClientScopes:
      - api.console
  instanceSelector:
    matchLabels:
      app: keycloak

📝 Configuration Notes:
- accessTokenLifespan: 300 - JWT tokens expire after 5 minutes
- bruteForceProtected: true - Protects against brute force attacks
- registrationAllowed: false - Disable self-registration for security
- clientScopes: Defines the api.console scope at the realm level
  - include.in.token.scope: "true" - Includes this scope in the token's scope claim
  - display.on.consent.screen: "false" - Don't show to users (service account clients)
- defaultDefaultClientScopes: Automatically includes api.console in all clients
  - This makes the api.console scope available to all clients in this realm by default
  - Clients can still explicitly reference it in their defaultClientScopes array

ℹ️ Session Configuration Note: The Red Hat Build of Keycloak (RHBK) v2alpha1 API does not support clientSessionMaxLifespan or ssoSessionMaxLifespan fields in the KeycloakRealm CRD. If you need to configure session timeouts beyond the access token lifespan, you must set them via:
- The Keycloak Admin Console UI (Realm Settings → Sessions)
- The Keycloak Admin REST API

The accessTokenLifespan setting controls how long JWT tokens remain valid.
Apply the realm:
oc apply -f keycloak-realm.yaml -n keycloak
# Wait for realm to be ready
oc wait --for=condition=ready keycloakrealm/kubernetes-realm -n keycloak --timeout=120s

The org_id claim is REQUIRED by the ROS backend. Without it, all uploads will be rejected.
The ROS backend (cost-onprem-backend) requires the org_id claim to:
- Identify which organization the data belongs to
- Enforce multi-tenancy boundaries
- Route data to correct storage partitions
- Apply organization-specific policies
Create a KeycloakClient CR with the org_id mapper included:
apiVersion: k8s.keycloak.org/v2alpha1
kind: KeycloakClient
metadata:
  name: cost-management-service-account
  namespace: keycloak
  labels:
    app: keycloak
spec:
  client:
    clientId: cost-management-operator
    secret:
      name: keycloak-client-secret-cost-management-service-account
    publicClient: false
    serviceAccountsEnabled: true
    protocol: openid-connect
    defaultClientScopes:
      - openid
      - profile
      - email
      - api.console
    protocolMappers:
      - name: org-id-mapper
        protocol: openid-connect
        protocolMapper: oidc-hardcoded-claim-mapper
        config:
          access.token.claim: "true"
          claim.name: org_id
          claim.value: "12345"
          id.token.claim: "false"
          jsonType.label: String
          userinfo.token.claim: "false"
      - name: account-number-mapper
        protocol: openid-connect
        protocolMapper: oidc-hardcoded-claim-mapper
        config:
          access.token.claim: "true"
          claim.name: account_number
          claim.value: "7890123"
          id.token.claim: "false"
          jsonType.label: String
          userinfo.token.claim: "false"
      - name: audience-mapper
        protocol: openid-connect
        protocolMapper: oidc-audience-mapper
        config:
          access.token.claim: "true"
          id.token.claim: "false"
          included.client.audience: cost-management-operator
      - name: client-id-mapper
        protocol: openid-connect
        protocolMapper: oidc-usersessionmodel-note-mapper
        config:
          access.token.claim: "true"
          claim.name: clientId
          id.token.claim: "true"
          user.session.note: clientId
      - name: api-console-mock
        protocol: openid-connect
        protocolMapper: oidc-hardcoded-claim-mapper
        config:
          access.token.claim: "true"
          claim.name: scope
          claim.value: api.console
          id.token.claim: "false"
  realmSelector:
    matchLabels:
      app: keycloak

📝 Important Configuration Notes:
- Change claim.value: "12345" to your actual organization ID in the org-id-mapper
- Change claim.value: "7890123" to your actual account number in the account-number-mapper (optional)
- api.console scope: Included in defaultClientScopes and added via api-console-mock mapper
- Labels: Use app: keycloak labels, as in the manifest above, so that they match the Keycloak instance selector
- Multi-Organization Support: The system will extract org_id from the clientId claim
  - The clientId is automatically included in JWT tokens by the client-id-mapper
  - Backend services can parse the clientId to derive the organization identifier
  - This allows flexible multi-tenancy without hardcoding org_id values
Protocol Mappers Explained:
- org-id-mapper: Adds org_id claim (REQUIRED by ROS backend) - for explicit org identification
- account-number-mapper: Adds account_number claim (recommended for tenant identification)
- audience-mapper: Adds audience validation for JWT
- client-id-mapper: Adds clientId claim to tokens - used for org_id extraction in multi-tenant setups
- api-console-mock: Adds api.console to the scope claim (required for OpenShift integration)

Multi-Organization Architecture:
- The clientId claim can be used to derive org_id dynamically
- Example: clientId: "cost-management-operator-org123" → org_id: "org123"
- This eliminates the need for separate Keycloak clients per organization
- Backend services parse the clientId to determine the organization context
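The clientId-to-org_id derivation in the example above amounts to stripping a fixed prefix. A minimal shell sketch of that idea follows; the function name org_id_from_client_id is hypothetical, and the prefix cost-management-operator- is an assumption taken from the example, not a confirmed backend convention.

```shell
# Hypothetical helper: strip an assumed fixed prefix from the clientId.
# Prints nothing and returns 1 when the clientId carries no org suffix.
org_id_from_client_id() {
  prefix="cost-management-operator-"
  case "$1" in
    "$prefix"?*) printf '%s\n' "${1#"$prefix"}" ;;
    *) return 1 ;;
  esac
}

org_id_from_client_id "cost-management-operator-org123"
```

Returning a non-zero status for a clientId without a suffix lets a caller fall back to the explicit org_id claim instead of guessing.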
# Save the YAML above to a file
vi cost-management-client.yaml
# Apply the KeycloakClient CR
oc apply -f cost-management-client.yaml -n keycloak
# Wait for client to be ready
oc wait --for=condition=ready keycloakclient/cost-management-service-account -n keycloak --timeout=120s
# Verify the client was created
oc get keycloakclient -n keycloak cost-management-service-account
# Verify the client secret was created
oc get secret keycloak-client-secret-cost-management-service-account -n keycloak
# Get the client secret value
CLIENT_SECRET=$(oc get secret keycloak-client-secret-cost-management-service-account -n keycloak \
-o jsonpath='{.data.CLIENT_SECRET}' | base64 -d)
echo "Client Secret: $CLIENT_SECRET"

If you already have a client without org_id, patch it:
# Set your organization ID
ORG_ID="1" # Change to your actual org_id
# Patch the existing KeycloakClient
oc patch keycloakclient cost-management-service-account -n keycloak --type=json -p='[
{
"op": "add",
"path": "/spec/client/protocolMappers/-",
"value": {
"name": "org-id-mapper",
"protocol": "openid-connect",
"protocolMapper": "oidc-hardcoded-claim-mapper",
"config": {
"claim.name": "org_id",
"claim.value": "'$ORG_ID'",
"jsonType.label": "String",
"access.token.claim": "true",
"id.token.claim": "false"
}
}
}
]'
# Wait for Keycloak to reconcile
sleep 10

If you prefer to use the Keycloak web UI:
1. Log into Keycloak Admin Console

   # Get URL and credentials
   KEYCLOAK_URL=$(oc get keycloak keycloak -n keycloak -o jsonpath='{.status.hostname}')
   echo "Admin Console: https://$KEYCLOAK_URL/admin/"

2. Navigate to the Client
   - Realms → kubernetes
   - Clients → cost-management-operator
   - Mappers tab

3. Create org_id Mapper
   - Click "Create"
   - Name: org-id-mapper
   - Mapper Type: Hardcoded claim
   - Token Claim Name: org_id
   - Claim value: 1 (your organization ID)
   - Claim JSON Type: String
   - Add to ID token: OFF
   - Add to access token: ON ✅
   - Add to userinfo: OFF
   - Click "Save"
The Cost Management On-Premise Helm chart needs to know how to reach Keycloak and validate its TLS certificate. The chart provides intelligent defaults with automatic fallback to minimize manual configuration.
Automatic Discovery (Default):
# No jwt_auth configuration needed!
# The chart will auto-discover Keycloak from the cluster

Manual Override (External Keycloak):
jwt_auth:
  keycloak:
    url: "https://keycloak.external-company.com"
    realm: "production"

Logic:
- ✅ IF jwtAuth.keycloak.url is specified → Use that URL
- ✅ IF NOT specified → Auto-discover from cluster:
  - Search for Keycloak Custom Resources
  - Find Routes in keycloak or keycloak-system namespaces
  - Construct URL from service discovery
Automatic Fetching (Default):
jwt_auth:
  keycloak:
    url: "https://keycloak.example.com"
    # No tls.caCert needed - will be dynamically fetched

Manual Override (Production/Air-gapped):
jwt_auth:
  keycloak:
    url: "https://keycloak.example.com"
    tls:
      caCert: |
        -----BEGIN CERTIFICATE-----
        MIIDXTCCAkWgAwIBAgIJAKLnUhVP3GVDMA0GCSqGSIb3...
        -----END CERTIFICATE-----

Logic:
- ✅ IF jwtAuth.keycloak.tls.caCert is provided → Use that CA (skip dynamic fetch)
- ✅ IF NOT provided → Dynamically fetch from Keycloak endpoint during pod initialization
  - Fetches entire certificate chain
  - Combines with system CA bundle and OpenShift CAs
  - Gracefully degrades if fetch fails (uses system CAs only)
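The combine-and-degrade behaviour described above can be pictured as concatenating whichever CA sources are present. The following is a rough sketch of that idea, not the chart's init container: build_ca_bundle is a hypothetical name, and the demo files are placeholders rather than real certificates.

```shell
# Hypothetical helper: concatenate every CA source that exists into one bundle.
# Missing or empty sources are skipped, so a failed fetch leaves the other CAs usable.
build_ca_bundle() {
  out=$1
  shift
  : > "$out"
  for src in "$@"; do
    if [ -s "$src" ]; then
      cat "$src" >> "$out"
    fi
  done
  grep -c "BEGIN CERTIFICATE" "$out"
}

# Demo with placeholder files (not real certificates)
workdir=$(mktemp -d)
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'system' '-----END CERTIFICATE-----' > "$workdir/system.crt"
printf '%s\n' '-----BEGIN CERTIFICATE-----' 'service' '-----END CERTIFICATE-----' > "$workdir/service-ca.crt"

# keycloak.crt is absent here, simulating a failed dynamic fetch
cert_count=$(build_ca_bundle "$workdir/bundle.crt" \
  "$workdir/system.crt" "$workdir/service-ca.crt" "$workdir/keycloak.crt")
echo "certificates in bundle: $cert_count"
```

Because the missing file is skipped rather than treated as an error, the bundle still contains the two remaining CAs.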
# openshift-values.yaml
# NO jwt_auth configuration needed!

What happens:
- Keycloak URL: Auto-discovered from cluster ✅
- Keycloak CA: Auto-injected by OpenShift service CA + dynamic fetch ✅
- Realm: Defaults to redhat-external
Confidence: 95%+ - This is the recommended approach for local Keycloak.
jwt_auth:
  keycloak:
    url: "https://auth.company.com" # Uses Let's Encrypt
    realm: "production"

What happens:
- Keycloak URL: Uses specified URL ✅
- Keycloak CA: System CA bundle already trusts Let's Encrypt ✅
- Dynamic fetch provides redundancy
Confidence: 85-90% - Works reliably with public CAs.
jwt_auth:
  keycloak:
    url: "https://keycloak.dev.external.com"
    realm: "development"
    # No tls.caCert - will attempt dynamic fetch

What happens:
- Keycloak URL: Uses specified URL ✅
- Keycloak CA: Dynamically fetched from endpoint ⚠️
  - Requires: Network egress from pods
  - Requires: DNS resolution of external hostname
  - 10-second timeout for fetch
Confidence: 70-80% - Works if network allows egress. Test thoroughly.
jwt_auth:
  keycloak:
    url: "https://keycloak.prod.external.com"
    realm: "production"
    tls:
      caCert: |
        -----BEGIN CERTIFICATE-----
        MIIDXTCCAkWgAwIBAgIJAKLnUhVP3GVDMA0GCSqGSIb3DQEBCwUAMEUxCzAJBgNV
        BAYTAlVTMRMwEQYDVQQIDApDYWxpZm9ybmlhMRYwFAYDVQQHDA1TYW4gRnJhbmNp
        ... (full certificate) ...
        -----END CERTIFICATE-----

What happens:
- Keycloak URL: Uses specified URL ✅
- Keycloak CA: Uses manually provided certificate ✅
- No external dependency during pod startup
- No network requirements
- Predictable behavior
Confidence: 95-99% - Recommended for production external Keycloak.
How to get the CA certificate:
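# From your local machine or bastion host
# -showcerts prints every certificate the server sends; sed keeps all PEM blocks
# (piping into "openssl x509" would keep only the first one, the server's leaf certificate)
echo | openssl s_client -connect keycloak.prod.external.com:443 -showcerts 2>/dev/null | \
sed -n '/-----BEGIN CERTIFICATE-----/,/-----END CERTIFICATE-----/p' > keycloak-ca.crt
# Verify it's valid (shows the first certificate in the file)
openssl x509 -in keycloak-ca.crt -noout -text
# Copy the contents into values.yaml
cat keycloak-ca.crt

jwt_auth:
  keycloak:
    url: "https://keycloak.internal"
    realm: "production"
    tls:
      caCert: |
        -----BEGIN CERTIFICATE-----
        ... (REQUIRED - must be provided manually) ...
        -----END CERTIFICATE-----

What happens: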
- Keycloak URL: Uses specified URL ✅
- Keycloak CA: Uses manually provided certificate ✅
- Dynamic fetch will fail (no external access) but manual CA prevents issues
Confidence: 95-99% - Manual CA is mandatory for air-gapped deployments.
| Environment | Keycloak Location | Recommended Configuration | Confidence |
|---|---|---|---|
| Development | Local (OpenShift) | Zero config | 95%+ |
| Development | External, Public CA | URL only | 85-90% |
| Development | External, Self-Signed | URL + dynamic CA | 70-80% |
| Production | Local (OpenShift) | Zero config | 95%+ |
| Production | External, Public CA | URL only | 85-90% |
| Production | External, Self-Signed | URL + manual CA | 95-99% |
| Air-gapped | Any | URL + manual CA | 95-99% |
Check what URL is being used:
kubectl get configmap -n cost-onprem cost-onprem-gateway-envoy-config -o yaml | grep issuer

Check CA bundle contents:
# Number of certificates in bundle
kubectl exec -n cost-onprem deploy/cost-onprem-gateway -c envoy -- \
cat /etc/ca-certificates/ca-bundle.crt | grep -c "BEGIN CERTIFICATE"
# Check init container logs (case-insensitive so "Successfully fetched" also matches)
kubectl logs -n cost-onprem deploy/cost-onprem-gateway -c prepare-ca-bundle | grep -iE "(adding|fetched)"

Expected output:
📋 Adding system CA bundle...
📋 Adding Kubernetes service account CA...
📋 Adding OpenShift service CA...
✅ Successfully fetched Keycloak certificate chain (2 certificates)
- Configuration Behavior Details - Complete behavior reference
- TLS Certificate Options - Detailed CA configuration options
- External Keycloak Scenario - Architecture and troubleshooting
- Confidence Assessment - Risk analysis for dynamic CA fetch
# Get Keycloak URL
KEYCLOAK_URL=$(oc get keycloak keycloak -n keycloak -o jsonpath='{.status.hostname}')
# Get client credentials
CLIENT_ID="cost-management-operator"
CLIENT_SECRET=$(oc get secret keycloak-client-secret-cost-management-service-account \
-n keycloak -o jsonpath='{.data.CLIENT_SECRET}' | base64 -d)
# Get JWT token
TOKEN=$(curl -k -s -X POST \
"https://${KEYCLOAK_URL}/auth/realms/kubernetes/protocol/openid-connect/token" \
-d "grant_type=client_credentials" \
-d "client_id=${CLIENT_ID}" \
-d "client_secret=${CLIENT_SECRET}" \
| jq -r '.access_token')
# JWT segments are base64url-encoded without padding, so convert them
# to standard padded base64 before decoding
decode_segment() {
  seg=$(printf '%s' "$1" | tr '_-' '/+')
  case $(( ${#seg} % 4 )) in
    2) seg="${seg}==" ;;
    3) seg="${seg}=" ;;
  esac
  printf '%s' "$seg" | base64 -d
}
JWT_HEADER=$(echo "$TOKEN" | cut -d'.' -f1)
JWT_PAYLOAD=$(echo "$TOKEN" | cut -d'.' -f2)
# Decode JWT and check for org_id
echo "JWT Header:"
decode_segment "$JWT_HEADER" | jq .
echo ""
echo "JWT Payload:"
decode_segment "$JWT_PAYLOAD" | jq .
echo ""
echo "org_id claim:"
decode_segment "$JWT_PAYLOAD" | jq -r '.org_id'
echo ""
echo "account_number claim:"
decode_segment "$JWT_PAYLOAD" | jq -r '.account_number'

Expected Output:
{
  "exp": 1760628776,
  "iat": 1760628476,
  "jti": "5a1e42a0-6de5-4722-af84-de7170f2b4b0",
  "iss": "https://keycloak-keycloak.apps.example.com/auth/realms/kubernetes",
  "aud": "cost-management-operator",
  "sub": "27f3c0e2-37c3-4207-9adc-691351165d9b",
  "typ": "Bearer",
  "azp": "cost-management-operator",
  "scope": "api.console email profile",
  "org_id": "12345", <-- MUST BE PRESENT (REQUIRED)
  "account_number": "7890123", <-- RECOMMENDED FOR ACCOUNT ISOLATION
  "clientId": "cost-management-operator",
  "email_verified": false,
  "clientHost": "192.168.122.217",
  "preferred_username": "service-account-cost-management-operator",
  "clientAddress": "192.168.122.217"
}

Create the authentication secret in the operator namespace:
# Get credentials
CLIENT_ID="cost-management-operator"
CLIENT_SECRET=$(oc get secret keycloak-client-secret-cost-management-service-account \
-n keycloak -o jsonpath='{.data.CLIENT_SECRET}' | base64 -d)
# Create operator secret
oc create secret generic cost-management-auth-secret \
-n costmanagement-metrics-operator \
--from-literal=client_id=${CLIENT_ID} \
--from-literal=client_secret=${CLIENT_SECRET} \
--dry-run=client -o yaml | oc apply -f -

Update the CostManagementMetricsConfig to use JWT authentication:
KEYCLOAK_URL=$(oc get route keycloak -n keycloak -o jsonpath='{.spec.host}')
oc patch costmanagementmetricsconfig costmanagementmetricscfg-tls \
-n costmanagement-metrics-operator \
--type merge -p '{
"spec": {
"authentication": {
"type": "service-account",
"secret_name": "cost-management-auth-secret",
"token_url": "https://'${KEYCLOAK_URL}'/auth/realms/kubernetes/protocol/openid-connect/token"
}
}
}'

Monitor operator logs to ensure JWT acquisition is working:
oc logs -n costmanagement-metrics-operator \
deployment/costmanagement-metrics-operator \
--tail=50 -f | grep -E "token|auth|jwt"

Expected log entries:
INFO crc_http.GetAccessToken requesting service-account access token
INFO crc_http.GetAccessToken successfully retrieved and set access token for subsequent requests
Test the complete flow from operator to ROS backend:
# Trigger an upload (or wait for scheduled upload)
# Check operator status
oc get costmanagementmetricsconfig -n costmanagement-metrics-operator \
-o jsonpath='{.status.upload.last_upload_status}'
# Should show: "202 Accepted" (not 401 Unauthorized)
# Check ingress logs for org_id extraction
oc logs -n cost-onprem deployment/cost-onprem-ingress -c ingress --tail=50 | \
grep -E "org_id|account"
# Expected (with the example mappers above): account="7890123", org_id="12345"

The ROS backend can extract org_id from the clientId claim in JWT tokens, enabling flexible multi-tenant deployments without requiring multiple Keycloak clients.
How It Works:
- The client-id-mapper protocol mapper adds the clientId claim to JWT tokens
- Backend services parse the clientId to extract the organization identifier
- Example: cost-management-operator-org123 → extracts org_id: "org123"
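The parsing step in the list above can be sketched in shell for quick checks against a token's clientId. This is a minimal sketch, assuming the clientId always starts with the fixed prefix cost-management-operator- followed by the org_id. The function name extract_org_id and the prefix handling are illustrative and are not the backend's implementation.

```shell
#!/usr/bin/env bash
# Hypothetical helper: derive org_id from a clientId of the form
# "cost-management-operator-<org_id>". The prefix is an assumption
# taken from the naming example in this section.
extract_org_id() {
  local client_id="$1"
  local prefix="cost-management-operator-"
  case "$client_id" in
    "$prefix"?*)
      # Strip the fixed prefix; whatever remains is treated as the org_id.
      printf '%s\n' "${client_id#"$prefix"}"
      ;;
    *)
      # No embedded org_id: report failure instead of guessing.
      return 1
      ;;
  esac
}

extract_org_id "cost-management-operator-org123"   # prints: org123
```

Returning a non-zero status for a clientId without a suffix lets a caller fall back to an explicit org_id claim rather than silently using an empty value.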
Benefits:
- ✅ Single Keycloak client handles multiple organizations
- ✅ Simplified Keycloak administration
- ✅ Easy to onboard new organizations
- ✅ Reduced operational overhead
Use a consistent clientId naming pattern that embeds the org_id:
# Client for Organization "12345"
apiVersion: k8s.keycloak.org/v2alpha1
kind: KeycloakClient
metadata:
name: cost-management-org-12345
namespace: keycloak
labels:
app: keycloak
spec:
client:
clientId: cost-management-operator-12345 # org_id embedded
secret:
name: keycloak-client-secret-cost-management-org-12345
publicClient: false
serviceAccountsEnabled: true
protocol: openid-connect
protocolMappers:
- name: client-id-mapper
protocol: openid-connect
protocolMapper: oidc-usersessionmodel-note-mapper
config:
claim.name: "clientId"
# Backend extracts "12345" from "cost-management-operator-12345"
realmSelector:
matchLabels:
app: keycloak
Backend Parsing Logic (to be implemented):
// Example: Extract org_id from clientId
clientId := claims["clientId"] // "cost-management-operator-12345"
orgId := extractOrgId(clientId) // "12345"
Continue using the hardcoded org_id mapper for explicit organization identification:
protocolMappers:
- name: org-id-mapper
protocolMapper: oidc-hardcoded-claim-mapper
config:
claim.name: "org_id"
claim.value: "12345" # Explicit org_id
When to Use:
- Transitioning to the new architecture
- Need explicit org_id validation
- Legacy system compatibility
Create separate clients for each organization (not recommended for new deployments):
# Organization 1
ORG_ID="1" CLIENT_ID="cost-management-operator-1"
# Apply KeycloakClient with org_id="1"
# Organization 2
ORG_ID="2" CLIENT_ID="cost-management-operator-2"
# Apply KeycloakClient with org_id="2"
Drawbacks:
- More Keycloak clients to manage
- Separate secrets per organization
- Increased operational complexity
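Where separate clients are unavoidable, the per-organization repetition can at least be scripted. The sketch below assumes the same hardcoded org_id mapper fields shown earlier in this section and prints the mapper fragment for a given organization. render_org_mapper is a hypothetical helper, and the surrounding KeycloakClient fields are deliberately left out.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the hardcoded org_id mapper fragment for one org.
# Only the mapper is rendered; the rest of the KeycloakClient spec is omitted.
render_org_mapper() {
  local org_id="$1"
  cat <<EOF
protocolMappers:
  - name: org-id-mapper
    protocolMapper: oidc-hardcoded-claim-mapper
    config:
      claim.name: "org_id"
      claim.value: "${org_id}"
EOF
}

# Render one fragment per organization, labelled with the matching clientId.
for org in 1 2; do
  echo "# cost-management-operator-${org}"
  render_org_mapper "$org"
done
```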
The on-premise deployment uses DEVELOPMENT=false and ENHANCED_ORG_ADMIN=true to provide a simpler authentication model without requiring an external RBAC service.
Configuration:
# In Koku deployment
DEVELOPMENT: "False" # Use production middleware
ENHANCED_ORG_ADMIN: "True" # Bypass RBAC for org admins
X-Rh-Identity Header Format:
{
"org_id": "org1234567",
"identity": {
"org_id": "org1234567",
"account_number": "7890123",
"type": "User",
"user": {
"username": "test",
"is_org_admin": true
}
},
"entitlements": {
"cost_management": {"is_entitled": true}
}
}
How it works:
- is_org_admin: true in the identity marks the user as an organization admin
- With ENHANCED_ORG_ADMIN=true, Koku's _get_access() method returns {} for admin users
- This bypasses the RBAC service call entirely
- User gets full access to all resources within their org
Benefits over DEVELOPMENT mode:
- ✅ No Mock user bugs (production middleware)
- ✅ Proper beta attribute handling
- ✅ Real User objects instead of Mock objects
- ✅ All middleware checks work correctly
Dual org_id placement:
The org_id appears at both top-level and inside identity for middleware compatibility.
This ensures proper tenant lookup regardless of which code path reads the org_id.
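Because the gateway passes this JSON to backend services as a base64-encoded X-Rh-Identity header, building and decoding one by hand can help when debugging. This is a minimal sketch, assuming plain base64 encoding of the JSON structure shown above. Both helper names are hypothetical, and the field values are the sample ones from this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit a base64 X-Rh-Identity value for an org/account.
# Values are inserted verbatim (no JSON escaping), so keep them simple.
build_identity_header() {
  local org_id="$1" account="$2"
  printf '{"org_id":"%s","identity":{"org_id":"%s","account_number":"%s","type":"User","user":{"username":"test","is_org_admin":true}},"entitlements":{"cost_management":{"is_entitled":true}}}' \
    "$org_id" "$org_id" "$account" | base64 | tr -d '\n'
}

# Hypothetical helper: decode a header value back to JSON for inspection.
decode_identity_header() {
  printf '%s' "$1" | base64 -d
}

header=$(build_identity_header "org1234567" "7890123")
decode_identity_header "$header"
```

The org_id is written at both the top level and inside identity, mirroring the dual placement described above.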
Symptoms:
- Operator logs: upload failed | error: status: 401
- Ingress logs: "error":"Invalid or missing identity"
Root Cause:
- JWT doesn't contain org_id
- Gateway not deployed or misconfigured
Fix:
- Verify org_id in JWT (see Part 3, Step 1)
- Check gateway is running:
  oc get pod -n cost-onprem -l app.kubernetes.io/component=gateway
  # Should show gateway pods running
- Verify Helm chart version includes JWT support (v0.1.5+)
Symptoms:
- JWT has org_id claim
- Still get 401 Unauthorized
Root Cause:
- Gateway JWT filter not recognizing the token
- Wrong issuer or audience
Fix:
- Check gateway Envoy configuration:
  oc get configmap cost-onprem-gateway-envoy-config -n cost-onprem -o yaml
- Verify issuer matches Keycloak:
  issuer: "https://keycloak-keycloak.apps.example.com/auth/realms/kubernetes"
- Verify audiences includes your client_id:
  audiences:
    - "cost-management-operator"
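To compare a token against these gateway settings, the iss and aud claims can be read straight from the JWT payload. The sketch below assumes a standard three-part JWT and uses only cut, tr and base64. decode_jwt_payload is a hypothetical helper name, and the sample token is built locally for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the decoded JSON payload of a JWT.
decode_jwt_payload() {
  local payload
  # The payload is the second dot-separated segment, base64url-encoded.
  payload=$(printf '%s' "$1" | cut -d'.' -f2 | tr '_-' '/+')
  # base64url omits padding; restore it so base64 -d accepts the input.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 -d
}

# Build an unsigned sample token locally to exercise the helper.
sample_payload=$(printf '%s' '{"iss":"https://keycloak-keycloak.apps.example.com/auth/realms/kubernetes","aud":"cost-management-operator"}' \
  | base64 | tr -d '=\n' | tr '/+' '_-')
decode_jwt_payload "header.${sample_payload}.signature"
```

With jq available, piping the output of decode_jwt_payload "$TOKEN" into jq '{iss, aud}' prints just the two claims to compare against the ConfigMap.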
Symptoms:
- Operator logs: failed to get access token
- Operator logs: connection refused or timeout
Root Cause:
- Network connectivity issue
- Wrong token URL
- Missing CA certificates
Fix:
- Test connectivity from operator pod:
  oc exec -n costmanagement-metrics-operator \
    deployment/costmanagement-metrics-operator -- \
    curl -k -I https://keycloak-keycloak.apps.example.com
- Verify CA certificates are mounted:
  oc get deployment costmanagement-metrics-operator \
    -n costmanagement-metrics-operator \
    -o jsonpath='{.spec.template.spec.volumes[?(@.name=="ca-bundle")]}'
- Check token URL is correct:
  oc get costmanagementmetricsconfig \
    -o jsonpath='{.items[0].spec.authentication.token_url}'
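The root causes listed above map onto distinct curl exit codes, which makes the connectivity test easier to interpret. This is a small sketch using curl's documented exit codes 6, 7, 28 and 60. The helper name and the message wording are illustrative.

```shell
#!/usr/bin/env bash
# Hypothetical helper: translate a curl exit code into a likely root cause.
explain_curl_exit() {
  case "$1" in
    0)  echo "reachable" ;;
    6)  echo "DNS resolution failed: check the token URL host" ;;
    7)  echo "connection refused: check network path and route" ;;
    28) echo "timeout: check network connectivity" ;;
    60) echo "TLS verification failed: check CA certificates" ;;
    *)  echo "unrecognized curl exit code $1" ;;
  esac
}

explain_curl_exit 60
```

The connectivity test above passes -k, which skips certificate verification, so exit code 60 only appears when -k is dropped from the curl command.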
Symptoms:
- Data appears in wrong organization
- ROS backend accepts upload but stores in wrong partition
Fix:
- Verify org_id in JWT matches expected value
- Update mapper in Keycloak:
- Admin Console → Clients → Mappers → org-id-mapper
- Change "Claim value" to correct org_id
- Delete operator pod to force new token acquisition
Default token lifespan is 5 minutes (300 seconds). The operator caches tokens and refreshes automatically.
To adjust:
oc patch keycloakrealm kubernetes-realm -n keycloak --type=merge -p '{
"spec": {
"realm": {
"accessTokenLifespan": 300
}
}
}'
Rotate client secrets periodically:
# Keycloak will regenerate the secret
oc delete secret keycloak-client-secret-cost-management-service-account -n keycloak
# Wait for regeneration (handled by operator)
sleep 30
# Update operator secret
NEW_SECRET=$(oc get secret keycloak-client-secret-cost-management-service-account \
-n keycloak -o jsonpath='{.data.clientSecret}' | base64 -d)
oc patch secret cost-management-auth-secret \
-n costmanagement-metrics-operator \
--type=json -p='[{
"op": "replace",
"path": "/data/client_secret",
"value": "'$(printf '%s' "$NEW_SECRET" | base64 | tr -d '\n')'"
}]'
# Restart operator
oc delete pod -n costmanagement-metrics-operator \
-l app=costmanagement-metrics-operator
Always use HTTPS for token endpoints:
# Good
token_url: "https://keycloak-keycloak.apps.example.com/auth/realms/kubernetes/protocol/openid-connect/token"
# Bad (insecure)
token_url: "http://keycloak-keycloak.apps.example.com/auth/realms/kubernetes/protocol/openid-connect/token"
Ensure CA certificates are properly configured for self-signed certs.
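The HTTPS requirement can be enforced in a deployment script before the operator configuration is patched. This is a minimal sketch. require_https is a hypothetical helper and is not part of the operator or its scripts.

```shell
#!/usr/bin/env bash
# Hypothetical guard: succeed only when the given token URL uses HTTPS.
require_https() {
  case "$1" in
    https://*) return 0 ;;
    *)
      echo "refusing non-HTTPS token URL: $1" >&2
      return 1
      ;;
  esac
}

require_https "https://keycloak-keycloak.apps.example.com/auth/realms/kubernetes/protocol/openid-connect/token" \
  && echo "token_url ok"
```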
Here's a complete end-to-end setup script:
#!/bin/bash
set -e
# Configuration
ORG_ID="1"
CLIENT_ID="cost-management-operator"
KEYCLOAK_NAMESPACE="keycloak"
OPERATOR_NAMESPACE="costmanagement-metrics-operator"
echo "=== Step 1: Deploy Red Hat Build of Keycloak ==="
./scripts/deploy-rhbk.sh
echo "=== Step 2: Add org_id mapper ==="
oc patch keycloakclient cost-management-service-account \
-n $KEYCLOAK_NAMESPACE --type=json -p='[
{
"op": "add",
"path": "/spec/client/protocolMappers/-",
"value": {
"name": "org-id-mapper",
"protocol": "openid-connect",
"protocolMapper": "oidc-hardcoded-claim-mapper",
"config": {
"claim.name": "org_id",
"claim.value": "'$ORG_ID'",
"jsonType.label": "String",
"access.token.claim": "true"
}
}
}
]'
echo "Waiting for Keycloak to reconcile..."
sleep 15
echo "=== Step 3: Verify JWT contains org_id ==="
KEYCLOAK_URL=$(oc get keycloak keycloak -n $KEYCLOAK_NAMESPACE -o jsonpath='{.status.hostname}')
CLIENT_SECRET=$(oc get secret keycloak-client-secret-cost-management-service-account \
-n $KEYCLOAK_NAMESPACE -o jsonpath='{.data.clientSecret}' | base64 -d)
TOKEN=$(curl -k -s -X POST \
"https://${KEYCLOAK_URL}/auth/realms/kubernetes/protocol/openid-connect/token" \
-d "grant_type=client_credentials" \
-d "client_id=${CLIENT_ID}" \
-d "client_secret=${CLIENT_SECRET}" \
| jq -r '.access_token')
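# JWT segments are base64url-encoded without padding; convert and pad before decoding
PAYLOAD=$(echo "$TOKEN" | cut -d'.' -f2 | tr '_-' '/+')
while [ $(( ${#PAYLOAD} % 4 )) -ne 0 ]; do PAYLOAD="${PAYLOAD}="; done
ORG_ID_IN_TOKEN=$(echo "$PAYLOAD" | base64 -d 2>/dev/null | jq -r '.org_id')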
if [ "$ORG_ID_IN_TOKEN" = "$ORG_ID" ]; then
echo "✅ JWT contains correct org_id: $ORG_ID"
else
echo "❌ JWT org_id mismatch. Expected: $ORG_ID, Got: $ORG_ID_IN_TOKEN"
exit 1
fi
echo "=== Step 4: Create operator secret ==="
oc create secret generic cost-management-auth-secret \
-n $OPERATOR_NAMESPACE \
--from-literal=client_id=${CLIENT_ID} \
--from-literal=client_secret=${CLIENT_SECRET} \
--dry-run=client -o yaml | oc apply -f -
echo "=== Step 5: Configure operator ==="
oc patch costmanagementmetricsconfig costmanagementmetricscfg-tls \
-n $OPERATOR_NAMESPACE \
--type merge -p '{
"spec": {
"authentication": {
"type": "service-account",
"secret_name": "cost-management-auth-secret",
"token_url": "https://'${KEYCLOAK_URL}'/auth/realms/kubernetes/protocol/openid-connect/token"
}
}
}'
echo "=== Setup Complete! ==="
echo ""
echo "Next steps:"
echo " 1. Wait for next operator upload cycle"
echo " 2. Verify upload status: oc get costmanagementmetricsconfig -o jsonpath='{.items[0].status.upload}'"
echo " 3. Check ingress logs: oc logs -n cost-onprem deployment/cost-onprem-ingress -c ingress"
Critical Steps:
- ✅ Deploy Red Hat Build of Keycloak using deploy-rhbk.sh
- ✅ Add org_id mapper to client (REQUIRED for backend services)
- ✅ Add account_number mapper to client (RECOMMENDED for account-level isolation)
- ✅ Verify JWT contains both org_id and account_number claims
- ✅ Configure operator with client credentials
- ✅ Verify end-to-end flow (operator → centralized gateway → backend services)
Key Takeaway: The org_id claim is mandatory for ROS backend compatibility. The account_number claim is recommended for proper multi-tenant account isolation. The basic Keycloak deployment does not include these claims by default, so they must be added as a post-deployment step.
For questions or issues, refer to:
- scripts/deploy-rhbk.sh - Automated deployment
- scripts/run-pytest.sh --auth - JWT authentication testing
- docs/operations/troubleshooting.md - Common issues