This repository contains a Helm chart for deploying cost management solutions on-premise:
cost-onprem/ - Unified chart containing all components: ROS, Kruize, Koku (Cost Management with Sources API), PostgreSQL, and Valkey
Complete Helm chart for deploying the full Cost Management stack with OCP cost analytics capabilities.
Quick Start:

```shell
# Automated deployment (recommended)
./scripts/install-helm-chart.sh
```

Documentation:
- Cost Management Installation Guide - Complete deployment guide
- Prerequisites: OpenShift 4.18+, S3-compatible object storage (ODF, AWS S3, or other), Kafka/AMQ Streams
- Architecture: Single unified chart with all components
- E2E Testing: Automated validation with `./scripts/run-pytest.sh` (pytest-based test suite)
Key Features:
- Complete OCP cost data pipeline (Kafka → CSV → PostgreSQL)
- PostgreSQL-based data processing and analytics
- Optimized Kubernetes resources with production-ready defaults
- Comprehensive E2E validation framework
OpenShift Helm chart for deploying the Resource Optimization Service (ROS) with Kruize integration and future cost management capabilities.

```shell
# Automated installation from Helm repository (recommended)
./scripts/install-helm-chart.sh
# Or install a specific chart version
CHART_VERSION=0.2.9 ./scripts/install-helm-chart.sh
# Or use local chart for development
USE_LOCAL_CHART=true LOCAL_CHART_PATH=../cost-onprem ./scripts/install-helm-chart.sh
# Or use Helm directly
helm repo add cost-onprem https://insights-onprem.github.io/cost-onprem-chart
helm repo update
helm install cost-onprem cost-onprem/cost-onprem --namespace cost-onprem --create-namespace
```

Note: See Authentication Setup section for required prerequisites (Keycloak)

See Installation Guide for detailed installation options

Complete Documentation Index: Comprehensive guides organized by use case, with detailed descriptions and navigation.

| Getting Started | Production Setup | Operations |
|---|---|---|
| Quick Start: Fast deployment walkthrough | Installation Guide: Detailed installation instructions | Troubleshooting: Common issues & solutions |
| Platform Guide: OpenShift deployment details | JWT Authentication: Ingress authentication (Keycloak) | Force Upload: Testing & validation |
| Scripts Reference: Automation scripts | Keycloak Setup: SSO configuration | |

Need more? Configuration, security, templates, and specialized guides are available in the Complete Documentation Index.

```
cost-onprem-chart/
├── .github/workflows/        # CI/CD automation
├── cost-onprem/              # Helm chart directory
│   ├── Chart.yaml            # Chart metadata
│   ├── values.yaml           # Default configuration
│   └── templates/            # Kubernetes resource templates
│       ├── _helpers*.tpl     # Template helper functions
│       ├── cost-management/  # Cost Management (Koku, Sources API)
│       ├── gateway/          # API gateway (Envoy)
│       ├── infrastructure/   # Database, Kafka, storage, cache
│       ├── ingress/          # File upload API
│       ├── kruize/           # Kruize optimization engine
│       ├── monitoring/       # Prometheus ServiceMonitor
│       ├── ros/              # Resource Optimization Service
│       ├── shared/           # Shared resources
│       └── ui/               # Cost Management UI
├── docs/                     # Documentation
├── scripts/                  # Deployment and automation scripts
└── tests/                    # Pytest E2E test suite
```

- PostgreSQL: Unified database server hosting ROS, Kruize, Koku, and Sources databases
- S3-compatible object storage: ODF, AWS S3, or other S3-compatible provider
- AMQ Streams Operator: Deploys and manages Kafka clusters (Streams for Apache Kafka 3.1)
- Kafka 4.1.0: Message streaming with persistent JBOD storage, KRaft mode (no ZooKeeper)
- API Gateway: Centralized Envoy gateway for JWT authentication and API routing (port 9080)
- Ingress: File upload API processing
- ROS API: Main REST API for recommendations and status
- ROS Processor: Data processing service for cost optimization
- ROS Recommendation Poller: Kruize integration for recommendations
- ROS Housekeeper: Maintenance tasks and data cleanup
- Kruize Autotune: Optimization recommendation engine (internal service, protected by network policies)
- Sources API: Source management and integration
- Valkey: Caching layer for performance
Security Architecture:
- Centralized Gateway: Single API gateway with JWT validation (Keycloak) for all external API traffic
- Backend Services: Receive pre-authenticated requests from the gateway with the `X-Rh-Identity` header
- Network Policies: Restrict direct access to backend services while allowing Prometheus metrics scraping
- Multi-tenancy: `org_id` and `account_number` from authentication enable data isolation across organizations and accounts

See JWT Authentication Guide for detailed architecture
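
As a rough illustration of the identity hand-off described above, the sketch below encodes and decodes a header value. It assumes the `X-Rh-Identity` value is base64-encoded JSON with an `identity` object holding `org_id` and `account_number`. That layout, the helper names (`encode_identity`, `decode_identity`), and the sample values are assumptions for illustration and are not taken from this chart.

```shell
#!/usr/bin/env bash
# Hypothetical helpers: the JSON layout is an assumed shape, not the chart's confirmed format.

# Build a sample header value from an org_id and an account_number.
encode_identity() {
  local org_id="$1" account_number="$2"
  printf '{"identity":{"org_id":"%s","account_number":"%s"}}' "$org_id" "$account_number" \
    | base64 | tr -d '\n'
}

# Decode a header value back into its JSON form.
decode_identity() {
  printf '%s' "$1" | base64 --decode
}

# Round-trip a made-up identity to show what a backend would see.
header="$(encode_identity "org1234567" "7890123")"
decode_identity "$header"
echo
```

Decoding a header captured from a backend pod's request log in the same way is a quick check that the tenant fields match the authenticated user.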
Complete Cost Management deployment requires significant cluster resources:
| Resource | Minimum | Recommended |
|---|---|---|
| CPU | 10 cores | 12-14 cores |
| Memory | 24 Gi | 32-40 Gi |
| Worker Nodes | 3 × 8 Gi | 3 × 16 Gi |
| Storage | 300 Gi | 400+ Gi |
| Pods | ~55 | - |

See Resource Requirements Guide for detailed breakdown by component.
Any S3-compatible object storage is supported:
- ODF with Direct Ceph RGW (recommended for production - strong read-after-write consistency)
- AWS S3 (cloud-hosted)
- Other S3-compatible providers
Note: For ROS deployments, providers with strong read-after-write consistency are recommended. NooBaa has eventual consistency issues that can cause ROS processing failures.
See Configuration Guide for detailed requirements
Services accessible via OpenShift Routes:

```shell
oc get routes -n cost-onprem
```

Available endpoints:
- Health Check: `/ready`
- ROS API: `/api/ros/*`
- Cost Management API: `/api/cost-management/*`
- Sources API: `/api/cost-management/v1/sources/` (via Koku API)
- Upload API: `/api/ingress/*`

See Platform Guide for detailed access information
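
To show how the endpoint paths above combine with a route host, here is a minimal sketch. The helper name `endpoint_url`, the `https` scheme, and the `<route-name>` placeholder are assumptions; the actual route names come from the `oc get routes` output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: joins a route host and an endpoint path into a URL.
# Assumes the route is served over HTTPS.
endpoint_url() {
  local host="$1" path="$2"
  # Strip a trailing slash from the host and a leading slash from the path
  # so the two parts are joined by exactly one slash.
  printf 'https://%s/%s\n' "${host%/}" "${path#/}"
}

# Example usage against a live cluster (not executed here):
#   host="$(oc get route <route-name> -n cost-onprem -o jsonpath='{.spec.host}')"
#   curl -sk "$(endpoint_url "$host" /ready)"
endpoint_url "gateway.apps.example.com" "/ready"
```

Because JWT authentication is enabled, the API paths will likely reject requests without an `Authorization: Bearer <token>` header; whether `/ready` also requires a token is not stated here.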
JWT authentication is automatically enabled and requires Keycloak configuration:

```shell
# Step 1: Deploy Red Hat Build of Keycloak (RHBK)
./scripts/deploy-rhbk.sh
# Step 2: Configure Cost Management Operator with JWT credentials
./scripts/setup-cost-mgmt-tls.sh
# Step 3: Deploy Cost Management On-Premise
./scripts/install-helm-chart.sh
```

See Keycloak Setup Guide for detailed configuration instructions

Key requirements:
- Keycloak realm with `org_id` and `account_number` claims
- Service account client credentials
- Self-signed CA certificate bundle (auto-configured)
- Cost Management Operator configured with JWT token URL

Operator Support:
- Red Hat Build of Keycloak (RHBK) v22+ - `k8s.keycloak.org/v2alpha1`

Architecture: JWT Authentication Overview
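
One way to confirm that a token carries the `org_id` and `account_number` claims listed above is to decode its payload locally. The sketch below is only an illustration: `jwt_payload` is a hypothetical helper, and the variables in the commented token request (`KEYCLOAK_URL`, `REALM`, `CLIENT_ID`, `CLIENT_SECRET`) are placeholders you would need to supply. The token path shown is the standard Keycloak OpenID Connect endpoint, which may differ if your deployment uses a custom relative path.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the JSON payload (second segment) of a JWT.
# It does NOT verify the signature; use it for inspection only.
jwt_payload() {
  local payload
  # Take the middle segment and map base64url characters back to base64.
  payload="$(printf '%s' "$1" | cut -d. -f2 | tr '_-' '/+')"
  # Restore the '=' padding that base64url encoding strips.
  while [ $(( ${#payload} % 4 )) -ne 0 ]; do
    payload="${payload}="
  done
  printf '%s' "$payload" | base64 --decode
}

# Example usage against a live Keycloak (not executed here; requires curl and jq):
#   token="$(curl -sk -X POST \
#     "${KEYCLOAK_URL}/realms/${REALM}/protocol/openid-connect/token" \
#     -d grant_type=client_credentials \
#     -d client_id="${CLIENT_ID}" \
#     -d client_secret="${CLIENT_SECRET}" | jq -r .access_token)"
#   jwt_payload "$token" | jq '{org_id, account_number}'
```

If either claim is missing from the output, the realm's client mappers are the first place to look.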

```shell
# Install/upgrade from Helm repository
./scripts/install-helm-chart.sh
# Check deployment status
./scripts/install-helm-chart.sh status
# Run health checks
./scripts/install-helm-chart.sh health
```

```shell
# Cleanup preserving data volumes
./scripts/install-helm-chart.sh cleanup
# Complete removal including data
./scripts/install-helm-chart.sh cleanup --complete
```

```shell
# Run all tests (excludes extended by default)
./scripts/run-pytest.sh
# Run specific test suites
./scripts/run-pytest.sh --helm            # Helm chart validation
./scripts/run-pytest.sh --auth            # JWT authentication tests
./scripts/run-pytest.sh --infrastructure  # DB, S3, Kafka health
./scripts/run-pytest.sh --e2e             # End-to-end data flow
# Run E2E with extended tests (summary tables, Kruize)
./scripts/run-pytest.sh --extended
# Run ALL tests including extended
./scripts/run-pytest.sh --all
# Run by test type
./scripts/run-pytest.sh -m component      # Single-component tests
./scripts/run-pytest.sh -m integration    # Multi-component tests
```

See Test Suite Documentation for detailed usage

- Lint & Validate: Chart validation on every PR
- Automated Releases: Chart-releaser publishes to Helm repository on version bump
- Version Tracking: `--save-versions` flag generates `version_info.json` for traceability
- Disconnected Support: `oc-mirror` compatible (see Disconnected Deployment Guide)

Quick diagnostics:

```shell
# Check pods
kubectl get pods -n cost-onprem
# View logs
kubectl logs -n cost-onprem -l app.kubernetes.io/component=api
# Check storage
kubectl get pvc -n cost-onprem
```

See Troubleshooting Guide for comprehensive solutions

This project is licensed under the terms specified in the LICENSE file.
New to this project? See the OCP Dev Setup with S4 guide to set up a development environment on OpenShift using S4 (Ceph RGW) instead of ODF. This is the recommended approach for developers who don't have access to a multi-node OCP cluster with ODF.
| Setup | Nodes | Storage Backend | Use Case |
|---|---|---|---|
| Dev/Test (S4) | 1 (SNO) | S4 / Ceph RGW (standalone) | Local development, testing, demos |
| Production (ODF) | 3+ | S3-compatible object storage (ODF, AWS S3, or other) | Production deployments |

See Quick Start Guide for development environment setup.
For issues and questions:
- Issues: GitHub Issues
- Documentation: Complete Documentation Index
- Scripts: Automation Scripts Reference