Complete installation methods, prerequisites, and upgrade procedures for the Cost Management On-Premise Helm chart.
- Prerequisites
- Installation Methods
- OpenShift Prerequisites
- Upgrade Procedures
- Verification
- Resource Requirements by Component
- E2E Validation (OCP Dataflow)
- Troubleshooting Installation
The installation scripts require the following tools:
# Required
helm # For installing Helm charts (v3+)
kubectl # For Kubernetes cluster access
jq # For JSON processing
# Required for E2E Testing
python3 # Python 3 interpreter (for NISE data generation)
python3-venv # Virtual environment module (for NISE isolation)
# Ubuntu/Debian
sudo apt-get update
sudo apt-get install jq python3 python3-venv
# RHEL/CentOS/Fedora
sudo dnf install jq python3 python3-venv
# macOS
brew install jq
# Install Helm (all platforms)
curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
Ensure you have:
- Valid kubeconfig with cluster admin or appropriate namespace permissions
- Ability to create namespaces (or existing target namespace)
- Sufficient cluster resources (see Configuration Guide)
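The tool requirements listed earlier can be checked before running any installer. The following is a minimal sketch, not part of the repository's scripts; the function names `missing_tools` and `check_prerequisites` are hypothetical.

```shell
# Print the name of every tool in the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Fail early with a readable message if a required tool is missing.
check_prerequisites() {
  missing=$(missing_tools helm kubectl jq)
  if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing required tools:" $missing >&2
    return 1
  fi
  echo "All required tools found"
}
```

Calling `check_prerequisites` at the top of a CI job stops the run before any cluster change is attempted. Add `python3` to the list when E2E tests are part of the job.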
The easiest way to install is with the automation script. It suits most users, CI/CD pipelines, and quick deployments.
# Install latest release with default settings
./scripts/install-helm-chart.sh
# Custom namespace
export NAMESPACE=cost-onprem
./scripts/install-helm-chart.sh
# Custom release name
export HELM_RELEASE_NAME=cost-onprem
./scripts/install-helm-chart.sh
# Use local chart for development
export USE_LOCAL_CHART=true
./scripts/install-helm-chart.sh
What the script does (Two-Phase Deployment):
The script deploys a unified chart containing all components:
Infrastructure:
- PostgreSQL (unified database for Koku, Sources, ROS, Kruize)
- Valkey (caching and Celery broker)
Applications:
- Koku API (unified, masu, listener)
- Celery Workers (background processing)
- ROS components (API, processor, housekeeper)
- Sources API
- UI and Ingress
Features:
- ✅ Two-phase deployment (infrastructure first, then application)
- ✅ Automatic secret creation (Django, Sources, S3 credentials)
- ✅ Installs from the Helm chart repository (GitHub Pages)
- ✅ Auto-discovers S3 credentials (OBC, NooBaa, S4)
- ✅ OpenShift platform verification
- ✅ Automatic upgrade detection
- ✅ Perfect for CI/CD pipelines
- ✅ Version pinning support via CHART_VERSION
Environment Variables:
- HELM_RELEASE_NAME: Helm release name (default: cost-onprem)
- NAMESPACE: Target namespace (default: cost-onprem)
- VALUES_FILE: Path to custom values file
- CHART_VERSION: Pin a specific chart version (default: latest)
- USE_LOCAL_CHART: Use local chart instead of Helm repository (default: false)
- LOCAL_CHART_PATH: Path to local chart directory (default: ../cost-onprem)
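To see which settings a run would use, the documented defaults can be applied with shell parameter expansion. This is an illustrative sketch only: `effective_settings` is a hypothetical helper, and the install script's actual internals are not shown in this guide.

```shell
# Print the settings a run would use, applying the documented default
# to every variable that is unset or empty.
effective_settings() {
  echo "HELM_RELEASE_NAME=${HELM_RELEASE_NAME:-cost-onprem}"
  echo "NAMESPACE=${NAMESPACE:-cost-onprem}"
  echo "CHART_VERSION=${CHART_VERSION:-latest}"
  echo "USE_LOCAL_CHART=${USE_LOCAL_CHART:-false}"
  echo "LOCAL_CHART_PATH=${LOCAL_CHART_PATH:-../cost-onprem}"
  # VALUES_FILE has no default; an empty value means no custom values file.
  echo "VALUES_FILE=${VALUES_FILE:-}"
}
```

For example, `CHART_VERSION=0.2.9 effective_settings` prints the pinned version alongside the defaults for everything else.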
Note: JWT authentication is automatically enabled on OpenShift.
BYOI (Bring Your Own Infrastructure): When database.deploy: false is set in your values file, the script skips PostgreSQL credential creation and expects you to have pre-created the database credentials secret. See External Infrastructure (BYOI) for details.
For administrators who prefer full control over the deployment or cannot use the install-helm-chart.sh script (e.g., GitOps/ArgoCD workflows, air-gapped environments, custom CI pipelines), you can install the chart directly with helm install. You must supply the cluster-specific values that the install script would normally auto-detect.
| Source | Use Case | Installation |
|---|---|---|
| Helm Repository | Production (recommended) | helm repo add cost-onprem https://insights-onprem.github.io/cost-onprem-chart |
| OCI Registry | Air-gapped, GitOps, oc-mirror | helm pull oci://ghcr.io/insights-onprem/cost-onprem-chart/cost-onprem |
| Local Source | Development, testing, modifications | Clone repo and use ./cost-onprem directory |
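The three chart sources in the table differ only in the chart reference passed to Helm. The sketch below prints the matching install command without running it; `chart_ref` and `install_command` are hypothetical helper names, and the references are the ones shown in the table.

```shell
# Map a chart source (repo, oci, local) to the reference passed to helm.
chart_ref() {
  case "$1" in
    repo)  echo "cost-onprem/cost-onprem" ;;
    oci)   echo "oci://ghcr.io/insights-onprem/cost-onprem-chart/cost-onprem" ;;
    local) echo "./cost-onprem" ;;
    *)     echo "unknown chart source: $1" >&2; return 1 ;;
  esac
}

# Print (do not run) the install command for a source and an optional version.
install_command() {
  ref=$(chart_ref "$1") || return 1
  cmd="helm install cost-onprem $ref --namespace cost-onprem --create-namespace"
  # Version pinning is only added for repo and OCI sources; a local
  # chart directory is installed as it is on disk.
  if [ -n "${2:-}" ] && [ "$1" != "local" ]; then
    cmd="$cmd --version $2"
  fi
  echo "$cmd"
}
```

Printing the command first makes it easy to review in a pipeline log before executing it. The repo source still needs `helm repo add` beforehand.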
Helm Repository (recommended):
# Add Helm repository
helm repo add cost-onprem https://insights-onprem.github.io/cost-onprem-chart
helm repo update
# Install latest version
helm install cost-onprem cost-onprem/cost-onprem \
--namespace cost-onprem \
--create-namespace
# Install a specific version
helm install cost-onprem cost-onprem/cost-onprem \
--namespace cost-onprem \
--create-namespace \
--version 0.2.9
Verify available versions:
helm search repo cost-onprem --versions
OCI Registry (air-gapped/GitOps):
The chart is also published as an OCI artifact to GitHub Container Registry. This is useful for:
- Air-gapped environments using oc-mirror
- GitOps workflows (ArgoCD, Flux) that prefer OCI references
- Environments where traditional Helm repositories are blocked
# Install latest version from OCI registry
helm install cost-onprem oci://ghcr.io/insights-onprem/cost-onprem-chart/cost-onprem \
--namespace cost-onprem \
--create-namespace
# Install a specific version
helm install cost-onprem oci://ghcr.io/insights-onprem/cost-onprem-chart/cost-onprem \
--namespace cost-onprem \
--create-namespace \
--version 0.2.9
# Pull chart locally (for inspection or mirroring)
helm pull oci://ghcr.io/insights-onprem/cost-onprem-chart/cost-onprem --version 0.2.9
# Show chart metadata, default values, and README (latest version)
helm show all oci://ghcr.io/insights-onprem/cost-onprem-chart/cost-onprem
Note: OCI-based installation does not require helm repo add. The chart is fetched directly from the container registry.
Local Source (for development):
# Clone the repository
git clone https://github.com/insights-onprem/cost-onprem-chart.git
cd cost-onprem-chart
# Use ./cost-onprem in the helm install commands below
The chart ships with safe defaults for offline templating (used by oc-mirror), but real deployments require actual cluster values. Gather these from your cluster:
# Cluster domain (for Route hostnames)
CLUSTER_DOMAIN=$(oc get ingress.config.openshift.io cluster -o jsonpath='{.spec.domain}')
# Default storage class
STORAGE_CLASS=$(kubectl get sc -o jsonpath='{.items[?(@.metadata.annotations.storageclass\.kubernetes\.io/is-default-class=="true")].metadata.name}' | awk '{print $1}')
# Valkey fsGroup (from namespace supplemental-groups)
# First, create the namespace if it doesn't exist
oc create namespace cost-onprem --dry-run=client -o yaml | oc apply -f -
SUPP_GROUPS=$(oc get ns cost-onprem -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.supplemental-groups}')
FS_GROUP=$(echo "$SUPP_GROUPS" | cut -d'/' -f1)
# Keycloak URL (if using RHBK)
KEYCLOAK_NAMESPACE=$(oc get keycloaks.k8s.keycloak.org -A -o jsonpath='{.items[0].metadata.namespace}' 2>/dev/null)
KEYCLOAK_HOST=$(oc get keycloaks.k8s.keycloak.org -A -o jsonpath='{.items[0].status.hostname}' 2>/dev/null)
KEYCLOAK_URL="https://${KEYCLOAK_HOST}"
Create a credentials secret and note your S3 endpoint:
kubectl create secret generic my-s3-credentials \
--namespace=cost-onprem \
--from-literal=access-key="<YOUR_ACCESS_KEY>" \
--from-literal=secret-key="<YOUR_SECRET_KEY>"
helm install cost-onprem ./cost-onprem \
--namespace cost-onprem \
--create-namespace \
-f openshift-values.yaml \
--set global.clusterDomain="$CLUSTER_DOMAIN" \
--set global.storageClass="$STORAGE_CLASS" \
--set valkey.securityContext.fsGroup="$FS_GROUP" \
--set objectStorage.endpoint="<YOUR_S3_ENDPOINT>" \
--set objectStorage.port=443 \
--set objectStorage.useSSL=true \
--set objectStorage.secretName="my-s3-credentials" \
--set jwtAuth.keycloak.installed=true \
--set jwtAuth.keycloak.namespace="$KEYCLOAK_NAMESPACE" \
--set jwtAuth.keycloak.url="$KEYCLOAK_URL" \
--wait
The table below lists every cluster-specific value, its chart default, and how to determine the correct value for your environment.
| Value | Chart Default | Description | How to Determine |
|---|---|---|---|
| global.clusterDomain | apps.cluster.local | OpenShift wildcard domain for Routes | oc get ingress.config.openshift.io cluster -o jsonpath='{.spec.domain}' |
| global.storageClass | ocs-storagecluster-ceph-rbd | Default StorageClass for PVCs | kubectl get sc (look for the (default) annotation) |
| global.volumeMode | Filesystem | PVC volume mode | Usually Filesystem; change only for raw block storage |
| objectStorage.endpoint | s3.openshift-storage.svc.cluster.local | S3-compatible endpoint hostname | Your S3 provider's endpoint (e.g., s3.amazonaws.com, S4 hostname) |
| objectStorage.port | 443 | S3 endpoint port | 443 for HTTPS, 80 for HTTP |
| objectStorage.useSSL | true | Use TLS for S3 connections | true for production, false for S4/dev |
| objectStorage.secretName | "" | Pre-created credentials secret name | Name of the Secret you created in Step 2 |
| valkey.securityContext.fsGroup | (unset) | GID for Valkey PVC access on OpenShift | oc get ns <NS> -o jsonpath='{.metadata.annotations.openshift\.io/sa\.scc\.supplemental-groups}' (first number) |
| jwtAuth.keycloak.installed | true | Whether Keycloak is deployed | true if RHBK is installed, false otherwise |
| jwtAuth.keycloak.url | "" | Keycloak external URL | oc get route keycloak -n keycloak -o jsonpath='https://{.spec.host}' |
| jwtAuth.keycloak.namespace | "" | Namespace where Keycloak runs | Usually keycloak |
| database.deploy | true | Deploy bundled PostgreSQL StatefulSet | Set false to use an external database (see BYOI) |
| valkey.deploy | true | Deploy bundled Valkey Deployment | Set false to use an external Redis/Valkey (see BYOI) |
Important: The chart defaults are designed for oc-mirror image discovery (offline templating). They produce syntactically valid manifests but point to placeholder hostnames. For a working deployment, you must override the values marked above with real cluster values.
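Because empty or malformed overrides silently fall back to the placeholder defaults, it can help to validate the gathered values before calling helm install. The sketch below takes the first number of a supplemental-groups annotation (the same `cut` approach used when gathering cluster values) and checks that named variables are non-empty. `first_gid` and `require_set` are hypothetical helpers, and the `start/size` annotation shape is an assumption based on the fsGroup example in this guide.

```shell
# Extract the first GID from a supplemental-groups annotation such as
# "1000740000/10000"; reject anything that is not purely numeric.
first_gid() {
  gid=$(printf '%s\n' "$1" | cut -d'/' -f1)
  case "$gid" in
    ''|*[!0-9]*) echo "unexpected supplemental-groups value: '$1'" >&2; return 1 ;;
    *) echo "$gid" ;;
  esac
}

# Return non-zero if any named variable is empty, listing each offender.
require_set() {
  status=0
  for name in "$@"; do
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing value: $name" >&2
      status=1
    fi
  done
  return $status
}
```

A typical use is `require_set CLUSTER_DOMAIN STORAGE_CLASS FS_GROUP || exit 1` immediately before the install command. This catches, for instance, a cluster with no default StorageClass, where the lookup returns an empty string.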
The install script normally creates several secrets automatically. When installing directly, you must create them yourself:
# 1. Django secret key (required by Koku)
kubectl create secret generic cost-onprem-django \
--namespace=cost-onprem \
--from-literal=django-secret-key="$(openssl rand -base64 50 | tr -dc 'a-zA-Z0-9' | head -c 50)"
# 2. S3 credentials (if not already created in Step 2)
# See Step 2 above
# 3. Keycloak CA certificate (for TLS trust between oauth2-proxy and Keycloak)
# Extract the Keycloak CA certificate and create the secret:
oc get secret -n keycloak keycloak-tls -o jsonpath='{.data.ca\.crt}' | base64 -d > /tmp/keycloak-ca.crt
kubectl create secret generic keycloak-ca-cert \
--namespace=cost-onprem \
--from-file=ca.crt=/tmp/keycloak-ca.crt
# 4. Database credentials (required — created by install script normally, or pre-created for BYOI)
# When using database.deploy: false (external database), you must create this secret manually:
kubectl create secret generic cost-onprem-db-credentials \
--namespace=cost-onprem \
--from-literal=postgres-user="admin" \
--from-literal=postgres-password="<admin_password>" \
--from-literal=ros-user="ros_user" \
--from-literal=ros-password="<ros_password>" \
--from-literal=kruize-user="kruize_user" \
--from-literal=kruize-password="<kruize_password>" \
--from-literal=koku-user="koku_user" \
--from-literal=koku-password="<koku_password>"
# Check all pods are running
kubectl get pods -n cost-onprem -l app.kubernetes.io/instance=cost-onprem
# Check PVCs are bound
kubectl get pvc -n cost-onprem
# Check routes are created with correct hostnames
oc get routes -n cost-onprem
Instead of passing many --set flags, you can create a values file:
# my-values.yaml: cluster-specific overrides for direct helm install
global:
  clusterDomain: "apps.mycluster.example.com"
  storageClass: "gp3-csi"
objectStorage:
  endpoint: "s3.us-east-1.amazonaws.com"
  port: 443
  useSSL: true
  secretName: "my-s3-credentials"
  s3:
    region: "us-east-1"
valkey:
  securityContext:
    fsGroup: 1000740000 # From namespace supplemental-groups annotation
jwtAuth:
  keycloak:
    installed: true
    url: "https://keycloak-keycloak.apps.mycluster.example.com"
    namespace: "keycloak"
For deployments using external infrastructure (see also docs/examples/byoi-values.yaml for a minimal overlay):
# my-byoi-values.yaml: external database and cache
global:
  clusterDomain: "apps.mycluster.example.com"
  storageClass: "gp3-csi"
database:
  deploy: false
  server:
    host: "my-postgres.example.com"
    port: 5432
    sslMode: require
  secretName: "cost-onprem-db-credentials"
valkey:
  deploy: false
  host: "my-redis.example.com"
  port: 6379
kafka:
  bootstrapServers: "my-kafka:9092"
  securityProtocol: "PLAINTEXT"
objectStorage:
  endpoint: "s3.us-east-1.amazonaws.com"
  port: 443
  useSSL: true
  secretName: "my-s3-credentials"
  s3:
    region: "us-east-1"
jwtAuth:
  keycloak:
    installed: true
    url: "https://keycloak-keycloak.apps.mycluster.example.com"
    namespace: "keycloak"
Then install:
helm install cost-onprem ./cost-onprem \
--namespace cost-onprem \
--create-namespace \
-f openshift-values.yaml \
-f my-values.yaml \
--wait
The chart requires S3-compatible object storage. ODF is not required; any S3 provider works. For full configuration details, see the Storage Configuration section.
Supported backends:
| Backend | Use Case | Auto-Detected |
|---|---|---|
| AWS S3 | Production (disconnected AWS) | No — configure in values.yaml |
| Direct Ceph RGW (ODF) | Production (OpenShift with ODF) | Yes — via OBC |
| S4 (Ceph RGW) | Development/Testing | Yes — via S3_ENDPOINT |
| NooBaa (ODF) | Fallback only | Yes — not recommended |
Choose your path:
Pre-create buckets, create a credentials secret, and configure values.yaml:
# 1. Create namespace
kubectl create namespace cost-onprem
# 2. Create credentials secret
kubectl create secret generic my-s3-credentials \
--namespace=cost-onprem \
--from-literal=access-key="<YOUR_ACCESS_KEY>" \
--from-literal=secret-key="<YOUR_SECRET_KEY>"
# 3. In your values.yaml:
objectStorage:
  endpoint: "s3.us-east-1.amazonaws.com" # Your S3 endpoint
  port: 443
  useSSL: true
  secretName: "my-s3-credentials"
  s3:
    region: "us-east-1"
The install script detects the pre-configured endpoint and skips all S3 auto-detection.
Create an ObjectBucketClaim and let the install script handle the rest:
# Create OBC for Direct Ceph RGW
cat <<EOF | oc apply -f -
apiVersion: objectbucket.io/v1alpha1
kind: ObjectBucketClaim
metadata:
  name: ros-data-ceph
  namespace: cost-onprem
spec:
  generateBucketName: ros-data-ceph
  storageClassName: ocs-storagecluster-ceph-rgw
EOF
oc wait --for=condition=Ready obc/ros-data-ceph -n cost-onprem --timeout=5m
The install script automatically detects the OBC, extracts configuration (endpoint, credentials, bucket name), and passes it to Helm. No values.yaml changes needed.
Note: Use Direct Ceph RGW (ocs-storagecluster-ceph-rgw) over NooBaa (ocs-storagecluster-ceph-rbd). NooBaa's eventual consistency causes 403 errors when reading freshly uploaded files.
# Deploy S4
./scripts/deploy-s4-test.sh cost-onprem
# Install with S4
S3_ENDPOINT=s4.cost-onprem.svc.cluster.local S3_PORT=7480 S3_USE_SSL=false \
./scripts/install-helm-chart.sh --namespace cost-onprem
The script creates credentials, buckets, and passes objectStorage.* values to Helm.
See S4 Development Setup Guide for details.
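When installing with helm directly, the same endpoint settings have to be passed by hand. The following sketch shows one way the three S3 settings could map onto objectStorage.* flags. `s3_helm_flags` is a hypothetical helper, not the install script's own code, and the fallback port and TLS values are the chart defaults documented in this guide.

```shell
# Print the --set flags for an S3 endpoint.
# Arguments: endpoint, port (default 443), use-ssl (default true).
s3_helm_flags() {
  endpoint=${1:-}
  port=${2:-443}
  use_ssl=${3:-true}
  if [ -z "$endpoint" ]; then
    echo "endpoint is required" >&2
    return 1
  fi
  printf '%s\n' "--set objectStorage.endpoint=$endpoint --set objectStorage.port=$port --set objectStorage.useSSL=$use_ssl"
}
```

For the S4 example above, `s3_helm_flags s4.cost-onprem.svc.cluster.local 7480 false` yields the flags for a non-TLS endpoint on port 7480. A credentials secret still has to be supplied through objectStorage.secretName.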
Security Best Practices:
- Use dedicated service accounts (not admin credentials)
- Rotate credentials regularly
- Use external secret management (Vault, Sealed Secrets) where possible
- Use least-privilege access (specific buckets only)
- Never commit credentials to version control
External Secret Management Example:
# Sealed Secrets
kubectl create secret generic my-s3-credentials \
--namespace=cost-onprem \
--from-literal=access-key="<key>" \
--from-literal=secret-key="<secret>" \
--dry-run=client -o yaml | \
kubeseal -o yaml > sealed-secret.yaml
Ensure you have permissions to:
- Create secrets in target namespace
- Deploy Helm charts
- Access S3 storage resources
- Create routes (OpenShift)
# Verify permissions
oc auth can-i create secrets -n cost-onprem
oc auth can-i create deployments -n cost-onprem
oc auth can-i create routes -n cost-onprem
Single Node OpenShift (SNO):
- SNO cluster with S3-compatible storage (ODF, S4, or external S3)
- 30GB+ block devices for persistent volumes
- Additional 6GB RAM for Cost Management On-Premise workloads
- Additional 2 CPU cores
See Configuration Guide for detailed requirements
Kafka is required for the Cost Management data pipeline (OCP metrics ingestion).
Automated Deployment (Recommended):
# Deploy AMQ Streams operator and Kafka cluster (KRaft mode)
./scripts/deploy-kafka.sh
# Script will:
# - Install AMQ Streams operator via OLM (channel: amq-streams-3.1.x)
# - Deploy Kafka 4.1.0 cluster in KRaft mode (no ZooKeeper)
# - Create separate controller and broker node pools with persistent JBOD storage
# - Verify OpenShift platform
# - Configure appropriate storage class
# - Wait for cluster to be ready
Customization:
# Custom namespace
KAFKA_NAMESPACE=my-kafka ./scripts/deploy-kafka.sh
# Custom Kafka cluster name
KAFKA_CLUSTER_NAME=my-cluster ./scripts/deploy-kafka.sh
# For OpenShift with specific storage class
STORAGE_CLASS=ocs-storagecluster-ceph-rbd ./scripts/deploy-kafka.sh
Manual Verification:
# Check AMQ Streams operator
oc get csv -A | grep amqstreams
# Check Kafka cluster and node pools
oc get kafka -n kafka
oc get kafkanodepool -n kafka
# Verify Kafka is ready
oc wait kafka/cost-onprem-kafka --for=condition=Ready --timeout=300s -n kafka
Required Kafka Topics:
- platform.upload.announce (created automatically by Koku on first message)
Using an existing Kafka cluster: If you already have a Kafka cluster (e.g., AMQ Streams, Confluent, MSK), you can skip the AMQ Streams deployment and configure kafka.bootstrapServers in your values file. Set KAFKA_BOOTSTRAP_SERVERS when running the install script to skip AMQ Streams verification. Only PLAINTEXT connections are currently supported. See External Kafka for details.
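A malformed bootstrap string is an easy mistake when pointing at an existing cluster. The sketch below checks that a value is a comma-separated list of host:port pairs before it is handed to the installer. `valid_bootstrap_servers` is a hypothetical helper, and the accepted character set is a simplifying assumption.

```shell
# Succeed only if the argument is a comma-separated list of host:port pairs,
# for example "my-kafka:9092" or "broker-0:9092,broker-1:9092".
valid_bootstrap_servers() {
  printf '%s\n' "${1:-}" |
    grep -Eq '^[A-Za-z0-9._-]+:[0-9]+(,[A-Za-z0-9._-]+:[0-9]+)*$'
}
```

A guard such as `valid_bootstrap_servers "$KAFKA_BOOTSTRAP_SERVERS" || exit 1` rejects values with a missing port or a URL scheme before any deployment starts.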
User Workload Monitoring must be enabled for Prometheus to scrape the ServiceMonitors deployed by this chart. Without it, the ROS data pipeline will not function: ServiceMonitors are created, but no metrics are collected.
Check if User Workload Monitoring is enabled:
# Check for prometheus-user-workload pods
oc get pods -n openshift-user-workload-monitoring
# If no pods are found, user workload monitoring is not enabled
Enable User Workload Monitoring:
cat <<EOF | oc apply -f -
apiVersion: v1
kind: ConfigMap
metadata:
  name: cluster-monitoring-config
  namespace: openshift-monitoring
data:
  config.yaml: |
    enableUserWorkload: true
EOF
Verify:
# Wait for prometheus-user-workload pods to start
oc get pods -n openshift-user-workload-monitoring -w
# Expected output: prometheus-user-workload-0, prometheus-user-workload-1, thanos-ruler-user-workload-*
Warning: Without User Workload Monitoring enabled, the deployment will appear successful (all pods running, ServiceMonitors created), but the ROS data pipeline will produce no metrics or recommendations. This is a silent failure, so always verify that the prometheus-user-workload pods are running before testing the data pipeline.
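Because this failure is silent, the check is worth automating. The sketch below counts Running prometheus-user-workload pods from a pod listing. `count_running_uwm_pods` and `uwm_enabled` are hypothetical helpers, and the parsing assumes the standard NAME, READY, STATUS column order of `oc get pods --no-headers`.

```shell
# Read `oc get pods --no-headers` output on stdin and print how many
# prometheus-user-workload pods are in the Running state.
count_running_uwm_pods() {
  awk '$1 ~ /^prometheus-user-workload-/ && $3 == "Running" { n++ } END { print n + 0 }'
}

# Succeed only if at least one prometheus-user-workload pod is Running.
uwm_enabled() {
  running=$(oc get pods -n openshift-user-workload-monitoring --no-headers 2>/dev/null |
    count_running_uwm_pods)
  [ "$running" -gt 0 ]
}
```

Running `uwm_enabled || echo "User Workload Monitoring is not running"` before the data pipeline tests surfaces the problem up front.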
# Upgrade to latest release automatically
./scripts/install-helm-chart.sh
# The script detects existing installations and performs upgrades
# Installs from the Helm chart repository by default
# Update repo index and upgrade to latest
helm repo update cost-onprem
helm upgrade cost-onprem cost-onprem/cost-onprem -n cost-onprem
# Upgrade to a specific version
helm upgrade cost-onprem cost-onprem/cost-onprem -n cost-onprem --version 0.2.9
# With custom values
helm upgrade cost-onprem cost-onprem/cost-onprem -n cost-onprem --values my-values.yaml
# Using script
export USE_LOCAL_CHART=true
./scripts/install-helm-chart.sh
# Direct Helm command
helm upgrade cost-onprem ./cost-onprem -n cost-onprem
Before upgrading:
- Check release notes for breaking changes
- Backup persistent data if needed
- Verify cluster resources are sufficient
- Test in non-production environment first
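Alongside any data backup, it can be useful to record the release's current values and revision history so a rollback target is known. The sketch below uses `helm get values` and `helm history`; `snapshot_release` and the output file names are hypothetical. It does not back up persistent data.

```shell
# Save the user-supplied values and revision history of a release.
# Arguments: release name, namespace, output directory.
snapshot_release() {
  release=$1
  namespace=$2
  outdir=$3
  mkdir -p "$outdir" || return 1
  helm get values "$release" -n "$namespace" -o yaml \
    > "$outdir/values-before-upgrade.yaml" || return 1
  helm history "$release" -n "$namespace" \
    > "$outdir/history-before-upgrade.txt" || return 1
  echo "Snapshot written to $outdir"
}
```

For example, `snapshot_release cost-onprem cost-onprem ./upgrade-backup` writes both files. If an upgrade misbehaves, `helm rollback cost-onprem <REVISION> -n cost-onprem` returns to a revision listed in the saved history.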
During upgrade:
- Helm performs rolling updates by default
- Some downtime may occur during database upgrades
- Monitor pod status: kubectl get pods -n cost-onprem -w
After upgrade:
# Verify upgrade
./scripts/install-helm-chart.sh status
# Run health checks
./scripts/install-helm-chart.sh health
# Check version
helm list -n cost-onprem
# Check Helm release
helm status cost-onprem -n cost-onprem
# Check all pods
kubectl get pods -n cost-onprem
# Wait for all pods to be ready
kubectl wait --for=condition=ready pod -l app.kubernetes.io/instance=cost-onprem -n cost-onprem --timeout=300s
# Run automated health checks
./scripts/install-helm-chart.sh health
# Test ingress endpoint
curl -k https://<route-host>/ready
# Check API endpoints
curl http://localhost:32061/api/ros/status
# Check persistent volume claims
kubectl get pvc -n cost-onprem
# Verify all PVCs are bound
kubectl get pvc -n cost-onprem --no-headers | grep -v Bound && echo "ISSUE: Unbound PVCs found" || echo "OK: All PVCs bound"
# Check storage class
kubectl get pvc -n cost-onprem -o jsonpath='{.items[*].spec.storageClassName}' | tr ' ' '\n' | sort -u
# Test database connections
kubectl exec -it deployment/cost-onprem-ros-api -n cost-onprem -- \
env | grep DATABASE_URL
# Test Kafka connectivity
kubectl exec -it statefulset/cost-onprem-kafka -n cost-onprem -- \
kafka-topics.sh --list --bootstrap-server localhost:29092
# Test S3 access (endpoint depends on your storage backend)
oc rsh -n cost-onprem deployment/cost-onprem-ingress -- \
aws s3 ls --endpoint-url https://<your-s3-endpoint>
Note: Resource allocations are aligned with the SaaS Clowder configuration from:
- Koku: deploy/clowdapp.yaml in insights-onprem/koku
- ROS: clowdapp.yaml in insights-onprem/ros-ocp-backend
| Component | Pods | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|---|
| PostgreSQL | 1 | 500m | 1000m | 1Gi | 2Gi |
| Valkey | 1 | 100m | 500m | 256Mi | 512Mi |
| Subtotal | 2 | 600m | 1.5 cores | 1.25 Gi | 2.5 Gi |
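The infrastructure subtotal follows from adding the per-component CPU figures: requests are 500m + 100m = 600m, and limits are 1000m + 500m = 1500m, or 1.5 cores. The sketch below does the same arithmetic for any list of CPU quantities, which helps when adjusting replica counts. `to_millicores` and `sum_millicores` are hypothetical helpers.

```shell
# Convert a Kubernetes CPU quantity ("500m", "1", "1.5") to millicores.
to_millicores() {
  case "$1" in
    *m) echo "${1%m}" ;;
    *)  awk -v cores="$1" 'BEGIN { printf "%d\n", cores * 1000 + 0.5 }' ;;
  esac
}

# Add up any number of CPU quantities and print the total in millicores.
sum_millicores() {
  total=0
  for quantity in "$@"; do
    total=$((total + $(to_millicores "$quantity")))
  done
  echo "$total"
}
```

For instance, `sum_millicores 500m 100m` gives the PostgreSQL plus Valkey request total in millicores.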
| Component | Pods | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|---|
| Koku API Reads | 1-2 | 250m each | 500m each | 512Mi each | 1Gi each |
| Koku API Writes | 1 | 250m | 500m | 512Mi | 1Gi |
| Koku API MASU | 1 | 50m | 100m | 500Mi | 700Mi |
| Koku Listener | 1 | 150m | 300m | 300Mi | 600Mi |
| Celery Beat | 1 | 50m | 100m | 200Mi | 400Mi |
| Celery Workers | 11-21 | 100m each | 200m each | 256Mi-512Mi | 400Mi-1Gi |
| ROS API | 1 | 500m | 1000m | 1Gi | 1Gi |
| ROS Processor | 1 | 500m | 1000m | 1Gi | 1Gi |
| ROS Poller | 1 | 500m | 1000m | 1Gi | 1Gi |
| ROS Housekeeper | 1 | 500m | 1000m | 1Gi | 1Gi |
| Kruize | 1-2 | 200m | 1000m | 1Gi | 2Gi |
| Subtotal | 18-28 | ~4-6 cores | ~8-12 cores | ~9-14 Gi | ~14-22 Gi |
| Scenario | Pods | CPU Request | CPU Limit | Memory Request | Memory Limit |
|---|---|---|---|---|---|
| OCP-Only (minimal) | ~24 | ~7.5 cores | ~15 cores | ~16 Gi | ~28 Gi |
| OCP on Cloud | ~34 | ~9 cores | ~18 cores | ~21 Gi | ~36 Gi |
Note: See Worker Deployment Scenarios for detailed worker requirements by scenario.
After installation, validate the complete data pipeline using the OCP dataflow test.
# Install Python dependencies (required for NISE data generation)
# Ubuntu/Debian
sudo apt-get install python3 python3-venv
# RHEL/CentOS/Fedora
sudo yum install python3 python3-venv
# macOS
brew install python3
Note: NISE (test data generator) is automatically installed in a Python virtual environment during test execution. No manual NISE installation is required.
# Option 1: Run pytest test suite (~3 minutes)
NAMESPACE=cost-onprem ./scripts/run-pytest.sh
# Option 2: Run specific test suites
./scripts/run-pytest.sh --e2e # E2E tests only
./scripts/run-pytest.sh --auth # Authentication tests
./scripts/run-pytest.sh --ros # ROS-specific tests
# Option 3: Full Cost Management E2E test (~3 minutes)
NAMESPACE=cost-onprem ./scripts/run-pytest.sh --e2e
- ✅ NISE Integration - Automatic installation and production-like data generation (73 lines)
- ✅ Data Upload - Generates realistic test data and uploads via JWT auth
- ✅ Ingress Processing - CSV file uploaded to S3
- ✅ ROS Processing - CSV downloaded from S3, parsed successfully (CRLF conversion)
- ✅ Kruize Integration - Recommendations generated with actual CPU/memory values
- ✅ ROS-Only Mode - Skips Koku processing for faster validation
- ✅ Preflight - Environment checks
- ✅ Provider - Creates OCP cost provider
- ✅ Data Upload - Generates and uploads test data (CSV → TAR.GZ → S3)
- ✅ Kafka - Publishes message to trigger processing
- ✅ Processing - CSV parsing and data ingestion
- ✅ Database - Validates data in PostgreSQL tables
- ✅ Aggregation - Summary table generation
- ✅ Validation - Verifies cost calculations
[SUCCESS] ===== ROS E2E Test Summary =====
Upload Status: ✅ HTTP 202 Accepted
Koku Processing: ⏭️ Skipped (ROS-only test)
ROS Processing: ✅ CSV downloaded and parsed successfully
Kruize Status: ✅ Recommendations generated
Recommendation details (short_term cost optimization):
experiment_name | interval_end_time | cpu_request | cpu_limit | memory_request | memory_limit
------------------------------------+-------------------+-------------+-----------+----------------+--------------
org1234567;test-cluster-1769027891 | 2026-01-21 20:00 | 1.78 cores | 1.78 cores| 3.64 GB | 3.64 GB
[SUCCESS] ✅ ROS-ONLY TEST PASSED!
[SUCCESS] Found 1 recommendation(s) for cluster test-cluster-1769027891
Test Duration: ~5 minutes
Pipeline Validated: Ingress → ROS → Kruize → Recommendations
✅ E2E SMOKE TEST PASSED
Phases: 8/8 passed
✅ preflight
✅ migrations
✅ kafka_validation
✅ provider
✅ data_upload
✅ processing
✅ database
✅ validation
Total Time: ~2-3 minutes
# Port-forward to PostgreSQL
kubectl port-forward -n cost-onprem pod/cost-onprem-database-0 5432:5432 &
# Query aggregated cost data
psql -h localhost -U koku -d costonprem_koku -c "
SELECT
cluster_id,
COUNT(*) as daily_rows,
SUM(pod_usage_cpu_core_hours) as total_cpu_usage,
SUM(pod_request_cpu_core_hours) as total_cpu_request
FROM reporting_ocpusagelineitem_daily_summary
WHERE cluster_id IS NOT NULL
GROUP BY cluster_id
LIMIT 5;
"
Missing prerequisites:
# Check required tools
which jq helm kubectl
# Install missing tools
sudo apt-get install jq # Ubuntu/Debian
brew install jq # macOS
Script permissions:
# Make executable
chmod +x scripts/install-helm-chart.sh
# Run with explicit bash
bash scripts/install-helm-chart.sh
# Test Helm repository connectivity
helm repo add cost-onprem https://insights-onprem.github.io/cost-onprem-chart
helm repo update cost-onprem
helm search repo cost-onprem
# If the repo add fails, verify the URL is reachable
curl -sI https://insights-onprem.github.io/cost-onprem-chart/index.yaml
# Check node resources
kubectl describe nodes | grep -A 5 "Allocated resources"
# Check available resources
kubectl top nodes # requires metrics-server
See Troubleshooting Guide for comprehensive solutions
After successful installation:
- Configure Access: See Configuration Guide
- Set Up JWT Auth: See JWT Authentication Guide
- Configure TLS: See TLS Setup Guide
- Run Tests: See Scripts Reference
Related Documentation: