
Reduce default control plane resources for cost optimization #4318

@KaranKumar3

Problem

Gatekeeper's controller and audit components each request 512Mi of memory by default. That default is appropriate for large clusters with many policies, but it adds meaningful cost to smaller deployments.

Current cost impact:

  • Controller: ~$49/year (100m CPU, 512Mi memory)
  • Audit: ~$49/year (100m CPU, 512Mi memory)
  • Total: ~$98/year per cluster
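
These figures imply roughly $240/year per vCPU and ~$49/year per GiB of requested memory; treat the rates as illustrative, since actual cloud pricing varies by provider and region:

# Unit rates implied by the figures above (illustrative; pricing varies)
# CPU:    100m  = 0.1 vCPU x ~$240/yr per vCPU   = ~$24.00/yr
# Memory: 512Mi = 0.5 GiB  x ~$49.16/yr per GiB  = ~$24.58/yr
# Per component: ~$48.58/yr  -> rounded to ~$49/yr
# Controller + audit: ~$97/yr -> rounded to ~$98/yr per cluster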

For organizations running Gatekeeper across multiple clusters:

  • 5 clusters: ~$490/year
  • 10 clusters: ~$980/year
  • 20 clusters: ~$1,960/year

Analysis

I analyzed Gatekeeper resource usage using Wozz:

Cost breakdown:

# gatekeeper-controller-manager
resources:
  requests:
    cpu: 100m       # $24/yr
    memory: 512Mi   # $24.58/yr

# gatekeeper-audit
resources:
  requests:
    cpu: 100m       # $24/yr
    memory: 512Mi   # $24.58/yr

Projected savings:

| Cluster Count | Current Annual Cost | Optimized (256Mi) | Savings |
|---------------|---------------------|-------------------|------------|
| 1 cluster     | $98                 | $49               | $49 (50%)  |
| 5 clusters    | $490                | $245              | $245 (50%) |
| 10 clusters   | $980                | $490              | $490 (50%) |
| 20 clusters   | $1,960              | $980              | $980 (50%) |

Proposal

Add deployment profiles based on cluster size and policy complexity:

# values.yaml

# Deployment profile selector (new)
deploymentProfile: "standard"  # Options: small, standard, large

# Resource profiles
resourceProfiles:
  # Small clusters (<100 nodes, <50 policies)
  small:
    controllerManager:
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          memory: 512Mi
    audit:
      resources:
        requests:
          cpu: 100m
          memory: 256Mi
        limits:
          memory: 512Mi
  
  # Standard clusters (current defaults)
  standard:
    controllerManager:
      resources:
        requests:
          cpu: 100m
          memory: 512Mi
        limits:
          memory: 512Mi
    audit:
      resources:
        requests:
          cpu: 100m
          memory: 512Mi
        limits:
          memory: 512Mi
  
  # Large clusters (100+ nodes, 100+ policies)
  large:
    controllerManager:
      resources:
        requests:
          cpu: 200m
          memory: 1024Mi
        limits:
          memory: 1024Mi
    audit:
      resources:
        requests:
          cpu: 200m
          memory: 1024Mi
        limits:
          memory: 1024Mi
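
One way the chart could resolve the selector is sketched below; the template path and precedence rule are assumptions for illustration, not the chart's actual structure:

# templates/gatekeeper-controller-manager-deployment.yaml (illustrative sketch)
# Resolve the selected profile, falling back to "standard"; an explicitly set
# controllerManager.resources value still wins over the profile.
{{- $profile := index .Values.resourceProfiles (.Values.deploymentProfile | default "standard") }}
resources:
  {{- toYaml (default $profile.controllerManager.resources .Values.controllerManager.resources) | nindent 2 }}

Defaulting to "standard" keeps rendered output identical for existing installs that never set deploymentProfile.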

Usage:

# Small (dev/staging, <100 nodes, <50 policies)
helm install gatekeeper gatekeeper/gatekeeper --set deploymentProfile=small

# Standard (current default, no change)
helm install gatekeeper gatekeeper/gatekeeper

# Large (production, 100+ nodes, complex policies)
helm install gatekeeper gatekeeper/gatekeeper --set deploymentProfile=large

# Custom override still works
helm install gatekeeper gatekeeper/gatekeeper \
  --set controllerManager.resources.requests.memory=384Mi
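
Rendered requests can also be checked without installing anything (assuming the profile wiring above lands in the deployment templates):

# Dry-run render to confirm what a profile produces
helm template gatekeeper gatekeeper/gatekeeper --set deploymentProfile=small \
  | grep -A 6 'resources:'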

Evidence

Typical Gatekeeper memory usage patterns:

  • Small clusters (<100 nodes, <50 policies): 128-256Mi for controller, 128-256Mi for audit
  • Medium clusters (100-500 nodes, 50-100 policies): 256-512Mi for controller, 256-512Mi for audit
  • Large clusters (500+ nodes, 100+ policies): 512Mi-1Gi for controller, 512Mi-1Gi for audit
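
These ranges can be validated per cluster before selecting a profile (requires metrics-server):

# Compare observed usage with the proposed profile requests
kubectl top pods -n gatekeeper-system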

The "small" profile (256Mi) is appropriate for:

  • Development/staging clusters
  • Clusters with standard policy sets (<50 policies)
  • Smaller production clusters (<100 nodes)

This represents a significant portion of Gatekeeper deployments, especially in multi-cluster environments where dev/staging clusters don't need production-scale resources.

Impact

For the Gatekeeper community:

  • CNCF graduated project - demonstrates cost-conscious best practices
  • Multi-cluster friendly - teams typically run Gatekeeper across several environments
  • Lower adoption barrier - cheaper to deploy in dev/staging for testing
  • Cost transparency - profiles clearly signal expected resource usage

Additional Context

As a CNCF graduated project and the de facto Kubernetes policy engine, Gatekeeper is widely deployed across multiple clusters (dev, staging, prod). Optimizing per-cluster costs has a significant aggregate impact across the community.

Similar resource profile patterns are being adopted by other CNCF infrastructure projects to acknowledge different deployment scales and use cases.
