Possible memory leak in controller - grows from ~140Mi to 8.8GB over 127 days #4571

@jacobk-papaya

Checklist

  • I've included steps to reproduce the bug.
  • I've included the version of argo rollouts.

Describe the bug

The Argo Rollouts controller appears to leak memory: consumption grows continuously over time. After 127 days of uptime, the leader controller pod was using 8.8GB of memory (8877Mi). After a restart, the new leader pod uses only ~140Mi.

Environment

  • Argo Rollouts version: v1.8.3
  • Kubernetes version: v1.33 (EKS)
  • HA mode: Yes (2 replicas with leader election)

Observed Behavior

Pod                             Role     Uptime    Memory
argo-rollouts-7ccd66b9d5-g4rvs  Leader   127 days  8877Mi
argo-rollouts-7ccd66b9d5-78v26  Standby  110 days  34Mi

After restart:

Pod                             Role     Uptime      Memory
argo-rollouts-6f589bd668-9hkn6  Leader   ~1.5 hours  142Mi
argo-rollouts-6f589bd668-schtq  Standby  ~1.5 hours  31Mi
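
For a rough sense of the growth rate, the figures above can be turned into a per-day number. This is only a back-of-the-envelope sketch: it assumes the old leader started near the post-restart baseline (142Mi) and grew linearly, neither of which was measured. The function name is purely illustrative.

```python
def growth_rate_mi_per_day(start_mi: float, end_mi: float, days: float) -> float:
    """Average memory growth in Mi per day, assuming linear growth."""
    if days <= 0:
        raise ValueError("days must be positive")
    return (end_mi - start_mi) / days


# Figures from the tables above. The 142Mi baseline is an assumption:
# it is the new leader after restart, not the old leader's starting value.
rate = growth_rate_mi_per_day(start_mi=142, end_mi=8877, days=127)
print(f"{rate:.1f} Mi/day")  # about 68.8 Mi/day
```

If growth is not actually linear (for example, if it tracks deployment activity), this average will hide that, so a time series from monitoring would be more informative.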

Cluster Details

  • Rollouts: 25
  • AnalysisRuns: 135 (active)
  • Deployment frequency: High (continuous deployments throughout the day)

Controller Configuration

--healthzPort=8080
--metricsport=8090
--loglevel=info
--logformat=text
--kloglevel=0
--leader-elect
--aws-verify-target-group
--rollout-threads=20
--analysis-threads=30

No resource limits were configured (empty resources: {}).

Steps to Reproduce

  1. Deploy Argo Rollouts v1.8.3 in HA mode (2 replicas)
  2. Configure multiple rollouts with analysis templates
  3. Perform frequent deployments over weeks/months
  4. Monitor memory consumption of the leader pod
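
Step 4 can be done with any pod-level monitoring. One option is to scrape the controller's metrics port (8090 in the configuration above). Below is a minimal sketch of that, assuming the endpoint serves Prometheus text format at /metrics and includes the `process_resident_memory_bytes` gauge that the Prometheus client libraries commonly export; I have not confirmed which process-level metrics the controller actually exposes. The localhost URL is hypothetical and assumes a port-forward to the leader pod.

```python
import urllib.request
from typing import Optional

MI = 1024 * 1024  # bytes per Mi


def read_gauge(metrics_text: str, name: str) -> Optional[float]:
    """Return the value of an unlabelled metric from Prometheus text format."""
    for line in metrics_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        parts = line.split()
        if len(parts) >= 2 and parts[0] == name:
            return float(parts[1])
    return None  # metric not present in this scrape


def fetch_rss_mi(url: str = "http://localhost:8090/metrics",
                 timeout: float = 5.0) -> Optional[float]:
    """Scrape the endpoint once and return resident memory in Mi, if exported."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        text = resp.read().decode("utf-8")
    value = read_gauge(text, "process_resident_memory_bytes")
    return None if value is None else value / MI


# Offline example using a made-up scrape, so no cluster is needed to try it.
sample = """\
# HELP process_resident_memory_bytes Resident memory size in bytes.
# TYPE process_resident_memory_bytes gauge
process_resident_memory_bytes 1.48897792e+08
"""
sample_rss = read_gauge(sample, "process_resident_memory_bytes")
print(sample_rss / MI if sample_rss is not None else "metric not found")
```

Calling `fetch_rss_mi()` on a schedule and logging the result with a timestamp would show whether growth is steady or tied to deployment activity.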

Expected Behavior

Memory usage should remain stable over time rather than growing without bound.

Additional Context

  • The standby pod (non-leader) maintains low memory (~31-34Mi) regardless of uptime
  • Only the leader pod exhibits the memory growth
  • A similar issue was reported in "High CPU/Mem usage with Argo" (#2443), which was closed without resolution
  • The dashboard component shows the same leak pattern (15GB after 110 days)
