Hardcoded Leader Election settings cause excessive Kube API Server traffic (Lease spam) #3898

@piby180

Description

Describe the bug
The Argo Events controller and EventSource pods generate a high volume of coordination.k8s.io Lease renewal requests. For EventSources, leaseDurationSeconds is hardcoded to 5 seconds, which leads to a renewal request roughly every 2 seconds. For the Event Controller, leaseDurationSeconds is 15 seconds. The resulting load on the Kubernetes API server is heavy enough to act like a denial of service.

I tried to set custom values through environment variables. This works with the Argo Workflows Helm chart but not with Argo Events: the environment variables are present in the event controller and event source pods, but they are not used.

Event Controller (helm chart)

controller:
  replicas: 3
  env: # Env does not work
    - name: LEADER_ELECTION_LEASE_DURATION
      value: "120s" # Default is 15s
    - name: LEADER_ELECTION_RENEW_DEADLINE
      value: "100s" # Default is 10s
    - name: LEADER_ELECTION_RETRY_PERIOD
      value: "30s" # Default is 2s

Event Source

apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: argo-events-event-source-mks-s3-event
spec:
  replicas: 3
  template:
    serviceAccountName: argo-events-controller-manager
    nodeSelector:
      workload-type: ${ARGO_EVENTS_WORKLOAD_TYPE}
    container: # Note: This must be 'container', not 'containers'
      env:
        - name: LEADER_ELECTION_LEASE_DURATION
          value: "120s"
        - name: LEADER_ELECTION_RENEW_DEADLINE
          value: "100s"
        - name: LEADER_ELECTION_RETRY_PERIOD
          value: "30s"
  eventBusName: argo-events-event-bus-mks
  kafka:
    s3-event:
      url: ${AWS_MSK_BOOTSTRAP_BROKERS}
      topic: ${AWS_MSK_ARGO_EVENTS_S3_TOPIC}
      jsonBody: true
      tls:
        insecureSkipVerify: true # https://github.com/argoproj/argo-events/issues/1277
      sasl:
        mechanism: SCRAM-SHA-512
        passwordSecret:
          name: argo-events-event-bus-mks-secret
          key: password
        userSecret:
          name: argo-events-event-bus-mks-secret
          key: user
      connectionBackoff:
        duration: 10s
        steps: 5
        factor: 2
        jitter: 0.2
      consumerGroup:
        groupName: argo-events-event-source-mks-s3-event
        oldest: false
        rebalanceStrategy: range
      limitEventsPerSecond: 1
      version: "3.8.0"

To Reproduce
Deploy the event controller and a Kafka event source. Then run
kubectl get lease -n argo and check "Lease Duration Seconds".

Expected behavior
Argo Events should honor these environment variables and adjust the Lease settings accordingly, just like Argo Workflows does.

Screenshots
None

Environment (please complete the following information):

  • Kubernetes: [v1.33.0]
  • Argo: [3.6]
  • Argo Events: [e.g. v1.9.10]



Message from the maintainers:

If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.

Labels: bug