Description
Describe the bug
The Argo Events Controller and EventSource pods generate a high volume of coordination.k8s.io Lease renewal requests. For EventSources, leaseDurationSeconds is hardcoded to 5 seconds, which leads to a renewal request roughly every 2 seconds. For the Event Controller, leaseDurationSeconds is 15 seconds. The resulting load on the Kubernetes API server is heavy enough to act as a denial of service.
I tried to set custom values through environment variables. This works with the Argo Workflows Helm chart but not with Argo Events: the variables are present in the event controller and event source pods, but they are not used.
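To put the renewal interval described above in perspective, here is a rough back-of-the-envelope sketch. It assumes each replica sends about one Lease request per retry period (the leader to renew, the standbys to check the lease); that assumption and the function name are illustrative and not taken from the Argo Events code.

```python
def lease_requests_per_hour(replicas: int, retry_period_s: float) -> float:
    """Estimate Lease API requests per hour for one leader-elected workload.

    Assumption: every replica issues roughly one request per retry period.
    """
    if retry_period_s <= 0:
        raise ValueError("retry_period_s must be positive")
    return replicas * 3600 / retry_period_s


# 3 replicas at the observed ~2s interval vs. the requested 30s retry period.
current = lease_requests_per_hour(replicas=3, retry_period_s=2)
requested = lease_requests_per_hour(replicas=3, retry_period_s=30)
print(f"current: {current:.0f}/h, requested: {requested:.0f}/h")
```

Under that assumption, a single 3-replica EventSource drops from about 5400 to about 360 requests per hour, and the saving multiplies with the number of EventSources in the cluster.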
Event Controller (helm chart)
controller:
  replicas: 3
  env: # Env does not work
    - name: LEADER_ELECTION_LEASE_DURATION
      value: "120s" # Default is 15s
    - name: LEADER_ELECTION_RENEW_DEADLINE
      value: "100s" # Default is 10s
    - name: LEADER_ELECTION_RETRY_PERIOD
      value: "30s" # Default is 2s
Event Source
apiVersion: argoproj.io/v1alpha1
kind: EventSource
metadata:
  name: argo-events-event-source-mks-s3-event
spec:
  replicas: 3
  template:
    serviceAccountName: argo-events-controller-manager
    nodeSelector:
      workload-type: ${ARGO_EVENTS_WORKLOAD_TYPE}
    container: # Note: This must be 'container', not 'containers'
      env:
        - name: LEADER_ELECTION_LEASE_DURATION
          value: "120s"
        - name: LEADER_ELECTION_RENEW_DEADLINE
          value: "100s"
        - name: LEADER_ELECTION_RETRY_PERIOD
          value: "30s"
  eventBusName: argo-events-event-bus-mks
  kafka:
    s3-event:
      url: ${AWS_MSK_BOOTSTRAP_BROKERS}
      topic: ${AWS_MSK_ARGO_EVENTS_S3_TOPIC}
      jsonBody: true
      tls:
        insecureSkipVerify: true # https://github.com/argoproj/argo-events/issues/1277
      sasl:
        mechanism: SCRAM-SHA-512
        passwordSecret:
          name: argo-events-event-bus-mks-secret
          key: password
        userSecret:
          name: argo-events-event-bus-mks-secret
          key: user
      connectionBackoff:
        duration: 10s
        steps: 5
        factor: 2
        jitter: 0.2
      consumerGroup:
        groupName: argo-events-event-source-mks-s3-event
        oldest: false
        rebalanceStrategy: range
      limitEventsPerSecond: 1
      version: "3.8.0"
To Reproduce
Deploy the event controller and a Kafka event source, then run
kubectl get lease -n argo and check the "Lease Duration Seconds" value of each lease.
Expected behavior
Argo Events should honor these environment variables and adjust the lease settings accordingly, just as Argo Workflows does.
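A minimal sketch of the requested behavior, written in Python for brevity rather than in the Go of the actual controller. The helper names are hypothetical. The defaults are the 15s/10s/2s values noted in the chart comments above, only plain second values such as "120s" are parsed, and the two sanity checks follow the ordering that Kubernetes client-go leader election is understood to enforce (lease duration > renew deadline > 1.2 × retry period).

```python
import os

# Defaults as noted in the Helm values comments of this report.
DEFAULTS = {
    "LEADER_ELECTION_LEASE_DURATION": 15.0,
    "LEADER_ELECTION_RENEW_DEADLINE": 10.0,
    "LEADER_ELECTION_RETRY_PERIOD": 2.0,
}
JITTER_FACTOR = 1.2  # assumed to match client-go's leader election jitter


def parse_seconds(value: str) -> float:
    """Parse a duration such as "120s"; other units are out of scope here."""
    if not value.endswith("s"):
        raise ValueError(f"unsupported duration: {value!r}")
    return float(value[:-1])


def leader_election_settings(env=None) -> dict:
    """Read the three settings from env, falling back to the defaults."""
    env = os.environ if env is None else env
    settings = {
        name: parse_seconds(env[name]) if name in env else default
        for name, default in DEFAULTS.items()
    }
    lease = settings["LEADER_ELECTION_LEASE_DURATION"]
    renew = settings["LEADER_ELECTION_RENEW_DEADLINE"]
    retry = settings["LEADER_ELECTION_RETRY_PERIOD"]
    if lease <= renew:
        raise ValueError("lease duration must be greater than renew deadline")
    if renew <= JITTER_FACTOR * retry:
        raise ValueError("renew deadline must be greater than 1.2 x retry period")
    return settings
```

With the values from this report (120s, 100s, 30s) both checks pass, so they would be a valid override if the variables were read.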
Screenshots
None
Environment (please complete the following information):
- Kubernetes: [v1.33.0]
- Argo: [3.6]
- Argo Events: [e.g. v1.9.10]
Message from the maintainers:
If you wish to see this enhancement implemented please add a 👍 reaction to this issue! We often sort issues this way to know what to prioritize.