Description
We are seeing the Grafana operator crash because of leader election lease failures. The upstream Grafana operator does not appear to recommend using leader election, as per line 5 here:
https://github.com/grafana/grafana-operator/blob/master/deploy/helm/grafana-operator/values.yaml
It appears Broadcom enables leader election by default, even with a single replica. I'm trying to find out how to set leaderElect to false, but it is not clear how. Because this is a supervisor service, the only route we have is to edit the values.yaml, and every attempt so far has failed. I looked at the Carvel app directly but can't see a values schema, and the supervisor documentation only has a reference for setting image.
Here are the logs. Initially the lease is acquired:
2025-04-21T07:48:00Z INFO starting server {"name": "health probe", "addr": "0.0.0.0:8081"}
I0421 07:48:00.519507 1 leaderelection.go:254] attempting to acquire leader lease svc-grafana-operator-domain-c92/f75f3bba.integreatly.org...
2025-04-21T07:48:10Z INFO GrafanaFolderReconciler folder sync complete
2025-04-21T07:48:10Z INFO GrafanaDashboardReconciler dashboard sync complete
2025-04-21T07:48:10Z INFO GrafanaDatasourceReconciler datasources sync complete
I0421 07:48:15.664768 1 leaderelection.go:268] successfully acquired lease svc-grafana-operator-domain-c92/f75f3bba.integreatly.org
Then, seconds later, the lease is lost when the operator tries to renew it:
2025-04-21T07:48:15Z INFO GrafanaReconciler running stage {"controller": "grafana", "controllerGroup": "grafana.integreatly.org", "controllerKind": "Grafana", "Grafana": {"name":"grafana","namespace":"infra-observability"}, "namespace": "infra-observability", "name": "grafana", "reconcileID": "938b9a58-7bdb-4500-8b5d-298b3a04ffbe", "stage": "admin user"}
2025-04-21T07:48:15Z INFO GrafanaFolderReconciler found matching Grafana instances for folder {"controller": "grafanafolder", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaFolder", "GrafanaFolder": {"name":"nginx","namespace":"infra-observability"}, "namespace": "infra-observability", "name": "nginx", "reconcileID": "633aa7be-a17a-4b06-9e6f-4d8351a03de1", "count": 1}
2025-04-21T07:48:15Z INFO GrafanaDatasourceReconciler found matching Grafana instances for datasource {"controller": "grafanadatasource", "controllerGroup": "grafana.integreatly.org", "controllerKind": "GrafanaDatasource", "GrafanaDatasource": {"name":"preprod-corp-dev-test-prometheus","namespace":"infra-observability"}, "namespace": "infra-observability", "name": "preprod-corp-dev-test-prometheus", "reconcileID": "683a6049-0b87-423d-9cf9-7b041ef0f585", "count": 1}
E0421 07:48:27.674634 1 leaderelection.go:429] Failed to update lock optimitically: Put "https://10.91.0.1:443/apis/coordination.k8s.io/v1/namespaces/svc-grafana-operator-domain-c92/leases/f75f3bba.integreatly.org": context deadline exceeded, falling back to slow path
E0421 07:48:27.674820 1 leaderelection.go:436] error retrieving resource lock svc-grafana-operator-domain-c92/f75f3bba.integreatly.org: client rate limiter Wait returned an error: context deadline exceeded
I0421 07:48:27.674858 1 leaderelection.go:297] failed to renew lease svc-grafana-operator-domain-c92/f75f3bba.integreatly.org: timed out waiting for the condition
2025-04-21T07:48:27Z ERROR setup problem running manager {"version": "v5.15.0", "error": "leader election lost"}
main.main
github.com/grafana/grafana-operator/v5/main.go:268
runtime.main
runtime/proc.go:271
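To put a number on "seconds later", here is a minimal Python sketch that reads the two leaderelection.go lines from the logs above and computes how long the lease was held before the renewal failed. The helper names (`parse_klog`, `lease_hold_seconds`) and the regular expression are my own illustration, not part of the operator or client-go, and the sketch assumes both lines come from the same day.

```python
import re
from datetime import datetime

# klog header: severity letter, MMDD, HH:MM:SS.micros, thread id, file:line]
# This pattern only targets leaderelection.go lines (an assumption for this sketch).
KLOG = re.compile(
    r"^[IWEF](\d{4}) (\d{2}:\d{2}:\d{2}\.\d{6})\s+\d+ leaderelection\.go:\d+\] (.*)$"
)

SAMPLE = [
    "I0421 07:48:15.664768 1 leaderelection.go:268] successfully acquired lease "
    "svc-grafana-operator-domain-c92/f75f3bba.integreatly.org",
    "I0421 07:48:27.674858 1 leaderelection.go:297] failed to renew lease "
    "svc-grafana-operator-domain-c92/f75f3bba.integreatly.org: "
    "timed out waiting for the condition",
]


def parse_klog(line):
    """Return (timestamp, message) for a leaderelection.go line, else None."""
    m = KLOG.match(line)
    if m is None:
        return None
    # klog omits the year, so strptime falls back to 1900; only the delta matters here.
    ts = datetime.strptime(f"{m.group(1)} {m.group(2)}", "%m%d %H:%M:%S.%f")
    return ts, m.group(3)


def lease_hold_seconds(lines):
    """Seconds between 'successfully acquired lease' and 'failed to renew lease'."""
    acquired = lost = None
    for line in lines:
        parsed = parse_klog(line)
        if parsed is None:
            continue
        ts, msg = parsed
        if msg.startswith("successfully acquired lease"):
            acquired = ts
        elif msg.startswith("failed to renew lease") and acquired is not None:
            lost = ts
            break
    if acquired is None or lost is None:
        return None
    return (lost - acquired).total_seconds()


if __name__ == "__main__":
    held = lease_hold_seconds(SAMPLE)
    print(f"lease held for {held:.2f}s")  # prints "lease held for 12.01s"
```

For the lines above this gives roughly 12 seconds between acquiring the lease and losing it. The two errors immediately before the failure are both "context deadline exceeded" on the lease update against the API server, so the renewal appears to have timed out rather than been rejected.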