Description
Describe the bug
With the EMQX CRD version v2beta1, the operator controller can hit a nil pointer dereference when updating the Replicas status if `spec.coreTemplate.spec.replicas` is not defined in the EMQX resource.
Logs
{"level":"info","ts":"2025-01-12T17:42:42Z","msg":"Starting workers","controller":"rebalance","controllerGroup":"apps.emqx.io","controllerKind":"Rebalance","worker count":1}
{"level":"info","ts":"2025-01-12T17:42:42Z","msg":"Starting workers","controller":"emqx","controllerGroup":"apps.emqx.io","controllerKind":"EMQX","worker count":1}
{"level":"info","ts":"2025-01-12T17:42:43Z","msg":"Observed a panic in reconciler: runtime error: invalid memory address or nil pointer dereference","controller":"emqx","controllerGroup":"apps.emqx.io","controllerKind":"EMQX","EMQX":{"name":"emqx","namespace":"emqx"},"namespace":"emqx","name":"emqx","reconcileID":"eaeaed64-f44c-420c-81e2-8e1613bbeb4c"}
panic: runtime error: invalid memory address or nil pointer dereference [recovered]
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x177dc84]
goroutine 248 [running]:
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile.func1()
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:116 +0x1e5
panic({0x195d980?, 0x2c00a90?})
/usr/local/go/src/runtime/panic.go:770 +0x132
github.com/emqx/emqx-operator/controllers/apps/v2beta1.(*updateStatus).reconcile(0xc00049e020?, {0x1e76b08?, 0xc00098a690?}, {{0xc0001c5830?, 0xc00058cc08?}, 0xc00098a690?}, 0xc00058cc08?, {0x0?, 0x0?})
/workspace/controllers/apps/v2beta1/update_emqx_status.go:24 +0x44
github.com/emqx/emqx-operator/controllers/apps/v2beta1.(*EMQXReconciler).Reconcile(0xc000429f50, {0x1e76b08, 0xc00098a690}, {{{0xc0009afaac?, 0x5?}, {0xc0009afaa8?, 0xc0006a9d10?}}})
/workspace/controllers/apps/v2beta1/emqx_controller.go:137 +0x7c3
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Reconcile(0x1e7b088?, {0x1e76b08?, 0xc00098a690?}, {{{0xc0009afaac?, 0xb?}, {0xc0009afaa8?, 0x0?}}})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:119 +0xb7
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler(0xc0002f6960, {0x1e76b40, 0xc0000507d0}, {0x1a0de80, 0xc0000ac4e0})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:316 +0x3bc
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem(0xc0002f6960, {0x1e76b40, 0xc0000507d0})
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:266 +0x1be
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2()
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:227 +0x79
created by sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2 in goroutine 145
/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.17.2/pkg/internal/controller/controller.go:223 +0x50c
To Reproduce
- Deploy the EMQX Operator as described in the official docs.
- Create an EMQX CRD without defining `spec.coreTemplate.spec.replicas`:

      ---
      apiVersion: apps.emqx.io/v2beta1
      kind: EMQX
      metadata:
        name: emqx
        namespace: emqx
      spec:
        image: emqx:5
        coreTemplate:
          metadata:
            labels:
              app: emqx

- Observe that the operator crashes due to a nil pointer dereference.
Expected Behavior
An EMQX instance should be deployed successfully based on the CRD without crashing the operator controller. The operator should handle the absence of `replicas` gracefully.
The following manifest resolves the issue:

    apiVersion: apps.emqx.io/v2beta1
    kind: EMQX
    metadata:
      name: emqx
      namespace: emqx
    spec:
      image: emqx:5
      coreTemplate:
        spec:
          replicas: 1

Additional Information
The issue is likely caused by the lack of a nil check on `instance.Spec.CoreTemplate.Spec.Replicas` in the controller code, specifically at this line (the stack trace above points to `update_emqx_status.go:24`):

    instance.Status.CoreNodesStatus.Replicas = *instance.Spec.CoreTemplate.Spec.Replicas

This statement dereferences the `Replicas` field, which panics if it is not specified.
To fix this:
- The controller should ensure that `instance.Spec.CoreTemplate.Spec.Replicas` is checked for `nil` before dereferencing.
- Alternatively, an admission webhook could be added to ensure that a valid value for `replicas` is provided during CRD creation.
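
To make the two options above concrete, here is a minimal sketch in Go. It is not the operator's actual code: `coreSpec`, `replicasOrDefault`, and `validateReplicas` are hypothetical names that stand in for the real types, and the fallback value passed in `main` is arbitrary, not the operator's documented default.

```go
package main

import (
	"errors"
	"fmt"
)

// coreSpec is a hypothetical stand-in for the core template spec;
// only the optional Replicas pointer matters for this sketch.
type coreSpec struct {
	Replicas *int32
}

// replicasOrDefault returns the value behind p, or fallback when p is nil,
// so the caller never dereferences a nil pointer.
func replicasOrDefault(p *int32, fallback int32) int32 {
	if p == nil {
		return fallback
	}
	return *p
}

// validateReplicas mimics the kind of check an admission webhook could run:
// it rejects a spec whose replicas field is missing or negative.
func validateReplicas(spec coreSpec) error {
	if spec.Replicas == nil {
		return errors.New("spec.coreTemplate.spec.replicas must be set")
	}
	if *spec.Replicas < 0 {
		return errors.New("spec.coreTemplate.spec.replicas must not be negative")
	}
	return nil
}

func main() {
	three := int32(3)
	unset := coreSpec{}                // replicas omitted, as in the reproduction manifest
	set := coreSpec{Replicas: &three} // replicas given explicitly

	// Option 1: guard the dereference (the fallback of 1 is an assumption).
	fmt.Println(replicasOrDefault(unset.Replicas, 1))
	fmt.Println(replicasOrDefault(set.Replicas, 1))

	// Option 2: reject the object up front instead of defaulting.
	fmt.Println(validateReplicas(unset))
	fmt.Println(validateReplicas(set))
}
```

In the controller, the first option would amount to replacing the direct `*instance.Spec.CoreTemplate.Spec.Replicas` dereference with a guarded read. Whether to default silently or to reject the resource is a design choice for the maintainers.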
Environment Details:
- Kubernetes version: 1.32.0
- Cloud provider/provisioner: Self Hosted Kubernetes Cluster
- EMQX Operator version: 2.2.26
- Installation method: Helm via CICD