Commit dd1800d

Update troubleshoot-collection.md (#5014)
updated some troubleshooting steps
1 parent 7ac96a8 commit dd1800d

1 file changed

+22
-2
lines changed

docs/send-data/kubernetes/troubleshoot-collection.md

Lines changed: 22 additions & 2 deletions
@@ -912,9 +912,29 @@ Delete the pod forcefully by adding `--force --grace-period=0` to the `kubectl d
 
 If you are running the out of the box rancher monitoring setup, you cannot run our Prometheus operator alongside it. The Rancher Prometheus Operator setup will actually kill and permanently terminate our Prometheus Operator instance and will prevent the metrics system from coming up. If you have the Rancher prometheus operator setup running, they will have to use the UI to disable it before they can install our collection process.
 
-### unmarshal errors: field collector_selector not found in type config.Config
+### Incorrect CRDs
+
+If you receive errors similar to below, this typically points to a schema (CRD) that’s out of date. Ensure you have the correct CRDs applied in the cluster.
+
+```
+unmarshal errors: field collector_selector not found in type config.Config
+```
+
+### HorizontalPodAutoscaler (Metrics Server Disabled)
+
+If you receive warning events similar to below, this typically means that the HorizontalPodAutoscaler (HPA) cannot connect to the metrics-server or the metrics-server is disabled.
+
+```
+Warning FailedGetResourceMetric horizontalpodautoscaler/sumo-logic-sumologic-otelcol-metrics failed to get cpu utilization: unable to get metrics for resource cpu: unable to fetch metrics from resource metrics API: the server could not find the requested resource (get pods.metrics.k8s.io)
+```
+
+To resolve this, you can try enabling the metrics-server manually in the helm chart configuration:
+
+```yaml
+metrics-server:
+  enabled: true
+```
 
-This typically points to a schema (CRD) that’s out of date. Ensure you have the correct CRDs applied in the cluster.
 
 ### Falco and Google Kubernetes Engine (GKE)
 
