docs/en/ingest-management/troubleshooting/troubleshooting.asciidoc
+123 -2 (123 additions & 2 deletions)
@@ -833,6 +833,127 @@ To resolve this, either install {agent} without the `--unprivileged` flag so tha
833
833
834
834
[discrete]
835
835
[[agent-kubernetes-kustomize]]
836
-
== Problems installing Elastic Agent on Kubernetes through `kustomize`
836
+
== Troubleshoot {agent} Installation on Kubernetes, with Kustomize
837
837
838
-
TBD :)
838
+
Potential issues during {agent} installation on Kubernetes can be categorized into two main areas:
839
+
840
+
. <<agent-kustomize-manifest>>.
841
+
. <<agent-kustomize-after>>.
842
+
843
+
[discrete]
844
+
[[agent-kustomize-manifest]]
845
+
=== Problems related to the creation of objects within the manifest
846
+
847
+
When troubleshooting installations performed with https://github.com/kubernetes-sigs/kustomize[Kustomize], it's always a good practice to inspect the output of the rendered manifest. To do this, take the installation command provided by Kibana Onboarding and replace the final part, `| kubectl apply -f-`, with a redirection to a local file. This allows for easier analysis of the rendered output.
848
+
849
+
For example, the following command, originally provided by Kibana for an {agent} Standalone installation, has been modified to redirect the output for troubleshooting purposes:
The previous command generates a local file named `elastic_agent_installation_complete_manifest.yaml`, which you can use for further analysis. It contains the complete set of resources required for the {agent} installation, including:
857
+
858
+
* RBAC objects (`ServiceAccounts`, `Roles`, etc.)
859
+
860
+
* `ConfigMaps` and `Secrets` for {agent} configuration.
861
+
862
+
* {agent} Standalone deployed as a `DaemonSet`
863
+
864
+
* https://github.com/kubernetes/kube-state-metrics[Kube-state-metrics] deployed as a `Deployment`.
865
+
866
+
The content of this file is equivalent to the one obtained when following the "<<running-on-kubernetes-standalone>>" document, with the exception of `kube-state-metrics`, which is not included in the other method.
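A quick way to confirm that the rendered file contains the resource types listed above is to count its top-level `kind:` entries. The following is a minimal sketch, not part of the official instructions; the function name `manifest_kinds` is hypothetical, and it assumes each resource declares `kind:` at the start of a line, as Kustomize output normally does.

```sh
# Summarize the resource kinds found in a rendered manifest.
# Only top-level "kind:" fields are counted; indented ones
# (for example inside roleRef or subjects) are ignored.
manifest_kinds() {
  grep -E '^kind:' "$1" | awk '{print $2}' | sort | uniq -c | awk '{print $2 " " $1}'
}
```

For example, `manifest_kinds elastic_agent_installation_complete_manifest.yaml` prints one line per kind with the number of objects of that kind.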
867
+
868
+
Possible issues:
869
+
870
+
* If your user doesn't have *cluster-admin* privileges, the creation of the RBAC resources might fail.
871
+
872
+
* Some Kubernetes security mechanisms (like https://kubernetes.io/docs/concepts/security/pod-security-standards/[Pod Security Standards]) could cause part of the manifest to be rejected, as `hostNetwork` access and `hostPath` volumes are required.
873
+
874
+
* If you already have an installation of `kube-state-metrics`, part of the manifest installation could fail, or your existing resources could be updated without notice.
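Some of the issues above can be checked before applying the manifest. The following is a rough sketch, not an official procedure: the function names are hypothetical, and it assumes the manifest creates cluster-scoped RBAC objects (`ClusterRole`, `ClusterRoleBinding`) and that an existing `kube-state-metrics` runs as a `Deployment` with that name.

```sh
# Can the current user create cluster-scoped RBAC objects?
# "kubectl auth can-i" prints yes/no and exits non-zero when the answer is no.
can_create_rbac() {
  kubectl auth can-i create clusterroles &&
    kubectl auth can-i create clusterrolebindings
}

# Is kube-state-metrics already deployed in any namespace?
find_existing_ksm() {
  kubectl get deployments --all-namespaces | grep kube-state-metrics
}
```

Running `kubectl apply --dry-run=server -f elastic_agent_installation_complete_manifest.yaml` may also surface validation or admission errors and warnings without creating any resources.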
875
+
876
+
[discrete]
877
+
[[agent-kustomize-after]]
878
+
=== Failures occurring within specific components after installation
879
+
880
+
If the installation is correct and all resources are deployed, but data is not flowing as expected (for example, you don't see anything on the *[Metrics Kubernetes] Cluster Overview* dashboard), check the following items:
881
+
882
+
. Check the status of the resources and ensure they are all `Running`:
883
+
+
884
+
[source,sh]
885
+
----
886
+
kubectl get pods -n kube-system | grep elastic
887
+
kubectl get pods -n kube-system | grep kube-state-metrics
888
+
----
889
+
890
+
. Describe the Pods if they are in a `Pending` state:
* Connectivity, authorization, or authentication issues when connecting to Elasticsearch:
913
+
+
914
+
Ensure that the API Key and the Elasticsearch destination endpoint used during the installation are correct, and that the endpoint is reachable from within the Pods.
915
+
+
916
+
In an already installed system, the API Key is stored in a `Secret` named `elastic-agent-creds-<hash>`, and the endpoint is configured in the `ConfigMap` `elastic-agent-configs-<hash>`.
917
+
918
+
* Only cluster-level metrics (provided by `kube-state-metrics`) are missing:
919
+
+
920
+
These metrics (`state_*`) are retrieved by one of the Pods acting as `leader` (as described in <<running-on-kubernetes-standalone>>), so in order to troubleshoot that situation:
921
+
+
922
+
. Check which Pod owns the leadership `lease` in the cluster, with:
923
+
+
924
+
[source,sh]
925
+
----
926
+
kubectl get lease -n kube-system elastic-agent-cluster-leader
927
+
----
928
+
+
929
+
. Check the logs of that Pod to see if there are errors when connecting to `kube-state-metrics` and if the `state_*` metrics are being sent.
930
+
+
931
+
One way to check if `state_*` metrics are being delivered to Elasticsearch is to check the log lines with the `"Non-zero metrics in the last 30s"` message and check the values of the `state_*` metricsets within the line, with something like:
If the previous command returns something like `"state_pod":{"events":213,"success":213}` for all `state_*` metricsets, it means the metrics are being delivered.
939
+
+
940
+
. As a last resort, if you believe none of the Pods is acting as a leader, you can try deleting the `lease`, thereby generating a new one:
# wait a few seconds and check for the lease again
946
+
kubectl get lease -n kube-system elastic-agent-cluster-leader
947
+
----
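The `state_*` check described in the steps above can be sketched as a small log filter. This is an illustrative sketch only: the function name `state_metricsets` is hypothetical, and it assumes the counters appear in the log line in the `"state_<name>":{...}` form shown in the sample output, with no nested braces.

```sh
# Read agent log lines on stdin and print every state_* metricset
# with its counters, taken from the "Non-zero metrics" lines only.
state_metricsets() {
  grep 'Non-zero metrics in the last 30s' |
    grep -oE '"state_[a-z_]+":\{[^}]*\}'
}
```

You could pipe the logs of the leader Pod into it, for example `kubectl logs -n kube-system <leader-pod-name> | state_metricsets`, where `<leader-pod-name>` is the holder of the `lease`.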
948
+
949
+
* Performance problems
950
+
+
951
+
Monitor the CPU and memory usage of the {agent} Pods and adjust the manifest requests and limits as needed. Refer to the <<scaling-on-kubernetes>> documentation for extra details about the required resources.
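As a minimal sketch of such monitoring, the following helper shows the current usage of the agent Pods. It assumes the Kubernetes Metrics Server is installed in the cluster (required by `kubectl top`) and that the Pods run in `kube-system` with `elastic` in their names, as in the earlier commands; the function name `agent_usage` is hypothetical.

```sh
# Show current CPU and memory usage of the agent Pods,
# keeping the header line for readability.
agent_usage() {
  kubectl top pods -n kube-system | grep -E 'NAME|elastic'
}
```

Comparing these values with the `resources.requests` and `resources.limits` set in the manifest indicates whether they need adjusting.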
952
+
953
+
Extra resources and information for troubleshooting {agent} on Kubernetes:
954
+
955
+
* <<agent-oom-k8s>>.
956
+
957
+
* https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes/elastic-agent-kustomize/default[{agent} Kustomize Templates] documentation and resources.
958
+
959
+
* Other examples and manifests to deploy https://github.com/elastic/elastic-agent/tree/main/deploy/kubernetes[{agent} on Kubernetes].