diff --git a/troubleshoot/elasticsearch/all-shards-failed.md b/troubleshoot/elasticsearch/all-shards-failed.md
new file mode 100644
index 0000000000..c2d1f336cf
--- /dev/null
+++ b/troubleshoot/elasticsearch/all-shards-failed.md
@@ -0,0 +1,199 @@
+---
+applies_to:
+  stack:
+  deployment:
+    eck:
+    ess:
+    ece:
+    self:
+navigation_title: "Error: All shards failed"
+---
+
+# Fix error: All shards failed [all-shards-failed]
+
+```console
+Error: all shards failed
+```
+
+The `all shards failed` error indicates that {{es}} couldn't get a successful response from any of the shards involved in the query. Possible causes include shard allocation issues, misconfiguration, insufficient resources, or unsupported operations such as aggregating on text fields.
+
+## Unsupported operations on text fields
+
+The `all shards failed` error can occur when you try to sort or aggregate on `text` fields. These fields are designed for full-text search and don't support exact-value operations like sorting and aggregation.
+
+To fix this issue, use a `.keyword` subfield:
+
+```console
+GET my-index/_search
+{
+  "aggs": {
+    "names": {
+      "terms": {
+        "field": "name.keyword"
+      }
+    }
+  }
+}
+```
+
+If no `.keyword` subfield exists, define a [multi-field](elasticsearch://reference/elasticsearch/mapping-reference/field-data-types.md#types-multi-fields) mapping:
+
+```console
+PUT my-index
+{
+  "mappings": {
+    "properties": {
+      "name": {
+        "type": "text",
+        "fields": {
+          "keyword": {
+            "type": "keyword"
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+### Metric aggregations on text fields
+
+The `all shards failed` error can also occur when you use a metric aggregation on a text field. [Metric aggregations](elasticsearch://reference/aggregations/metrics.md) require numeric fields.
+
+If the field has a `.keyword` subfield, you can use a script to convert its value to a number at query time:
+
+```console
+GET my-index/_search
+{
+  "aggs": {
+    "total_cost": {
+      "sum": {
+        "script": {
+          "source": "Integer.parseInt(doc['cost.keyword'].value)"
+        }
+      }
+    }
+  }
+}
+```
+
+Or create a new index that maps the field to a numeric type, then reindex your data:
+
+```console
+PUT my-index
+{
+  "mappings": {
+    "properties": {
+      "cost": {
+        "type": "integer"
+      }
+    }
+  }
+}
+```
+
+## Failed shard recovery
+
+A shard failure during recovery can prevent successful queries.
+
+To identify the cause, check the cluster health:
+
+```console
+GET _cluster/health
+```
+
+As a last resort, you can delete the problematic index, which permanently removes its data.
+
+## Misused global aggregation
+
+[Global aggregations](elasticsearch://reference/aggregations/search-aggregations-bucket-global-aggregation.md) must be defined at the top level of the aggregations object. Nesting a `global` aggregation inside another bucket aggregation causes errors.
+
+To fix this issue, structure the query so that the `global` aggregation appears at the top level:
+
+```console
+GET my-index/_search
+{
+  "size": 0,
+  "aggs": {
+    "all_products": {
+      "global": {},
+      "aggs": {
+        "genres": {
+          "terms": {
+            "field": "cost"
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+## Reverse_nested usage errors
+
+Using a [`reverse_nested`](elasticsearch://reference/aggregations/search-aggregations-bucket-reverse-nested-aggregation.md) aggregation outside of a `nested` context causes errors.
+
+To fix this issue, structure the query so that the `reverse_nested` aggregation is inside a `nested` aggregation:
+
+```console
+GET my-index/_search
+{
+  "aggs": {
+    "comments": {
+      "nested": {
+        "path": "comments"
+      },
+      "aggs": {
+        "top_usernames": {
+          "terms": {
+            "field": "comments.username"
+          },
+          "aggs": {
+            "comment_issue": {
+              "reverse_nested": {},
+              "aggs": {
+                "top_tags": {
+                  "terms": {
+                    "field": "tags"
+                  }
+                }
+              }
+            }
+          }
+        }
+      }
+    }
+  }
+}
+```
+
+## Further troubleshooting
+
+Use the `_cat/shards` API to view shard status and troubleshoot further.
+
+```console
+GET _cat/shards
+```
+
+For a specific index:
+
+```console
+GET _cat/shards/my-index
+```
+
+Example output:
+
+```console-result
+my-index 5 p STARTED 0 283b 127.0.0.1 ziap
+my-index 5 r UNASSIGNED
+my-index 2 p STARTED 1 3.7kb 127.0.0.1 ziap
+my-index 2 r UNASSIGNED
+my-index 3 p STARTED 3 7.2kb 127.0.0.1 ziap
+my-index 3 r UNASSIGNED
+my-index 1 p STARTED 1 3.7kb 127.0.0.1 ziap
+my-index 1 r UNASSIGNED
+my-index 4 p STARTED 2 3.8kb 127.0.0.1 ziap
+my-index 4 r UNASSIGNED
+my-index 0 p STARTED 0 283b 127.0.0.1 ziap
+my-index 0 r UNASSIGNED
+```
diff --git a/troubleshoot/elasticsearch/errors.md b/troubleshoot/elasticsearch/errors.md
new file mode 100644
index 0000000000..29b4c58b18
--- /dev/null
+++ b/troubleshoot/elasticsearch/errors.md
@@ -0,0 +1,12 @@
+---
+navigation_title: Error reference
+---
+
+# Troubleshoot common errors in {{es}}
+
+Use the topics in this section to troubleshoot common errors in {{es}} deployments.
+
+* [](/troubleshoot/elasticsearch/all-shards-failed.md)
+* [](/troubleshoot/elasticsearch/failed-to-parse-field-of-type.md)
+* [](/troubleshoot/elasticsearch/unable-to-retrieve-node-fs-stats.md)
+* [](/troubleshoot/elasticsearch/unable-to-parse-response-body.md)
diff --git a/troubleshoot/elasticsearch/failed-to-parse-field-of-type.md b/troubleshoot/elasticsearch/failed-to-parse-field-of-type.md
new file mode 100644
index 0000000000..877aaad95c
--- /dev/null
+++ b/troubleshoot/elasticsearch/failed-to-parse-field-of-type.md
@@ -0,0 +1,58 @@
+---
+applies_to:
+  stack:
+  deployment:
+    eck:
+    ess:
+    ece:
+    self:
+navigation_title: "Error: Failed to parse field of type in document with id"
+---
+
+# Fix error: Failed to parse field [failed-to-parse-field-of-type]
+
+```console
+Error: failed to parse field [field] of type [type] in document with id [id]
+```
+
+This error occurs when you try to index a document, but one of the field values doesn't match the expected data type. {{es}} rejects the document when it encounters incompatible values, like a string in a numeric field or an invalid IP address.
+
+To fix this issue, make sure each field value matches the data type defined in the mapping.
+
+## Field types and mapping
+
+When no explicit mapping exists, {{es}} uses [dynamic mappings](../../manage-data/data-store/mapping/dynamic-field-mapping.md) to infer a field's type based on the **first value indexed**.
+
+For example, if you index:
+
+```console
+PUT test/_doc/1
+{
+  "ip_address": "179.152.62.82",
+  "boolean_field": "off"
+}
+```
+
+Without explicit mapping, {{es}} will treat `ip_address` and `boolean_field` as `text`, which might not be the intended result.
+
+To avoid this, define the mapping explicitly:
+
+```console
+PUT test
+{
+  "mappings": {
+    "properties": {
+      "ip_address": { "type": "ip" },
+      "boolean_field": { "type": "boolean" }
+    }
+  }
+}
+```
+
+To check the data type of the field causing the error, first get the mapping:
+
+```console
+GET your-index-name/_mapping
+```
+
+Make sure the incoming data matches the expected type. If not, you'll need to fix the data or update the mapping. If necessary, create a new index with the correct mapping and reindex your data.
\ No newline at end of file
diff --git a/troubleshoot/elasticsearch/security/token-invalid-expired.md b/troubleshoot/elasticsearch/security/token-invalid-expired.md
new file mode 100644
index 0000000000..a4bce9b4ba
--- /dev/null
+++ b/troubleshoot/elasticsearch/security/token-invalid-expired.md
@@ -0,0 +1,62 @@
+---
+applies_to:
+  stack:
+  deployment:
+    eck:
+    ess:
+    ece:
+    self:
+navigation_title: "Error: Token invalid or expired"
+---
+
+# Fix errors: Invalid token or token expired in {{es}} [token-invalid-expired]
+
+```console
+Error: token expired
+```
+
+```console
+Error: invalid token
+```
+
+These errors occur when {{es}} receives a request containing an invalid or expired token during authentication. They're typically caused by missing, incorrect, or outdated tokens. If an invalid or expired token is used, {{es}} rejects the request.
+
+## Invalid token
+
+{{es}} rejects requests with invalid authentication tokens. Common causes include:
+
+- The token is expired or revoked
+- The token format is incorrect or malformed
+- The `Authorization` header is missing or doesn't start with `Bearer`
+- The client or middleware failed to attach the token properly
+- Security settings in {{es}} are misconfigured
+
+To resolve this error:
+
+- Verify the token: make sure it's correctly formatted and current.
+- Check expiration: generate a new token if needed.
+- Inspect your client: confirm the token is sent in the `Authorization` header.
+- Review {{es}} settings and check that token authentication is enabled:
+
+  ```yaml
+  xpack.security.authc.token.enabled: true
+  ```
+
+- Use logs for details: {{es}} logs may provide context about the failure.
+
+
+## Token expired
+
+This error occurs when {{es}} receives a request containing an expired token during authentication.
+
+To resolve this issue:
+
+- Refresh the token: obtain a new token using your token refresh workflow.
+- Implement automatic token refresh: make sure your application refreshes tokens before they expire.
+- Avoid using expired tokens: do not reuse tokens after logout or expiration.
+- Adjust the token lifespan if needed: set a longer token expiration in `elasticsearch.yml` (the default is `20m` and the maximum is `1h`), balancing convenience against security needs:
+
+  ```yaml
+  xpack.security.authc.token.timeout: 30m
+  ```
+
\ No newline at end of file
diff --git a/troubleshoot/elasticsearch/unable-to-parse-response-body.md b/troubleshoot/elasticsearch/unable-to-parse-response-body.md
new file mode 100644
index 0000000000..5cf50b50bf
--- /dev/null
+++ b/troubleshoot/elasticsearch/unable-to-parse-response-body.md
@@ -0,0 +1,79 @@
+---
+applies_to:
+  stack:
+  deployment:
+    eck:
+    ess:
+    ece:
+    self:
+navigation_title: "Error: Unable to parse response body"
+---
+
+# Fix error: Unable to parse response body [unable-to-parse-response-body]
+
+```console
+Error: Unable to parse response body
+```
+
+This error occurs when an {{es}} client cannot parse a response body, possibly due to incorrect formatting or syntax. To resolve this issue, make sure the response body is in the correct format (usually JSON) and that all syntax is correct.
+
+If the error persists, start with these general steps:
+- Check the {{es}} logs for more detailed error messages.
+- Update {{es}} to the latest version.
+
+If you're using the high-level Java REST client, continue to the next section.
+
+## Java REST client
+
+:::{warning}
+The high-level Java REST client is deprecated. Use the [Java API client](https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/current/index.html) instead.
+:::
+
+
+This error can occur when the high-level Java REST client cannot parse the response received by the low-level {{es}} client.
+
+The REST high-level client acts as a wrapper around the low-level client. The low-level client ultimately performs the HTTP request to the cluster. If the response returned to the high-level client is malformed or does not comply with the expected schema, the client throws the `unable to parse response body` exception.
+
+Use the following sections to identify and fix the root cause of the error.
+
+### Version mismatch
+
+{{es}} does not guarantee compatibility between different major versions. Make sure the client version matches the {{es}} version. For more details, refer to the [{{es}} Java server compatibility policy](elasticsearch-java://reference/index.md#_elasticsearch_server_compatibility_policy).
+
+### Reverse proxy with path prefix
+
+If your cluster is behind a reverse proxy and you have set a path prefix to access it, make sure to configure the high-level client to include the path prefix so the proxy routes the request to the cluster correctly.
+
+For example, suppose you have an Nginx reverse proxy receiving connections at `mycompany.com:80`, and the `/elasticsearch` path prefix is set to proxy connections to a cluster running in your infrastructure. The `/elasticsearch` path prefix must be configured on the client you're using to access the cluster, not just on the host (`mycompany.com`).
+
+Use `setPathPrefix()` to set the path prefix:
+
+```java
+new RestHighLevelClient(
+    RestClient.builder(
+        new HttpHost("mycompany.com", 80, HttpHost.DEFAULT_SCHEME_NAME))
+    .setPathPrefix("/elasticsearch")
+);
+```
+
+For more context, refer to these Elastic community forum posts:
+
+- [RestHighLevelClient - Accessing an elastic http endpoint behind reverse proxy](https://discuss.elastic.co/t/resthighlevelclient-accessing-an-elastic-http-endpoint-behind-reverse-proxy/117306)
+- [Issue with HighLevelRestClient with the host = xyz.com:8080/elasticsearch](https://discuss.elastic.co/t/issue-with-highlevelrestclient-with-the-host-xyz-com-8080-elasticsearch/186384)
+
+### HTTP size limit
+
+The `unable to parse response body` error can occur when bulk indexing a large volume of data. By default, {{es}} has a maximum HTTP request body size of 100 MB. To raise this limit, increase the value of `http.max_content_length` in the {{es}} configuration file.
+
+```yaml
+http.max_content_length: 200mb
+```
+
+For an example of an HTTP size limit issue, refer to this Elastic community forum post: [Bulk indexing with java high level rest client gives error](https://discuss.elastic.co/t/bulk-indexing-with-java-high-level-rest-client-gives-error-unable-to-parse-response-body/161696)
+
+### Kubernetes with ingress controller
+
+If your {{es}} cluster runs on Kubernetes and is exposed through an ingress controller, check your ingress controller configuration. Misrouted or malformed responses from the controller can cause parsing errors in the client.
+
+For an example of incorrect SSL redirection in an ingress controller, refer to this Elastic community forum post: [RestHighLevelClient - Unable to parse response body](https://discuss.elastic.co/t/resthighlevelclient-unable-to-parse-response-body/240809)
+
diff --git a/troubleshoot/elasticsearch/unable-to-retrieve-node-fs-stats.md b/troubleshoot/elasticsearch/unable-to-retrieve-node-fs-stats.md
new file mode 100644
index 0000000000..462360d11b
--- /dev/null
+++ b/troubleshoot/elasticsearch/unable-to-retrieve-node-fs-stats.md
@@ -0,0 +1,99 @@
+---
+applies_to:
+  stack:
+  deployment:
+    eck:
+    ess:
+    ece:
+    self:
+navigation_title: "Error: Unable to retrieve node fs stats"
+---
+
+# Fix error: Unable to retrieve node fs stats [unable-to-retrieve-node-fs-stats]
+
+```console
+Error: unable to retrieve node fs stats
+```
+
+This error occurs when {{kib}} or another {{es}} client can't fetch version information from an {{es}} node. Without version information, the client can't confirm compatibility or proceed with requests.
+
+Possible causes include network issues, incorrect configuration, or unavailable nodes. To diagnose, first try these general actions:
+
+- Ensure that all nodes are up and running.
+- Check the network connectivity between the client and the nodes.
+- Verify configuration settings.
+
+If the issue persists, check the {{es}} logs for details, then continue with the tips below.
+
+## Check potential causes
+
+This error typically appears in the {{kib}} logs during startup. Because {{kib}} acts as a client to {{es}}, it requires access to several resources:
+
+- The cluster's host and port
+- Authentication credentials, if required
+- TLS settings, if applicable
+
+If {{kib}} can't reach the configured nodes, it can't verify version compatibility and logs the `unable to retrieve` error. Check these possible access issues:
+
+- One or more entries in `elasticsearch.hosts` are unreachable or misconfigured
+- The `KBN_PATH_CONF` environment variable points to a different config file
+- A firewall is blocking access between {{kib}} and {{es}}
+
+## Configuration locations
+
+Settings are defined in `kibana.yml`, usually located at `$KIBANA_HOME/config`. To use a different config directory, set the `KBN_PATH_CONF` environment variable:
+
+```bash
+KBN_PATH_CONF=/home/kibana/config ./bin/kibana
+```
+
+Check the relevant settings:
+
+```yaml
+elasticsearch.hosts: ["http://localhost:9200"]
+elasticsearch.username: "kibana"
+elasticsearch.password: "your_password"
+elasticsearch.ssl.certificateAuthorities: ["path/to/ca.crt"]
+```
+{{kib}} tries every endpoint in `elasticsearch.hosts`, so even one unreachable node can cause the error. Use `https` if your cluster requires encrypted communication.
+
+### Test connectivity
+
+Use `curl` to test the connection to each host in `elasticsearch.hosts`:
+
+```bash
+curl http://es01:9200/
+```
+
+If you're using TLS, try one of the following:
+
+```bash
+# Insecure test
+curl -u elastic -k https://es01:9200/
+
+# Secure test
+curl -u elastic --cacert ~/certs/ca/ca.crt https://es01:9200/
+```
+
+Example response:
+
+```json
+{
+  "name" : "node01",
+  "cluster_name" : "elasticsearch",
+  "cluster_uuid" : "fxP-R0FTRcmTl_AWs7-DiA",
+  "version" : {
+    "number" : "7.13.3",
+    "build_flavor" : "default",
+    "build_type" : "tar",
+    "build_hash" : "5d21bea28db1e89ecc1f66311ebdec9dc3aa7d64",
+    "build_date" : "2021-07-02T12:06:10.804015202Z",
+    "build_snapshot" : false,
+    "lucene_version" : "8.8.2"
+  },
+  "tagline" : "You Know, for Search"
+}
+```
+
+If you're still encountering issues, check the {{kib}} logs for more details and context.
+
diff --git a/troubleshoot/toc.yml b/troubleshoot/toc.yml
index cdfea4d361..5af5a50b1e 100644
--- a/troubleshoot/toc.yml
+++ b/troubleshoot/toc.yml
@@ -31,6 +31,7 @@ toc:
       - file: elasticsearch/increase-shard-limit.md
       - file: elasticsearch/increase-cluster-shard-limit.md
       - file: elasticsearch/corruption-troubleshooting.md
+
       - file: elasticsearch/capacity.md
         children:
           - file: elasticsearch/fix-data-node-out-of-disk.md
@@ -65,6 +66,14 @@ toc:
           - file: elasticsearch/security/trb-security-internalserver.md
           - file: elasticsearch/security/trb-security-setup.md
           - file: elasticsearch/security/trb-security-path.md
+          - file: elasticsearch/security/token-invalid-expired.md
+
+      - file: elasticsearch/errors.md
+        children:
+          - file: elasticsearch/all-shards-failed.md
+          - file: elasticsearch/failed-to-parse-field-of-type.md
+          - file: elasticsearch/unable-to-retrieve-node-fs-stats.md
+          - file: elasticsearch/unable-to-parse-response-body.md
       - file: elasticsearch/clients.md
       - file: elasticsearch/diagnostic.md
       - file: elasticsearch/more-topics.md