% investigate if any of the optimizations apply to serverless also.
Elasticsearch’s default settings provide a good out-of-box experience for basic operations like full text search, highlighting, aggregations, and indexing.

-However, there are a number of optimizations you can make to improve performance for your use case.
+However, there are a number of optimizations you can make to improve performance for your use case. This section includes both deployment-level configuration suggestions and usage-level guidance to optimize the performance of your cluster.

-This section provides recommendations for various use cases.
+Use the following topics to explore relevant strategies:
deploy-manage/production-guidance/optimize-performance/disk-usage.md
Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ applies_to:
# Disk usage [tune-for-disk-usage]

+This page provides strategies to reduce the storage footprint of your Elasticsearch indices. Disk usage is influenced by field mappings, index settings, document structure, and how you manage segments and shards. Use these recommendations to improve compression, eliminate unnecessary data, and optimize storage for your specific use case.

## Disable the features you do not need [_disable_the_features_you_do_not_need]
deploy-manage/production-guidance/optimize-performance/indexing-speed.md
Lines changed: 36 additions & 6 deletions
@@ -11,6 +11,13 @@ applies_to:
# Indexing speed [tune-for-indexing-speed]

+Elasticsearch offers a wide range of indexing performance optimizations, especially useful for high-throughput ingestion workloads. This page provides practical recommendations to help you maximize indexing speed, from bulk sizing and refresh intervals to hardware and thread management.
+
+::::{note}
+Indexing performance is also affected by your sharding and indexing strategies. Whether you’re indexing into a single index or hundreds in parallel, and how many shards each index has, can significantly influence indexing speed.
+
+Make sure to also consider your cluster’s shard count, index layout, and overall data distribution when tuning for indexing speed. Refer to [](./size-shards.md) for more details about sharding strategies and recommendations.
+::::

## Use bulk requests [_use_bulk_requests]
@@ -21,11 +28,12 @@ Bulk requests will yield much better performance than single-document index requ
A single thread sending bulk requests is unlikely to be able to max out the indexing capacity of an Elasticsearch cluster. In order to use all resources of the cluster, you should send data from multiple threads or processes. In addition to making better use of the resources of the cluster, this should help reduce the cost of each fsync.

+On the other hand, sending data to a single shard from too many concurrent threads or processes can overwhelm the cluster. If the indexing load exceeds what {{es}} can handle, it may become a bottleneck and start rejecting requests or slowing down overall performance.
+
Make sure to watch for `TOO_MANY_REQUESTS (429)` response codes (`EsRejectedExecutionException` with the Java client), which is the way that Elasticsearch tells you that it cannot keep up with the current indexing rate. When it happens, you should pause indexing a bit before trying again, ideally with randomized exponential backoff.
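
The retry advice above can be sketched in a few lines of Python. This is a minimal illustration, not client code from the Elasticsearch libraries: `send_bulk` is a hypothetical callable that submits one batch, and `TooManyRequests` is a hypothetical exception standing in for however your client surfaces a 429 response. The default delays are assumptions to tune for your workload.

```python
import random
import time


class TooManyRequests(Exception):
    """Hypothetical error raised by `send_bulk` when the cluster answers HTTP 429."""


def backoff_delays(base=0.5, cap=30.0, attempts=6, rng=random.random):
    """Yield randomized exponential delays, in seconds."""
    for attempt in range(attempts):
        # The ceiling doubles on each attempt but never exceeds `cap`.
        ceiling = min(cap, base * (2 ** attempt))
        # Randomizing within [0, ceiling) keeps workers from retrying in lockstep.
        yield rng() * ceiling


def bulk_with_backoff(send_bulk, batch, sleep=time.sleep, **backoff):
    """Call `send_bulk(batch)`, pausing and retrying whenever it raises TooManyRequests."""
    for delay in backoff_delays(**backoff):
        try:
            return send_bulk(batch)
        except TooManyRequests:
            sleep(delay)  # pause indexing before trying again
    # One final attempt; a rejection here propagates to the caller.
    return send_bulk(batch)
```

Passing `sleep` and `rng` as parameters is only a convenience for testing the retry logic without real waiting.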

Similarly to sizing bulk requests, only testing can tell what the optimal number of workers is. This can be tested by progressively increasing the number of workers until either I/O or CPU is saturated on the cluster.
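
One way to structure such a test is sketched below, assuming you already have a benchmark function. `measure_throughput` is a hypothetical callable that runs an indexing test with the given number of workers and returns documents per second. The sketch treats a throughput plateau as a stand-in for saturation, which is an assumption: you should still confirm on the cluster that I/O or CPU is actually the limiting resource.

```python
def ramp_workers(measure_throughput, max_workers=64, min_gain=0.05):
    """Double the worker count until throughput stops improving by `min_gain`."""
    best_workers, best_rate = 1, measure_throughput(1)
    workers = 2
    while workers <= max_workers:
        rate = measure_throughput(workers)
        # Stop once the extra workers no longer buy a meaningful improvement.
        if rate < best_rate * (1 + min_gain):
            break
        best_workers, best_rate = workers, rate
        workers *= 2
    return best_workers
```

Doubling keeps the number of benchmark runs small. A finer search between the last two counts can follow if the exact optimum matters.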

-
## Unset or increase the refresh interval [_unset_or_increase_the_refresh_interval]

The operation that consists of making changes visible to search - called a [refresh](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-refresh) - is costly, and calling it often while there is ongoing indexing activity can hurt indexing speed.
@@ -45,14 +53,28 @@ If `index.refresh_interval` is configured in the index settings, it may further
## Disable swapping [_disable_swapping_2]
-
+```yaml {applies_to}
+deployment:
+  self: all
+```
You should make sure that the operating system is not swapping out the Java process by [disabling swapping](../../deploy/self-managed/setup-configuration-memory.md).

-
## Give memory to the filesystem cache [_give_memory_to_the_filesystem_cache]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+```

-The filesystem cache will be used in order to buffer I/O operations. You should make sure to give at least half the memory of the machine running Elasticsearch to the filesystem cache.
+The filesystem cache is used to buffer I/O operations and plays a critical role in {{es}} performance. You should make sure to give at least half of the system's memory to the filesystem cache.

+By default, {{es}} automatically sets its [JVM heap size](/deploy-manage/deploy/self-managed/important-settings-configuration.md#heap-size-settings) to follow this best practice. However, in self-managed or {{eck}} deployments, you have the flexibility to allocate even more memory to the filesystem cache.
+
+While the filesystem cache primarily benefits search workloads, it can also improve indexing speed in certain scenarios, especially when indexing into many shards or performing frequent segment merges that involve reading existing data.
+
+::::{note}
+On Linux, the filesystem cache uses any memory not actively used by applications. To allocate memory to the cache, ensure that enough system memory remains available and is not consumed by {{es}} or other processes.
+::::

## Use auto-generated ids [_use_auto_generated_ids]
@@ -65,8 +87,17 @@ If indexing is I/O-bound, consider increasing the size of the filesystem cache (
Stripe your index across multiple SSDs by configuring a RAID 0 array. Remember that it will increase the risk of failure since the failure of any one SSD destroys the index. However, this is typically the right tradeoff to make: optimize single shards for maximum performance, and then add replicas across different nodes so there’s redundancy for any node failures. You can also use [snapshot and restore](../../tools/snapshot-and-restore.md) to back up the index for further insurance.

+::::{note}
+In {{ech}} and {{ece}}, you can choose the underlying hardware by selecting different hardware profiles or deployment templates. Refer to [ECH>Change hardware](/deploy-manage/deploy/elastic-cloud/change-hardware.md) and [ECE deployment templates](/deploy-manage/deploy/cloud-enterprise/configure-deployment-templates.md) for more details.
+::::

### Local vs. remote storage [_local_vs_remote_storage]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+  ece: all
+```

Directly-attached (local) storage generally performs better than remote storage because it is simpler to configure well and avoids communications overheads.
@@ -95,5 +126,4 @@ Within a single cluster, indexing and searching can compete for resources. By se
deploy-manage/production-guidance/optimize-performance/search-speed.md
Lines changed: 34 additions & 5 deletions
@@ -11,13 +11,36 @@ applies_to:
# Search speed [tune-for-search-speed]

+This page provides guidance on tuning {{es}} for faster search performance. While hardware and system-level settings play an important role, the structure of your documents and the design of your queries often have the biggest impact. Use these recommendations to optimize field mappings, caching behavior, and query design for high-throughput, low-latency search at scale.
+
+::::{note}
+Search performance in {{es}} depends on a combination of factors, including how expensive individual queries are, how many searches run in parallel, the number of indices and shards involved, and the overall sharding strategy and shard size. These variables influence how the system should be tuned. For example, optimizing for a small number of complex queries differs significantly from optimizing for many lightweight, concurrent searches.
+Make sure to also consider your cluster’s shard count, index layout, and overall data distribution when tuning for search speed. Refer to [](./size-shards.md) for more details about sharding strategies and recommendations.
+::::
+

## Give memory to the filesystem cache [_give_memory_to_the_filesystem_cache_2]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+```
+
+{{es}} heavily relies on the filesystem cache in order to make search fast. In general, you should make sure that at least half the available memory goes to the filesystem cache so that {{es}} can keep hot regions of the index in physical memory.

-Elasticsearch heavily relies on the filesystem cache in order to make search fast. In general, you should make sure that at least half the available memory goes to the filesystem cache so that Elasticsearch can keep hot regions of the index in physical memory.
+By default, {{es}} automatically sets its [JVM heap size](/deploy-manage/deploy/self-managed/important-settings-configuration.md#heap-size-settings) to follow this best practice. However, in self-managed or {{eck}} deployments, you have the flexibility to allocate even more memory to the filesystem cache, which can lead to performance improvements depending on your workload.

+::::{note}
+On Linux, the filesystem cache uses any memory not actively used by applications. To allocate memory to the cache, ensure that enough system memory remains available and is not consumed by {{es}} or other processes.
+::::

## Avoid page cache thrashing by using modest readahead values on Linux [_avoid_page_cache_thrashing_by_using_modest_readahead_values_on_linux]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+  ece: all
+```

Search can cause a lot of randomized read I/O. When the underlying block device has a high readahead value, there may be a lot of unnecessary read I/O done, especially when files are accessed using memory mapping (see [storage types](elasticsearch://reference/elasticsearch/index-settings/store.md#file-system)).
@@ -29,16 +52,25 @@ You can check the current value in `KiB` using `lsblk -o NAME,RA,MOUNTPOINT,TYPE
`blockdev` expects values in 512-byte sectors whereas `lsblk` reports values in `KiB`. As an example, to temporarily set readahead to `128KiB` for `/dev/nvme0n1`, specify `blockdev --setra 256 /dev/nvme0n1`.
::::
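
The unit conversion in the note above is easy to get wrong by a factor of two, so a small helper may be useful. The sketch below only builds the command string and does not run it. The function names are illustrative, not part of any Elastic or Linux tooling.

```python
SECTOR_BYTES = 512  # unit expected by `blockdev --setra`


def readahead_kib_to_sectors(kib: int) -> int:
    """Convert a readahead size in KiB (as reported by lsblk) to 512-byte sectors."""
    if kib < 0:
        raise ValueError("readahead size cannot be negative")
    return kib * 1024 // SECTOR_BYTES


def setra_command(device: str, kib: int) -> str:
    """Build, but do not execute, the blockdev command for the given device."""
    return f"blockdev --setra {readahead_kib_to_sectors(kib)} {device}"
```

For the example in the note, `setra_command("/dev/nvme0n1", 128)` yields `blockdev --setra 256 /dev/nvme0n1`.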

-
+The disk readahead cannot be adjusted in {{ech}}, as it is controlled at the Linux kernel level. However, it can be modified in {{ece}}, Kubernetes, or self-managed nodes.

## Use faster hardware [search-use-faster-hardware]

If your searches are I/O-bound, consider increasing the size of the filesystem cache (see above) or using faster storage. Each search involves a mix of sequential and random reads across multiple files, and there may be many searches running concurrently on each shard, so SSD drives tend to perform better than spinning disks.

If your searches are CPU-bound, consider using a larger number of faster CPUs.

+::::{note}
+In {{ech}} and {{ece}}, you can choose the underlying hardware by selecting different hardware profiles or deployment templates. Refer to [ECH>Change hardware](/deploy-manage/deploy/elastic-cloud/change-hardware.md) and [ECE deployment templates](/deploy-manage/deploy/cloud-enterprise/configure-deployment-templates.md) for more details.
+::::

### Local vs. remote storage [_local_vs_remote_storage_2]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+  ece: all
+```

Directly-attached (local) storage generally performs better than remote storage because it is simpler to configure well and avoids communications overheads.
@@ -80,7 +112,6 @@ PUT movies
}
```

-
## Pre-index data [_pre_index_data]

You should leverage patterns in your queries to optimize the way data is indexed. For instance, if all your documents have a `price` field and most queries run [`range`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-range-aggregation.md) aggregations on a fixed list of ranges, you could make this aggregation faster by pre-indexing the ranges into the index and using a [`terms`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) aggregation.
@@ -152,7 +183,6 @@ GET index/_search
}
```

-
## Consider mapping identifiers as `keyword` [map-ids-as-keyword]

Not all numeric data should be mapped as a [numeric](elasticsearch://reference/elasticsearch/mapping-reference/number.md) field data type. {{es}} optimizes numeric fields, such as `integer` or `long`, for [`range`](elasticsearch://reference/query-languages/query-dsl-range-query.md) queries. However, [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) fields are better for [`term`](elasticsearch://reference/query-languages/query-dsl-term-query.md) and other [term-level](elasticsearch://reference/query-languages/term-level-queries.md) queries.
@@ -309,7 +339,6 @@ Loading data into the filesystem cache eagerly on too many indices or too many f
::::


-
## Use index sorting to speed up conjunctions [_use_index_sorting_to_speed_up_conjunctions]

[Index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) can be useful in order to make conjunctions faster at the cost of slightly slower indexing. Read more about it in the [index sorting documentation](elasticsearch://reference/elasticsearch/index-settings/sorting-conjunctions.md).