performance optimizations optimized

eedugon · eedugon · commit 0e6233ab05ef · 2025-03-22T19:55:36.000+01:00
diff --git a/deploy-manage/production-guidance/optimize-performance.md b/deploy-manage/production-guidance/optimize-performance.md
@@ -12,17 +12,16 @@ applies_to:
 
 # Elasticsearch performance optimizations [how-to]
 
+% investigate if any of the optimizations apply to serverless also.
 Elasticsearch’s default settings provide a good out-of-box experience for basic operations like full text search, highlighting, aggregations, and indexing.
 
-However, there are a number of optimizations you can make to improve performance for your use case.
+However, there are a number of optimizations you can make to improve performance for your use case. This section includes both deployment-level configuration suggestions and usage-level guidance to optimize the performance of your cluster.
 
-This section provides recommendations for various use cases.
+Use the following topics to explore relevant strategies:
 
 * [General recommendations](general-recommendations.md)
-* [Size your shards](optimize-performance/size-shards.md)
 * [Tune for indexing speed](optimize-performance/indexing-speed.md)
 * [Tune for search speed](optimize-performance/search-speed.md)
 * [Tune approximate kNN search](optimize-performance/approximate-knn-search.md)
 * [Tune for disk usage](optimize-performance/disk-usage.md)
-
-
+* [Size your shards](optimize-performance/size-shards.md)
diff --git a/deploy-manage/production-guidance/optimize-performance/disk-usage.md b/deploy-manage/production-guidance/optimize-performance/disk-usage.md
@@ -11,6 +11,7 @@ applies_to:
 
 # Disk usage [tune-for-disk-usage]
 
+This page provides strategies to reduce the storage footprint of your Elasticsearch indices. Disk usage is influenced by field mappings, index settings, document structure, and how you manage segments and shards. Use these recommendations to improve compression, eliminate unnecessary data, and optimize storage for your specific use case.
 
 ## Disable the features you do not need [_disable_the_features_you_do_not_need]
 
diff --git a/deploy-manage/production-guidance/optimize-performance/indexing-speed.md b/deploy-manage/production-guidance/optimize-performance/indexing-speed.md
@@ -11,6 +11,13 @@ applies_to:
 
 # Indexing speed [tune-for-indexing-speed]
 
+Elasticsearch offers a wide range of indexing performance optimizations, especially useful for high-throughput ingestion workloads. This page provides practical recommendations to help you maximize indexing speed, from bulk sizing and refresh intervals to hardware and thread management.
+
+::::{note}
+Indexing performance is also affected by your sharding and indexing strategies. Whether you’re indexing into a single index or hundreds in parallel, and how many shards each index has, can significantly influence indexing speed.
+
+Make sure to also consider your cluster’s shard count, index layout, and overall data distribution when tuning for indexing speed. Refer to [](./size-shards.md) for more details about sharding strategies and recommendations.
+::::
 
 ## Use bulk requests [_use_bulk_requests]
 
@@ -21,11 +28,12 @@ Bulk requests will yield much better performance than single-document index requ
 
 A single thread sending bulk requests is unlikely to be able to max out the indexing capacity of an Elasticsearch cluster. In order to use all resources of the cluster, you should send data from multiple threads or processes. In addition to making better use of the resources of the cluster, this should help reduce the cost of each fsync.
 
+On the other hand, sending data to a single shard from too many concurrent threads or processes can overwhelm the cluster. If the indexing load exceeds what {{es}} can handle, indexing itself becomes a bottleneck, and the cluster may start rejecting requests or slow down overall.
+
 Make sure to watch for `TOO_MANY_REQUESTS (429)` response codes (`EsRejectedExecutionException` with the Java client), which is the way that Elasticsearch tells you that it cannot keep up with the current indexing rate. When it happens, you should pause indexing a bit before trying again, ideally with randomized exponential backoff.
 
 Similarly to sizing bulk requests, only testing can tell what the optimal number of workers is. This can be tested by progressively increasing the number of workers until either I/O or CPU is saturated on the cluster.
 
-
 ## Unset or increase the refresh interval [_unset_or_increase_the_refresh_interval]
 
 The operation that consists of making changes visible to search - called a [refresh](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-refresh) - is costly, and calling it often while there is ongoing indexing activity can hurt indexing speed.
@@ -45,14 +53,28 @@ If `index.refresh_interval` is configured in the index settings, it may further
 
 
 ## Disable swapping [_disable_swapping_2]
-
+```yaml {applies_to}
+deployment:
+  self: all
+```
 You should make sure that the operating system is not swapping out the java process by [disabling swapping](../../deploy/self-managed/setup-configuration-memory.md).
 
-
 ## Give memory to the filesystem cache [_give_memory_to_the_filesystem_cache]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+```
 
-The filesystem cache will be used in order to buffer I/O operations. You should make sure to give at least half the memory of the machine running Elasticsearch to the filesystem cache.
+The filesystem cache is used to buffer I/O operations and plays a critical role in {{es}} performance. You should make sure to give at least half of the system's memory to the filesystem cache.
 
+By default, {{es}} automatically sets its [JVM heap size](/deploy-manage/deploy/self-managed/important-settings-configuration.md#heap-size-settings) to follow this best practice. However, in self-managed or {{eck}} deployments, you have the flexibility to allocate even more memory to the filesystem cache.
+
+While the filesystem cache primarily benefits search workloads, it can also improve indexing speed in certain scenarios—especially when indexing into many shards or performing frequent segment merges that involve reading existing data.
+
+::::{note}
+On Linux, the filesystem cache uses any memory not actively used by applications. To allocate memory to the cache, ensure that enough system memory remains available and is not consumed by {{es}} or other processes. 
+::::
 
 ## Use auto-generated ids [_use_auto_generated_ids]
 
@@ -65,8 +87,17 @@ If indexing is I/O-bound, consider increasing the size of the filesystem cache (
 
 Stripe your index across multiple SSDs by configuring a RAID 0 array. Remember that it will increase the risk of failure since the failure of any one SSD destroys the index. However this is typically the right tradeoff to make: optimize single shards for maximum performance, and then add replicas across different nodes so there’s redundancy for any node failures. You can also use [snapshot and restore](../../tools/snapshot-and-restore.md) to backup the index for further insurance.
 
+::::{note}
+In {{ech}} and {{ece}}, you can choose the underlying hardware by selecting different hardware profiles or deployment templates. Refer to [ECH>Change hardware](/deploy-manage/deploy/elastic-cloud/change-hardware.md) and [ECE deployment templates](/deploy-manage/deploy/cloud-enterprise/configure-deployment-templates.md) for more details.
+::::
 
 ### Local vs. remote storage [_local_vs_remote_storage]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+  ece: all
+```
 
 Directly-attached (local) storage generally performs better than remote storage because it is simpler to configure well and avoids communications overheads.
 
@@ -95,5 +126,4 @@ Within a single cluster, indexing and searching can compete for resources. By se
 
 ## Additional optimizations [_additional_optimizations]
 
-Many of the strategies outlined in [*Tune for disk usage*](disk-usage.md) also provide an improvement in the speed of indexing.
-
+Many of the strategies outlined in [Tune for disk usage](disk-usage.md) can also help improve indexing speed.
diff --git a/deploy-manage/production-guidance/optimize-performance/search-speed.md b/deploy-manage/production-guidance/optimize-performance/search-speed.md
@@ -11,13 +11,36 @@ applies_to:
 
 # Search speed [tune-for-search-speed]
 
+This page provides guidance on tuning {{es}} for faster search performance. While hardware and system-level settings play an important role, the structure of your documents and the design of your queries often have the biggest impact. Use these recommendations to optimize field mappings, caching behavior, and query design for high-throughput, low-latency search at scale.
+
+::::{note}
+Search performance in {{es}} depends on a combination of factors, including how expensive individual queries are, how many searches run in parallel, the number of indices and shards involved, and the overall sharding strategy and shard size. These variables influence how the system should be tuned—for example, optimizing for a small number of complex queries differs significantly from optimizing for many lightweight, concurrent searches.
+Make sure to also consider your cluster’s shard count, index layout, and overall data distribution when tuning for search speed. Refer to [](./size-shards.md) for more details about sharding strategies and recommendations.
+::::
+
 
 ## Give memory to the filesystem cache [_give_memory_to_the_filesystem_cache_2]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+```
+
+{{es}} relies heavily on the filesystem cache to make search fast. In general, you should make sure that at least half the available memory goes to the filesystem cache so that {{es}} can keep hot regions of the index in physical memory.
 
-Elasticsearch heavily relies on the filesystem cache in order to make search fast. In general, you should make sure that at least half the available memory goes to the filesystem cache so that Elasticsearch can keep hot regions of the index in physical memory.
+By default, {{es}} automatically sets its [JVM heap size](/deploy-manage/deploy/self-managed/important-settings-configuration.md#heap-size-settings) to follow this best practice. However, in self-managed or {{eck}} deployments, you have the flexibility to allocate even more memory to the filesystem cache, which can lead to performance improvements depending on your workload.
 
+::::{note}
+On Linux, the filesystem cache uses any memory not actively used by applications. To allocate memory to the cache, ensure that enough system memory remains available and is not consumed by {{es}} or other processes. 
+::::
 
 ## Avoid page cache thrashing by using modest readahead values on Linux [_avoid_page_cache_thrashing_by_using_modest_readahead_values_on_linux]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+  ece: all
+```
 
 Search can cause a lot of randomized read I/O. When the underlying block device has a high readahead value, there may be a lot of unnecessary read I/O done, especially when files are accessed using memory mapping (see [storage types](elasticsearch://reference/elasticsearch/index-settings/store.md#file-system)).
 
@@ -29,16 +52,25 @@ You can check the current value in `KiB` using `lsblk -o NAME,RA,MOUNTPOINT,TYPE
 `blockdev` expects values in 512 byte sectors whereas `lsblk` reports values in `KiB`. As an example, to temporarily set readahead to `128KiB` for `/dev/nvme0n1`, specify `blockdev --setra 256 /dev/nvme0n1`.
 ::::
 
-
+Disk readahead cannot be adjusted in {{ech}}, because it is controlled at the Linux kernel level. However, it can be modified on {{ece}}, Kubernetes, or self-managed nodes.
 
 ## Use faster hardware [search-use-faster-hardware]
 
 If your searches are I/O-bound, consider increasing the size of the filesystem cache (see above) or using faster storage. Each search involves a mix of sequential and random reads across multiple files, and there may be many searches running concurrently on each shard, so SSD drives tend to perform better than spinning disks.
 
 If your searches are CPU-bound, consider using a larger number of faster CPUs.
 
+::::{note}
+In {{ech}} and {{ece}}, you can choose the underlying hardware by selecting different hardware profiles or deployment templates. Refer to [ECH>Change hardware](/deploy-manage/deploy/elastic-cloud/change-hardware.md) and [ECE deployment templates](/deploy-manage/deploy/cloud-enterprise/configure-deployment-templates.md) for more details.
+::::
 
 ### Local vs. remote storage [_local_vs_remote_storage_2]
+```yaml {applies_to}
+deployment:
+  self: all
+  eck: all
+  ece: all
+```
 
 Directly-attached (local) storage generally performs better than remote storage because it is simpler to configure well and avoids communications overheads.
 
@@ -80,7 +112,6 @@ PUT movies
 }
 ```
 
-
 ## Pre-index data [_pre_index_data]
 
 You should leverage patterns in your queries to optimize the way data is indexed. For instance, if all your documents have a `price` field and most queries run [`range`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-range-aggregation.md) aggregations on a fixed list of ranges, you could make this aggregation faster by pre-indexing the ranges into the index and using a [`terms`](elasticsearch://reference/data-analysis/aggregations/search-aggregations-bucket-terms-aggregation.md) aggregations.
@@ -152,7 +183,6 @@ GET index/_search
 }
 ```
 
-
 ## Consider mapping identifiers as `keyword` [map-ids-as-keyword]
 
 Not all numeric data should be mapped as a [numeric](elasticsearch://reference/elasticsearch/mapping-reference/number.md) field data type. {{es}} optimizes numeric fields, such as `integer` or `long`, for [`range`](elasticsearch://reference/query-languages/query-dsl-range-query.md) queries. However, [`keyword`](elasticsearch://reference/elasticsearch/mapping-reference/keyword.md) fields are better for [`term`](elasticsearch://reference/query-languages/query-dsl-term-query.md) and other [term-level](elasticsearch://reference/query-languages/term-level-queries.md) queries.
@@ -309,7 +339,6 @@ Loading data into the filesystem cache eagerly on too many indices or too many f
 ::::
 
 
-
 ## Use index sorting to speed up conjunctions [_use_index_sorting_to_speed_up_conjunctions]
 
 [Index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) can be useful in order to make conjunctions faster at the cost of slightly slower indexing. Read more about it in the [index sorting documentation](elasticsearch://reference/elasticsearch/index-settings/sorting-conjunctions.md).
