diff --git a/docs/reference/index-modules/merge.asciidoc b/docs/reference/index-modules/merge.asciidoc
index b4c785446ca4a..517581e23dca7 100644
--- a/docs/reference/index-modules/merge.asciidoc
+++ b/docs/reference/index-modules/merge.asciidoc
@@ -14,18 +14,32 @@ resources between merging and other activities like search.
 [[merge-scheduling]]
 === Merge scheduling
-The merge scheduler (ConcurrentMergeScheduler) controls the execution of merge
-operations when they are needed. Merges run in separate threads, and when the
-maximum number of threads is reached, further merges will wait until a merge
-thread becomes available.
-
-The merge scheduler supports the following _dynamic_ setting:
-
-`index.merge.scheduler.max_thread_count`::
-
- The maximum number of threads on a single shard that may be merging at once.
- Defaults to
- `Math.max(1, Math.min(4, <> / 2))` which
- works well for a good solid-state-disk (SSD). If your index is on spinning
- platter drives instead, decrease this to 1.
+The merge scheduler controls the execution of merge operations when they are needed.
+Merges run on the dedicated `merge` thread pool.
+Smaller merges are prioritized over larger ones, across all shards on the node.
+Merges are disk IO throttled so that bursts, while merging activity is otherwise low, are smoothed out in order to not impact indexing throughput.
+There is no limit on the number of merges that can be enqueued for execution on the thread pool.
+However, beyond a certain per-shard limit, after merging is completely disk IO un-throttled, indexing for the shard will itself be throttled until merging catches up.
+
+The available disk space is periodically monitored, such that no new merge tasks are scheduled for execution when the available disk space is low.
+This is in order to prevent that the temporary disk space, which is required while merges are executed, completely fills up the disk space on the node.
+
+The merge scheduler supports the following *dynamic* settings:
+
+`index.merge.scheduler.max_thread_count`
+: The maximum number of threads on a **single** shard that may be merging at once. Defaults to `Math.max(1, Math.min(4, <> / 2))` which works well for a good solid-state-disk (SSD). If your index is on spinning platter drives instead, decrease this to 1.
+
+`indices.merge.disk.check_interval`
+: The time interval for checking the available disk space. Defaults to `5s`.
+
+`indices.merge.disk.watermark.high`
+: Controls the disk usage watermark, which defaults to `95%`, beyond which no merge tasks can start execution.
+The disk usage tally includes the estimated temporary disk space still required by all the currently executing merge tasks.
+Any merge task scheduled *before* the limit is reached continues execution, even if the limit is exceeded while executing
+(merge tasks are not aborted).
+
+`indices.merge.disk.watermark.high.max_headroom`
+: Controls the max headroom for the merge disk usage watermark, in case it is specified as percentage or ratio values.
+Defaults to `100GB` when `indices.merge.disk.watermark.high` is not explicitly set.
+This caps the amount of free disk space before merge scheduling is blocked.
diff --git a/docs/reference/modules/threadpool.asciidoc b/docs/reference/modules/threadpool.asciidoc
index 8ae8f59c22982..e01e139d46a6e 100644
--- a/docs/reference/modules/threadpool.asciidoc
+++ b/docs/reference/modules/threadpool.asciidoc
@@ -79,10 +79,13 @@ There are several thread pools, but the important ones include:
  default maximum size of `min(5, (`<>`) / 2)`.
+`merge`::
+ For [merge](https://www.elastic.co/guide/en/elasticsearch/reference/current/index-modules-merge.html) operations of all the shards on the node.
+ Thread pool type is `scaling` with a keep-alive of `5m` and a default maximum size of [`# of allocated processors`](#node.processors).
+
 `force_merge`::
- For <> operations.
- Thread pool type is `fixed` with a size of `max(1, (`<>`) / 8)` and an unbounded queue size.
+ For waiting on blocking [force merge](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-indices-forcemerge) operations.
+ Thread pool type is `fixed` with a size of `max(1, (`[`# of allocated processors`](#node.processors)`) / 8)` and an unbounded queue size.
 `management`::
  For cluster management.
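
Note on the settings documented in this patch: the following is a minimal Python sketch, not Elasticsearch's implementation, of how the default `index.merge.scheduler.max_thread_count` formula and a percentage watermark with a max headroom might be evaluated. All function and parameter names are hypothetical. The sketch assumes that the headroom caps the required free space at the smaller of the percentage-derived value and `max_headroom`, and that the temporary space still needed by running merges is subtracted from the free space before the check.

```python
GIB = 1024 ** 3


def default_max_thread_count(allocated_processors: int) -> int:
    """Mirror Math.max(1, Math.min(4, processors / 2)) using integer division."""
    return max(1, min(4, allocated_processors // 2))


def required_free_bytes(
    total_bytes: int,
    high_watermark_percent: int = 95,      # default of indices.merge.disk.watermark.high
    max_headroom_bytes: int = 100 * GIB,   # default max_headroom (assumed semantics)
) -> int:
    """Free space that must remain for new merges to start (assumption)."""
    # Free space implied by the percentage watermark, in exact integer math.
    by_percent = total_bytes * (100 - high_watermark_percent) // 100
    # The max headroom caps that amount on very large disks.
    return min(by_percent, max_headroom_bytes)


def can_start_merge(
    total_bytes: int,
    free_bytes: int,
    reserved_by_running_merges: int,
) -> bool:
    """Decide whether a new merge task may start (hypothetical helper)."""
    # The usage tally includes temporary space still needed by running merges.
    effective_free = free_bytes - reserved_by_running_merges
    return effective_free >= required_free_bytes(total_bytes)
```

Under these assumptions, a 1000 GiB disk needs 50 GiB free (5%) before a new merge may start, while a 10000 GiB disk needs only 100 GiB because the headroom cap applies. Merges that are already running are not modeled as abortable here, matching the statement that merge tasks are not aborted.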