Pause writing data to disk for merges when disk almost full #88606

If a node exceeds the flood-stage disk watermark, we block further writes to its indices but allow merges to continue. Merges can temporarily consume a very large amount of disk space, more than enough to fill the gap between any reasonable flood-stage watermark and a completely full disk. When a node completely fills its disk, it effectively stops functioning.

We could pause merge-related writes in this situation, for instance by overriding `Store$StoreDirectory#createOutput` and adjusting the output's behaviour according to the supplied `IOContext`.

We probably don't want to do this for all writes, because (e.g.) a primary may need to refresh before it can relocate itself elsewhere, and because blocking arbitrary write threads seems like a recipe for deadlocks. Blocking merge threads seems OK though. We may also need to be sensitive to the size of the merge (see `IOContext.mergeInfo.estimatedMergeBytes` and `IOContext.flushInfo.estimatedSegmentSize`), since smaller merges may soon be triggered by the merge-on-refresh feature.
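To make the idea concrete, here is a minimal sketch of the decision logic only. It deliberately avoids Lucene and Elasticsearch types so that it stands alone: `MergeWriteGate`, `WriteKind`, and `minHeadroomBytes` are hypothetical names, with `WriteKind` standing in for the context carried by `IOContext` and `estimatedBytes` standing in for `estimatedMergeBytes`. It is not how `Store$StoreDirectory` works today.

```java
import java.util.function.LongSupplier;

/** Hypothetical sketch: decides whether a write should be paused. Not Elasticsearch code. */
final class MergeWriteGate {

    /** Illustrative stand-in for the kind of write described by Lucene's IOContext. */
    enum WriteKind { MERGE, FLUSH, OTHER }

    private final LongSupplier usableBytes; // e.g. () -> fileStore.getUsableSpace()
    private final long minHeadroomBytes;    // assumed node-local safety margin

    MergeWriteGate(LongSupplier usableBytes, long minHeadroomBytes) {
        this.usableBytes = usableBytes;
        this.minHeadroomBytes = minHeadroomBytes;
    }

    /**
     * Returns true if this write should wait. Only merges are ever paused:
     * flushes and other writes always proceed, so a primary can still refresh.
     */
    boolean shouldPause(WriteKind kind, long estimatedBytes) {
        if (kind != WriteKind.MERGE) {
            return false;
        }
        // Pause when finishing the merge would leave less than the headroom free.
        return usableBytes.getAsLong() - estimatedBytes < minHeadroomBytes;
    }

    /** Blocks the calling (merge) thread, polling until there is enough room. */
    void awaitRoom(WriteKind kind, long estimatedBytes, long pollMillis)
            throws InterruptedException {
        while (shouldPause(kind, estimatedBytes)) {
            Thread.sleep(pollMillis);
        }
    }
}
```

In a real implementation, a wrapper around `createOutput` could call something like `awaitRoom` before returning the output for a merge. Taking the estimated size into account lets small merges through while larger ones wait.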

It's unclear whether to do this based on the `read_only_allow_delete` block (which affects other nodes below the flood-stage watermark) or the actual disk usage on the node (which may not know the flood-stage watermark that the master is using).
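For the second option, a node-local check could be as simple as the sketch below, which uses only `java.nio.file.FileStore`. The class name `LocalDiskCheck` and the `watermarkFraction` parameter are hypothetical; the node would need some locally known threshold (for example `0.95`), which is exactly the piece it may not reliably have.

```java
import java.io.IOException;
import java.nio.file.FileStore;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of a node-local disk usage check. Not Elasticsearch code. */
final class LocalDiskCheck {

    /** Fraction of the file store in use, between 0 and 1. */
    static double usedFraction(long totalBytes, long usableBytes) {
        if (totalBytes <= 0) {
            return 0.0; // size unknown or unreported: treat as not full
        }
        return 1.0 - (double) usableBytes / totalBytes;
    }

    /** True if the store holding dataPath is at or above the given watermark. */
    static boolean aboveWatermark(Path dataPath, double watermarkFraction)
            throws IOException {
        FileStore store = Files.getFileStore(dataPath);
        return usedFraction(store.getTotalSpace(), store.getUsableSpace())
                >= watermarkFraction;
    }
}
```

Unlike the `read_only_allow_delete` block, this reflects only the local disk, so it would not pause merges on other nodes that are still below the watermark.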

---

NB we can also consider reducing the flood-stage max headroom once we have better protection against merges consuming all the remaining space.
