fix: Coordinator memory, validate only coordinator heap, use worker heap capacity #27162

Open

patdevinwilson wants to merge 6 commits into prestodb:master

Conversation
…ised capacity for limits

- LocalMemoryManager: coordinator-only validation when include-coordinator=false (only heap headroom validated; no reserved pool)
- LocalMemoryManagerProvider: wire coordinator-only path in ServerMainModule
- MemoryManagerConfig: query.use-worker-advertised-memory-for-limit (default true)
- ClusterMemoryManager: cap query limits by sum of worker general pool capacity
- Tests and admin docs for new config and validation
Contributor
Reviewer's Guide

Implements coordinator-only memory validation and worker-advertised memory caps so that a non-scheduling coordinator can run with a smaller heap while still enforcing safe query limits based on worker capacity.

Sequence diagram for worker-advertised memory limits during query processing

```mermaid
sequenceDiagram
actor User
participant Coordinator
participant ClusterMemoryManager
participant Worker1
participant Worker2
participant GeneralPool as GeneralPool_cluster
User->>Coordinator: submitQuery()
Coordinator->>ClusterMemoryManager: registerQuery(query)
loop periodicMemoryUpdates
Worker1->>ClusterMemoryManager: sendMemoryInfo(generalPoolMaxBytes_1)
Worker2->>ClusterMemoryManager: sendMemoryInfo(generalPoolMaxBytes_2)
ClusterMemoryManager->>GeneralPool: updateTotalDistributedBytes()
end
loop memoryManagementCycle
Coordinator->>ClusterMemoryManager: process(runningQueries)
ClusterMemoryManager->>ClusterMemoryManager: readConfigFlags(isWorkScheduledOnCoordinator,useWorkerAdvertisedMemoryForLimit)
alt coordinatorDoesNotScheduleWork_and_flagTrue
ClusterMemoryManager->>GeneralPool: getTotalDistributedBytes()
GeneralPool-->>ClusterMemoryManager: workerTotalCapacity
ClusterMemoryManager->>ClusterMemoryManager: effectiveMaxQueryTotalMemoryInBytes = min(configuredMaxQueryTotalMemoryInBytes,workerTotalCapacity)
ClusterMemoryManager->>ClusterMemoryManager: effectiveMaxQueryMemoryInBytes = min(configuredMaxQueryMemoryInBytes,effectiveMaxQueryTotalMemoryInBytes)
else flagFalse_or_coordinatorSchedulesWork
ClusterMemoryManager->>ClusterMemoryManager: effectiveMaxQueryMemoryInBytes = configuredMaxQueryMemoryInBytes
ClusterMemoryManager->>ClusterMemoryManager: effectiveMaxQueryTotalMemoryInBytes = configuredMaxQueryTotalMemoryInBytes
end
ClusterMemoryManager->>ClusterMemoryManager: userMemoryLimit = min(effectiveMaxQueryMemoryInBytes,sessionQueryMaxMemory)
ClusterMemoryManager->>ClusterMemoryManager: totalMemoryLimit = min(effectiveMaxQueryTotalMemoryInBytes,otherLimits)
ClusterMemoryManager-->>Coordinator: enforceLimits_or_failQuery()
end
```
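The capping branch in the loop above reduces to two `min` operations. A minimal sketch of that logic as a free-standing helper (`EffectiveLimits` and `computeEffectiveLimits` are illustrative names, not the PR's actual code):

```java
// Illustrative sketch of the effective-limit capping shown in the sequence diagram.
// EffectiveLimits and computeEffectiveLimits are hypothetical names.
final class EffectiveLimits
{
    final long userLimitBytes;
    final long totalLimitBytes;

    EffectiveLimits(long userLimitBytes, long totalLimitBytes)
    {
        this.userLimitBytes = userLimitBytes;
        this.totalLimitBytes = totalLimitBytes;
    }

    static EffectiveLimits computeEffectiveLimits(
            long configuredMaxQueryMemory,
            long configuredMaxQueryTotalMemory,
            long workerTotalCapacity,
            boolean coordinatorSchedulesWork,
            boolean useWorkerAdvertisedMemoryForLimit)
    {
        if (coordinatorSchedulesWork || !useWorkerAdvertisedMemoryForLimit) {
            // Flag off or coordinator schedules work: configured limits apply unchanged
            return new EffectiveLimits(configuredMaxQueryMemory, configuredMaxQueryTotalMemory);
        }
        // Cap the total limit by the sum of worker-advertised general pool capacity,
        // then keep the user limit no larger than the capped total limit
        long effectiveTotal = Math.min(configuredMaxQueryTotalMemory, workerTotalCapacity);
        long effectiveUser = Math.min(configuredMaxQueryMemory, effectiveTotal);
        return new EffectiveLimits(effectiveUser, effectiveTotal);
    }
}
```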
Class diagram for updated memory management components

```mermaid
classDiagram
class LocalMemoryManager {
+DataSize maxMemory
+Map_pools
+LocalMemoryManager(NodeMemoryConfig_config)
+LocalMemoryManager(NodeMemoryConfig_config,long_availableMemory)
+LocalMemoryManager(NodeMemoryConfig_config,long_availableMemory,boolean_useCoordinatorOnlyValidation)
-configureMemoryPools(NodeMemoryConfig_config,long_availableMemory,boolean_useCoordinatorOnlyValidation)
+MemoryInfo getInfo()
+static void validateHeapHeadroom(NodeMemoryConfig_config,long_availableMemory)
+static void validateCoordinatorHeapHeadroom(NodeMemoryConfig_config,long_availableMemory)
}
class LocalMemoryManagerProvider {
-NodeMemoryConfig nodeMemoryConfig
-ServerConfig serverConfig
-NodeSchedulerConfig nodeSchedulerConfig
+LocalMemoryManagerProvider(NodeMemoryConfig_nodeMemoryConfig,ServerConfig_serverConfig,NodeSchedulerConfig_nodeSchedulerConfig)
+LocalMemoryManager get()
}
class ClusterMemoryManager {
-boolean isWorkScheduledOnCoordinator
-boolean isBinaryTransportEnabled
-boolean useWorkerAdvertisedMemoryForLimit
-Map_pools
-long maxQueryMemoryInBytes
-long maxQueryTotalMemoryInBytes
+ClusterMemoryManager(MemoryManagerConfig_config,NodeSchedulerConfig_schedulerConfig,ServerConfig_serverConfig,QueryManagerConfig_queryManagerConfig,MemoryManagerConfig_memoryManagerConfig,FeaturesConfig_featuresConfig,NodeTaskMap_nodeTaskMap,MemoryPool_assigner,QueryIdGenerator_queryIdGenerator)
+void process(Iterable_runningQueries)
}
class MemoryManagerConfig {
-String lowMemoryKillerPolicy
-Duration killOnOutOfMemoryDelay
-boolean tableFinishOperatorMemoryTrackingEnabled
-boolean useWorkerAdvertisedMemoryForLimit
+boolean isUseWorkerAdvertisedMemoryForLimit()
+MemoryManagerConfig setUseWorkerAdvertisedMemoryForLimit(boolean_useWorkerAdvertisedMemoryForLimit)
}
class NodeMemoryConfig
class ServerConfig {
+boolean isCoordinator()
}
class NodeSchedulerConfig {
+boolean isIncludeCoordinator()
}
LocalMemoryManagerProvider ..> LocalMemoryManager : creates
LocalMemoryManagerProvider --> NodeMemoryConfig : uses
LocalMemoryManagerProvider --> ServerConfig : uses
LocalMemoryManagerProvider --> NodeSchedulerConfig : uses
ClusterMemoryManager --> MemoryManagerConfig : reads_limits
ClusterMemoryManager --> NodeSchedulerConfig : reads_includeCoordinator
ServerConfig ..> ClusterMemoryManager
NodeMemoryConfig ..> LocalMemoryManager
MemoryManagerConfig ..> ClusterMemoryManager
```
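Reading the class diagram, the provider is the decision point for which validation path a node uses. A rough sketch of how `LocalMemoryManagerProvider.get()` might wire this (the actual constructor arguments and heap sizing in the PR may differ):

```java
// Sketch of the provider wiring implied by the class diagram; details may differ from the PR.
public LocalMemoryManager get()
{
    // Coordinator-only validation applies when this node is a coordinator
    // that is excluded from scheduling (node-scheduler.include-coordinator=false)
    boolean coordinatorOnlyValidation =
            serverConfig.isCoordinator() && !nodeSchedulerConfig.isIncludeCoordinator();
    long availableMemory = Runtime.getRuntime().maxMemory();
    return new LocalMemoryManager(nodeMemoryConfig, availableMemory, coordinatorOnlyValidation);
}
```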
Contributor
Hey - I've found 1 issue, and left some high level feedback:

- The `LocalMemoryManager(NodeMemoryConfig, long, boolean)` constructor is annotated `@VisibleForTesting` but is now used in production via `LocalMemoryManagerProvider`; either remove the annotation or introduce a separate production-facing factory to avoid misleading the intent.
- In `ClusterMemoryManager.process`, when worker-advertised capacity reduces `effectiveMaxQueryMemoryInBytes` / `effectiveMaxQueryTotalMemoryInBytes`, consider emitting a debug log with the capped values and worker capacity to aid in diagnosing cluster-wide memory limit behavior.
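For the debug-log suggestion, something along these lines would do (a hypothetical helper; it assumes the airlift `Logger` and `DataSize.succinctBytes` already used throughout Presto):

```java
import com.facebook.airlift.log.Logger;

import static io.airlift.units.DataSize.succinctBytes;

// Hypothetical helper: log whenever worker-advertised capacity lowers a configured limit.
final class CapLogging
{
    private static final Logger log = Logger.get(CapLogging.class);

    private CapLogging() {}

    static void logIfCapped(String limitName, long configuredBytes, long effectiveBytes)
    {
        if (effectiveBytes < configuredBytes) {
            log.debug("%s capped by worker-advertised capacity: %s (configured %s)",
                    limitName, succinctBytes(effectiveBytes), succinctBytes(configuredBytes));
        }
    }
}
```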
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- The `LocalMemoryManager(NodeMemoryConfig, long, boolean)` constructor is annotated `@VisibleForTesting` but is now used in production via `LocalMemoryManagerProvider`; either remove the annotation or introduce a separate production-facing factory to avoid misleading the intent.
- In `ClusterMemoryManager.process`, when worker-advertised capacity reduces `effectiveMaxQueryMemoryInBytes` / `effectiveMaxQueryTotalMemoryInBytes`, consider emitting a debug log with the capped values and worker capacity to aid in diagnosing cluster-wide memory limit behavior.
## Individual Comments
### Comment 1
<location> `presto-main-base/src/test/java/com/facebook/presto/memory/TestMemoryManagerConfig.java:59-62` </location>
<code_context>
.put("query.max-total-memory", "3GB")
.put("query.soft-max-total-memory", "2GB")
.put("table-finish-operator-memory-tracking-enabled", "true")
+ .put("query.use-worker-advertised-memory-for-limit", "false")
.build();
</code_context>
<issue_to_address>
**issue (testing):** Missing behavioral tests for worker-advertised capacity capping of query limits in ClusterMemoryManager.
Config-level coverage is good, but we still lack tests that exercise this behavior in `ClusterMemoryManager`. Please add tests (in the existing `ClusterMemoryManager` test suite) for at least: (1) `useWorkerAdvertisedMemoryForLimit = true` with worker capacity smaller than configured limits, asserting effective user/total limits are capped; (2) capacity larger than configured limits, asserting configured limits still apply; and (3) `useWorkerAdvertisedMemoryForLimit = false` or coordinator scheduling work, asserting behavior is unchanged. This will verify the new flag and config interaction with cluster-level memory enforcement end‑to‑end.
</issue_to_address>
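One way to cover the three requested scenarios, sketched here against the hypothetical `computeEffectiveLimits` helper from the sequence-diagram section above (the real tests would drive `ClusterMemoryManager.process` with fake worker `MemoryInfo` instead):

```java
import org.testng.annotations.Test;

import static org.testng.Assert.assertEquals;

public class TestEffectiveLimits
{
    private static final long GB = 1L << 30;

    @Test
    public void testWorkerCapacityBelowConfiguredLimitsCaps()
    {
        // Worker capacity (8GB) is below both configured limits, so both are capped
        EffectiveLimits limits = EffectiveLimits.computeEffectiveLimits(10 * GB, 20 * GB, 8 * GB, false, true);
        assertEquals(limits.totalLimitBytes, 8 * GB);
        assertEquals(limits.userLimitBytes, 8 * GB);
    }

    @Test
    public void testWorkerCapacityAboveConfiguredLimitsKeepsConfigured()
    {
        EffectiveLimits limits = EffectiveLimits.computeEffectiveLimits(10 * GB, 20 * GB, 100 * GB, false, true);
        assertEquals(limits.totalLimitBytes, 20 * GB);
        assertEquals(limits.userLimitBytes, 10 * GB);
    }

    @Test
    public void testFlagOffOrCoordinatorSchedulingUnchanged()
    {
        EffectiveLimits flagOff = EffectiveLimits.computeEffectiveLimits(10 * GB, 20 * GB, 8 * GB, false, false);
        assertEquals(flagOff.totalLimitBytes, 20 * GB);

        EffectiveLimits scheduling = EffectiveLimits.computeEffectiveLimits(10 * GB, 20 * GB, 8 * GB, true, true);
        assertEquals(scheduling.userLimitBytes, 10 * GB);
    }
}
```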
Contributor

Please sign the Presto CLA as mentioned in this comment. Thanks!
steveburnett requested changes Feb 18, 2026
Contributor

steveburnett left a comment

Thanks for the documentation! Looks good, just a nit of phrasing.
Co-authored-by: Steve Burnett <burnett@pobox.com>
steveburnett previously approved these changes Feb 19, 2026

Contributor

steveburnett left a comment

LGTM! (docs)

Pulled updated branch, new local doc build, looks good. Thanks!
Force-pushed from b6a60e1 to e4471e9
…-advertised capacity for limits

- LazyOutputBuffer: no-op when delegate is null and state is terminal to avoid IllegalStateException 'Buffer has not been initialized' on teardown/races
- TestMetadata.testShowTables: use information_schema.tables instead of SHOW TABLES LIKE so the expected (Java) query runs reliably in native-vs-java tests
- Listener: return early when task revocable memory <= threshold (don't schedule)
- Visitor: only request revocation when !isMemoryRevokingRequested() to avoid stale revoking-requested flags (fixes TestMemoryRevokingScheduler.testTaskThresholdRevokingSchedulerImmediate)

Co-authored-by: Cursor <cursoragent@cursor.com>
Description
This PR improves how the coordinator handles memory when it does not run tasks (`node-scheduler.include-coordinator=false`):

Coordinator-only memory validation
The coordinator no longer validates that `query.max-memory-per-node` / `query.max-total-memory-per-node` fit in its own JVM heap. It only checks that `memory.heap-headroom-per-node` fits, and sizes a single general pool to (heap − headroom). Workers still enforce the full per-node limits. This allows a small-heap coordinator to start with the same config as large-heap workers.
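A minimal sketch of what the coordinator-only check amounts to (illustrative; the PR's `validateCoordinatorHeapHeadroom` takes a `NodeMemoryConfig`, and these helpers simplify it to raw byte counts):

```java
// Illustrative sketch of coordinator-only validation: only heap headroom must fit;
// per-node query limits are deliberately not checked on the coordinator.
static void validateCoordinatorHeapHeadroom(long heapHeadroomBytes, long availableMemoryBytes)
{
    if (heapHeadroomBytes >= availableMemoryBytes) {
        throw new IllegalArgumentException(
                "memory.heap-headroom-per-node must be less than the available heap memory");
    }
}

static long coordinatorGeneralPoolBytes(long heapHeadroomBytes, long availableMemoryBytes)
{
    // The single general pool gets everything above the headroom (heap − headroom)
    return availableMemoryBytes - heapHeadroomBytes;
}
```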
Worker-advertised capacity for query limits

When the coordinator does not schedule work, it can cap query memory limits using the sum of workers' advertised general pool capacity. Effective limits become `min(configured query.max-memory / query.max-total-memory, sum of worker capacities)`. This is controlled by `query.use-worker-advertised-memory-for-limit` (default `true`). An example configuration follows.
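For example, a small-heap, non-scheduling coordinator might now run with a config along these lines (all values here are illustrative):

```properties
# coordinator etc/config.properties (illustrative values)
coordinator=true
node-scheduler.include-coordinator=false
# worker-sized per-node limits no longer need to fit in the coordinator heap
query.max-memory-per-node=50GB
query.max-total-memory-per-node=60GB
# only this must fit in the coordinator's (small) heap
memory.heap-headroom-per-node=2GB
# cap effective query limits by summed worker general pool capacity (the default)
query.use-worker-advertised-memory-for-limit=true
```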
Changes:

- `LocalMemoryManager`: new `useCoordinatorOnlyValidation` flag; when true, only heap headroom is validated and only the general pool is created (no reserved pool).
- `LocalMemoryManagerProvider`: creates `LocalMemoryManager` with coordinator-only validation when `serverConfig.isCoordinator() && !nodeSchedulerConfig.isIncludeCoordinator()`.
- `ServerMainModule`: binds `LocalMemoryManager` via `LocalMemoryManagerProvider`.
- `MemoryManagerConfig`: new `query.use-worker-advertised-memory-for-limit` config (default `true`).
- `ClusterMemoryManager`: caps query limits by the sum of workers' advertised general pool `maxBytes` (from `MemoryInfo`).
Today, the coordinator runs the same `LocalMemoryManager` validation as workers, so `query.max-memory-per-node` (and thus `query.max-total-memory-per-node`) must fit in the coordinator's heap. With `node-scheduler.include-coordinator=false`, the coordinator does not run tasks but still had to pass that check, forcing the same per-node value for the whole cluster and blocking small-heap coordinators when workers use larger limits.

A better design is: the coordinator only validates its own (small) heap, and workers advertise capacity; the coordinator uses worker-advertised capacity (capped by config) for scheduling and OOM decisions. This PR implements that.
Impact
- New config: `query.use-worker-advertised-memory-for-limit` (boolean, default `true`). Documented in admin properties.
- With `node-scheduler.include-coordinator=false`, the coordinator can start with large `query.max-memory-per-node` / `query.max-total-memory-per-node` (for workers) as long as `memory.heap-headroom-per-node` fits in its heap. When the new config is true, effective query limits are capped by the sum of worker general pool capacity.
- Uses the `MemoryInfo` already gathered for pool updates.

Test Plan
- `TestLocalMemoryManager` – coordinator-only path allows large per-node config with a small heap and fails when headroom ≥ heap.
- `TestNodeMemoryConfig` – `validateCoordinatorHeapHeadroom` passes/fails as expected.
- `TestMemoryManagerConfig` – default and explicit mapping for `query.use-worker-advertised-memory-for-limit`.
- Manual: a coordinator with `node-scheduler.include-coordinator=false`, a small heap, and worker-sized `query.max-memory-per-node` starts successfully; with workers up, query limits are effectively capped by worker-advertised capacity when the new config is true.

Contributor checklist
- Documented `query.use-worker-advertised-memory-for-limit` (default `true`) in admin properties.

Release Notes
Summary by Sourcery
Adjust coordinator memory handling to support small-heap coordinators that do not schedule work, and cap query memory limits using worker-advertised capacity.