configurable rejection window for > warn-result-size queries #807

Open
neil-harkins-sf wants to merge 1 commit into slack-19.0 from nh_v19_result_size_warn_lockout

@neil-harkins-sf

Description

When vttablet's --queryserver-config-warn-result-size threshold is exceeded, the existing behavior logs a warning but takes no preventive action. This allows repeated large-result queries to compound memory pressure (e.g. heap ballooning to >5GB from queries returning 200k rows).

This change adds --queryserver-config-warn-result-size-reject-time-seconds which, when set to a nonzero value, causes vttablet to temporarily block subsequent executions of the same query fingerprint after it triggers the warn threshold. Blocked queries receive FAILED_PRECONDITION, which vtgate will retry on a different tablet (spreading load) or return to the client if all tablets have the fingerprint blocked.

Implementation uses a sync.Map in QueryEngine keyed by normalized query template with unix-second expiry, avoiding plan cache invalidation. The setting is dynamically adjustable via /debug/env.

Related Issue(s)

Checklist

  • "Backport to:" labels have been added if this change should be back-ported to release branches
  • If this change is to be back-ported to previous releases, a justification is included in the PR description
  • Tests were added or are not required
  • Did the new or modified tests pass consistently locally and on CI?
  • Documentation was added or is not required

Deployment Notes

AI Disclosure

This PR was written by Claude Code under my guidance.

Co-Authored-By: Claude <svc-devxp-claude@slack-corp.com>
@neil-harkins-sf neil-harkins-sf requested a review from a team as a code owner February 26, 2026 22:14
@github-actions github-actions bot added this to the v19.0.7 milestone Feb 26, 2026