admission: move yielding into ElasticCPUWorkHandle #159917
Conversation
    if !ok {
        tenantID = roachpb.SystemTenantID
    }
    return db.AdmissionPacerFactory.NewPacer(
I think AdmissionPacerFactory can be nil as well: https://github.com/cockroachdb/cockroach/blob/master/pkg/server/tenant.go#L222-L223
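For illustration, a defensive guard for that nil case might look like the sketch below. The interface shapes are simplified stand-ins, not the exact admission package API; the point is only that a nil factory must short-circuit before NewPacer is called.

```go
package admission

// Pacer and PacerFactory are simplified stand-ins for the real admission
// types; only the nil-handling pattern matters here.
type Pacer interface {
	Pace()
}

type PacerFactory interface {
	NewPacer() Pacer
}

// newPacerOrNil returns nil when no factory is configured, which can happen
// in SQL pods where elastic admission control is not hooked up (see
// pkg/server/tenant.go). Callers must treat a nil Pacer as a no-op.
func newPacerOrNil(f PacerFactory) Pacer {
	if f == nil {
		return nil
	}
	return f.NewPacer()
}
```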
Good catch.
Is this a concern? That is, were you trying to make yield work for SQL pods in serverless?
If yes, I'll look into fixing that old TODO.
I guess my version would yield in pods even if the rest of elastic AC was otherwise not hooked up. But I don't know if this is all that important, since perhaps the right answer is just to aim to hook up a real elastic granter in all SQL servers, including pods, and then be able to assume it is never nil?
But I don't think that needs to happen here.
To clarify: I’m 👍 merging as is.
Ack. I added a commit that creates a real elastic grant coordinator.
I've dropped this commit since I suspect it was the cause of some test failures that were hard to track down. I'll revive it later.
Force-pushed from b8d95e8 to fbef5ab.
sumeerbhola left a comment
TFTR!
@sumeerbhola made 3 comments.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dt, @golgeek, @tbg, and @williamchoe3).
Potential Bug(s) Detected
The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.
Next Steps:
Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary. After you review the findings, please tag the issue as follows:
Force-pushed from fbef5ab to 4ba4340.
Force-pushed from 6123dc8 to 95628b7.
Won't be able to review this until I'm back from PTO; feel free to merge this with @dt's LGTM.
ElasticCPUWorkHandle.Overlimit is expected to be called in a tight loop, so yielding there is conceptually the right place. More importantly, this will allow, in the future, KV work that is not holding latches to also yield.

As part of this change, elastic work that does not wish to wait in admission control queues (due to cluster settings) is now accounted for in the elastic tokens and in the admission.elastic_cpu_bypassed.utilization metric. One side effect of this accounting is that work that needs to wait in admission queues may have fewer tokens available to it, and may wait longer. This is considered acceptable since:
- Elastic work that bypasses queueing is still elastic work, and our overarching goal is to reduce impact to foreground work.
- Due to the default-on use of runtime.Yield, all elastic work yields, which allows the system to run at higher elastic CPU utilization without impacting the latency of foreground work.

Epic: none
Release note: None
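As a rough sketch of the pattern this commit message describes (simplified names, not the actual CockroachDB implementation; runtime.Gosched stands in for the runtime.Yield mentioned above):

```go
package admission

import (
	"runtime"
	"time"
)

// elasticCPUWorkHandle is a simplified stand-in for ElasticCPUWorkHandle.
type elasticCPUWorkHandle struct {
	allotted time.Duration
	used     func() time.Duration // CPU time consumed so far
}

// overLimit is meant to be called from a tight loop. With yielding moved in
// here, every elastic caller yields when over its allotment without wiring
// it up itself; in the future, KV work not holding latches can reuse the
// same path.
func (h *elasticCPUWorkHandle) overLimit() (bool, time.Duration) {
	diff := h.used() - h.allotted
	if diff > 0 {
		runtime.Gosched() // stand-in for runtime.Yield
		return true, diff
	}
	return false, diff
}
```

A caller's loop then stays trivial: do a unit of work, check overLimit, and on true stop to refill tokens, having already yielded the CPU.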
Force-pushed from 95628b7 to d627a86.
sumeerbhola left a comment
Is this because this is rearranging existing functionality? Or is this PR just very light on testing?
Both, in that it is rearrangement, and AFAIK the yield stuff does not have any existing automated testing (I'll rerun @dt's test https://cockroachlabs.slack.com/archives/C01SRKWGHG8/p1767716152730549?thread_ts=1766160955.465809&cid=C01SRKWGHG8).
@sumeerbhola made 4 comments.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @dt, @golgeek, @kyle-a-wong, @tbg, @williamchoe3, and @xinhaoz).
pkg/util/admission/elastic_cpu_work_handle.go line 118 at r4 (raw file):
    // TODO(irfansharif): Non-test callers use one or the other return value, not
    // both. Split this API?
    func (h *ElasticCPUWorkHandle) IsOverLimitAndPossiblyYield() (
Changed the name here.
pkg/util/admission/elastic_cpu_work_queue.go line 108 at r4 (raw file):
    e.metrics.PreWorkNanos.Inc(h.preWork.Nanoseconds())
    _, difference := h.overLimitInner()
Not using overLimitInner was a buglet even before this PR, in that it could return stale information if enough iterations hadn't happened. With this PR, we definitely don't want to yield here.
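To illustrate the buglet (assumed mechanics, sketched for exposition; the real handle differs): the tight-loop check only samples CPU time every so many calls to stay cheap, so its cached answer can lag, while the inner variant recomputes unconditionally and never yields.

```go
package admission

import "time"

const sampleInterval = 128 // hypothetical sampling cadence

type handle struct {
	calls    int
	allotted time.Duration
	cached   time.Duration        // difference as of the last sample
	used     func() time.Duration // actual CPU time consumed
}

// overLimit is cheap, but between samples it can report stale information,
// which is the buglet described above when too few iterations have happened.
func (h *handle) overLimit() (bool, time.Duration) {
	h.calls++
	if h.calls%sampleInterval == 0 {
		h.cached = h.used() - h.allotted
	}
	return h.cached > 0, h.cached
}

// overLimitInner always recomputes the difference directly, and it must not
// yield: it is used for accounting (e.g. pre-work nanos), not pacing.
func (h *handle) overLimitInner() (bool, time.Duration) {
	diff := h.used() - h.allotted
	return diff > 0, diff
}
```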
Force-pushed from 67b6f6c to d627a86.
sumeerbhola left a comment
@sumeerbhola made 1 comment.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @DrewKimball, @dt, @golgeek, @kyle-a-wong, @tbg, @williamchoe3, and @xinhaoz).
Done and verified no behavior change.
bors r=dt