kvcoord: deflake BenchmarkTxnWriteBuffer #160648

miraradeva · 2026-01-07T18:28:10Z

The SendLocked subtest has been consistently flaky since the test was introduced. The suspicion is that the ratio of time spent between stop/start is extremely small compared to outside the timed section. This tends to cause the benchmark not to be able to reach the 1s target, as 1s of the timed part could equal 20 minutes in total outside of the timed section.

This commit removes the starting and stopping of the timer. The downside is that the benchmark now also measures the allocations as part of makeBuffer. But it also makes this subtest consistent with the second subtest (flushBufferAndSendBatch) in that respect.

Fixes: #151650

Release note: None

The `SendLocked` subtest has been consistently flaky since the test was introduced. The suspicion is that the ratio of time spent between stop/start is extremely small compared to outside the timed section. This tends to cause the benchmark not to be able to reach the 1s target, as 1s of the timed part could equal 20 minutes in total outside of the timed section. This commit removes the starting and stopping of the timer. The downside is that the benchmark now also measures the allocations as part of `makeBuffer`. But it also makes this subtest consistent with the second subtest (`flushBufferAndSendBatch`) in that respect. Fixes: cockroachdb#151650 Release note: None

cockroach-teamcity · 2026-01-07T18:28:27Z

This change is

miraradeva · 2026-01-09T16:50:11Z

bors r=stevendanna

158639: storepool: consider StoreLiveness status when determining suspect stores r=miraradeva a=arulajmani First 4 commits from #158629; only the last one is relevant for this PR. Feel free to hold off on reviewing until the thing merges. ---- This patch treats stores from which we recently (within the suspect duration) withdrew support in StoreLiveness as suspect. Doing so ensures we don't transfer leases to such stores, as shown by the changes to TestLeaseTransferAfterStoreLivenessSupportWithdrawn. Closes #158513 Release note: None 160642: authors: add amerani to authors r=amerani a=amerani Epic: None Release note: None 160647: kvnemesis: disable SQL ops in safety mode r=stevendanna a=miraradeva Currently, there is a single kvnemesis operation that executes via SQL: `ToggleGlobalReads`. We have seen this operation get stuck and cause the test to timeout under safety more. The expected behavior is that any stuck operations would time out, but it seems like there is a context cancelation propagation issue, most likely in lib/pq, but not confirmed. This commit disables `ToggleGlobalReads` in safety mode to reduce test failures. This change would also help confirm that this is the only operation susceptible to the hanging behavior. Fixes: #160293 Release note: None 160648: kvcoord: deflake BenchmarkTxnWriteBuffer r=stevendanna a=miraradeva The `SendLocked` subtest has been consistently flaky since the test was introduced. The suspicion is that the ratio of time spent between stop/start is extremely small compared to outside the timed section. This tends to cause the benchmark not to be able to reach the 1s target, as 1s of the timed part could equal 20 minutes in total outside of the timed section. This commit removes the starting and stopping of the timer. The downside is that the benchmark now also measures the allocations as part of `makeBuffer`. But it also makes this subtest consistent with the second subtest (`flushBufferAndSendBatch`) in that respect. Fixes: #151650 Release note: None 160778: authors: add eric.alton to authors r=eric-alton a=eric-alton Epic: None Release note: None Co-authored-by: Arul Ajmani <[email protected]> Co-authored-by: Alek Merani <[email protected]> Co-authored-by: Mira Radeva <[email protected]> Co-authored-by: eric-alton <[email protected]>

craig · 2026-01-09T17:36:13Z

Build failed (retrying...):

local_roachtest_fips

158639: storepool: consider StoreLiveness status when determining suspect stores r=miraradeva a=arulajmani First 4 commits from #158629; only the last one is relevant for this PR. Feel free to hold off on reviewing until the thing merges. ---- This patch treats stores from which we recently (within the suspect duration) withdrew support in StoreLiveness as suspect. Doing so ensures we don't transfer leases to such stores, as shown by the changes to TestLeaseTransferAfterStoreLivenessSupportWithdrawn. Closes #158513 Release note: None 160648: kvcoord: deflake BenchmarkTxnWriteBuffer r=stevendanna a=miraradeva The `SendLocked` subtest has been consistently flaky since the test was introduced. The suspicion is that the ratio of time spent between stop/start is extremely small compared to outside the timed section. This tends to cause the benchmark not to be able to reach the 1s target, as 1s of the timed part could equal 20 minutes in total outside of the timed section. This commit removes the starting and stopping of the timer. The downside is that the benchmark now also measures the allocations as part of `makeBuffer`. But it also makes this subtest consistent with the second subtest (`flushBufferAndSendBatch`) in that respect. Fixes: #151650 Release note: None 160746: scripts/gceworker: add -4 to recent dig patch r=yuzefovich a=yuzefovich Epic: None Release note: None 160778: authors: add eric.alton to authors r=eric-alton a=eric-alton Epic: None Release note: None 160783: storage: refactor TestMVCCHistories r=annrpom a=jbowens Refactor some of TestMVCCHistories to pull more of the logic of executing a run command out of the main Test func. This is in preparation for adding a variant that injects I/O errors into the iterator operations and ensures an error is bubbled up to the caller. Epic: PEBBLE-1158 Informs #139746. Release note: None Co-authored-by: Arul Ajmani <[email protected]> Co-authored-by: Mira Radeva <[email protected]> Co-authored-by: Yahor Yuzefovich <[email protected]> Co-authored-by: eric-alton <[email protected]> Co-authored-by: Jackson Owens <[email protected]>

craig · 2026-01-09T19:38:21Z

Build failed (retrying...):

local_roachtest_fips

craig · 2026-01-09T20:33:56Z

Build succeeded:

miraradeva requested a review from stevendanna January 7, 2026 18:28

miraradeva requested a review from a team as a code owner January 7, 2026 18:28

stevendanna approved these changes Jan 9, 2026

View reviewed changes

craig bot merged commit 8ae5c17 into cockroachdb:master Jan 9, 2026
34 of 35 checks passed

celeste-cockroachdb bot added the target-release-26.2.0 label Jan 9, 2026

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

kvcoord: deflake BenchmarkTxnWriteBuffer #160648

kvcoord: deflake BenchmarkTxnWriteBuffer #160648

miraradeva commented Jan 7, 2026

Uh oh!

cockroach-teamcity commented Jan 7, 2026

Uh oh!

miraradeva commented Jan 9, 2026

Uh oh!

craig bot commented Jan 9, 2026

Uh oh!

craig bot commented Jan 9, 2026

Uh oh!

craig bot commented Jan 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kvcoord: deflake BenchmarkTxnWriteBuffer #160648

kvcoord: deflake BenchmarkTxnWriteBuffer #160648

Conversation

miraradeva commented Jan 7, 2026

Uh oh!

cockroach-teamcity commented Jan 7, 2026

Uh oh!

miraradeva commented Jan 9, 2026

Uh oh!

craig bot commented Jan 9, 2026

Uh oh!

craig bot commented Jan 9, 2026

Uh oh!

craig bot commented Jan 9, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants