Skip to content

issue-1751: [Filestore] WriteBackCache: Replace ReadWriteRangeLock with pinning mechanism, move remaining code to TUtils#5297

Open
e673 wants to merge 4 commits intomainfrom
users/nasonov/wbc-refactor-10
Open

issue-1751: [Filestore] WriteBackCache: Replace ReadWriteRangeLock with pinning mechanism, move remaining code to TUtils#5297
e673 wants to merge 4 commits intomainfrom
users/nasonov/wbc-refactor-10

Conversation

@e673
Copy link
Collaborator

@e673 e673 commented Feb 23, 2026

Part of #5064

Replace TReadWriteRangeLock with pinning mechanism. Instead of blocking at overlapping read and write operations, the reader now sets a pin that prevents cached data from being evicted from cache. This makes possible to:

  • avoid copying cached data parts under mutex;
  • avoid using temporary buffer for storing cached data.

Additionally, some helper functions were moved to TUtils.

@e673 e673 changed the base branch from main to users/nasonov/wbc-refactor-9 February 23, 2026 17:35
@e673 e673 requested review from SvartMetal, debnatkh, neihar and qkrorlqr and removed request for debnatkh February 23, 2026 17:35
@github-actions
Copy link
Contributor

github-actions bot commented Feb 23, 2026

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo target: cloud/disk_manager/ (test time: 224s): all tests PASSED for commit eab0cbf.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
991 991 0 0 0 0 0

🟢 linux-x86_64-relwithdebinfo target: cloud/tasks/,cloud/storage/ (test time: 241s): all tests PASSED for commit eab0cbf.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
847 847 0 0 0 0 0

🟢 linux-x86_64-relwithdebinfo target: cloud/blockstore/ (test time: 1538s): all tests PASSED for commit eab0cbf.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
5812 5811 0 0 0 1 0

🟢 linux-x86_64-relwithdebinfo target: cloud/filestore/ (test time: 4740s): all tests PASSED for commit eab0cbf.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
3404 3404 0 0 0 0 0

@e673 e673 added filestore Add this label to run only cloud/filestore build and tests on PR asan Launch builds with address sanitizer along with regular build tsan Launch builds with thread sanitizer along with regular build msan Launch builds with memory sanitizer along with regular build ubsan Launch builds with undefined behaviour sanitizer along with regular build recheck Add this label to relaunch checks, it will be automatically removed and removed filestore Add this label to run only cloud/filestore build and tests on PR asan Launch builds with address sanitizer along with regular build tsan Launch builds with thread sanitizer along with regular build recheck Add this label to relaunch checks, it will be automatically removed msan Launch builds with memory sanitizer along with regular build ubsan Launch builds with undefined behaviour sanitizer along with regular build labels Feb 23, 2026
Base automatically changed from users/nasonov/wbc-refactor-9 to main February 26, 2026 15:39
…th pinning mechanism, move remaining code to TUtils
@e673 e673 force-pushed the users/nasonov/wbc-refactor-10 branch from eab0cbf to 9647acd Compare February 26, 2026 15:42

// Prevent from concurrent read and write requests with overlapping ranges
TReadWriteRangeLock RangeLock;
// Prevents flushed requests from being evicted from Cache
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let’s emphasize why this logic is needed

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added more info about why this is needed

Comment on lines 574 to 588
IFileStorePtr session,
ISchedulerPtr scheduler,
ITimerPtr timer,
IWriteBackCacheStatsPtr stats,
TLog log,
const TString& fileSystemId,
const TString& clientId,
const TString& filePath,
ui64 capacityBytes,
TDuration automaticFlushPeriod,
TDuration flushRetryPeriod,
ui32 maxWriteRequestSize,
ui32 maxWriteRequestsCount,
ui32 maxSumWriteRequestsSize,
bool zeroCopyWriteEnabled)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary diff

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted (I will replace it with DTO in the future)

auto& nodeState = Nodes.GetOrCreateNodeState(nodeId);

auto it = nodeState.CachedDataPins.find(pinId);
Y_ENSURE(
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's use Y_ABORT_UNLESS

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed

@github-actions
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo target: cloud/filestore/ (test time: 4739s): all tests PASSED for commit 9647acd.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
3406 3406 0 0 0 0 0

@e673 e673 added asan Launch builds with address sanitizer along with regular build tsan Launch builds with thread sanitizer along with regular build labels Feb 26, 2026
@e673 e673 added msan Launch builds with memory sanitizer along with regular build ubsan Launch builds with undefined behaviour sanitizer along with regular build labels Feb 26, 2026
@github-actions
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-release-asan target: cloud/filestore/ (test time: 3968s): all tests PASSED for commit e334109.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
2912 2912 0 0 0 0 0

@github-actions
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-relwithdebinfo target: cloud/filestore/ (test time: 4820s): all tests PASSED for commit e334109.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
3406 3406 0 0 0 0 0

@github-actions
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-release-msan target: cloud/filestore/ (test time: 3788s): all tests PASSED for commit e334109.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
2912 2912 0 0 0 0 0

@github-actions
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-release-tsan target: cloud/filestore/ (test time: 4075s): all tests PASSED for commit e334109.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
2902 2902 0 0 0 0 0

@github-actions
Copy link
Contributor

Note

This is an automated comment that will be appended during run.

🟢 linux-x86_64-release-ubsan target: cloud/filestore/ (test time: 3772s): all tests PASSED for commit e334109.

TESTS PASSED ERRORS FAILED FAILED BUILD SKIPPED MUTED?
2943 2943 0 0 0 0 0

@e673 e673 requested a review from SvartMetal February 27, 2026 10:56
SvartMetal
SvartMetal previously approved these changes Feb 27, 2026

// Prevent unflushed requests from being evicted after flush
const ui64 sequenceId =
nodeState.Cache.GetMinPendingOrUnflushedSequenceId(0);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

0 /* paramName */

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the default value and changed logic a bit.
Now all Get...SequenceId methods act the same.

UNIT_ASSERT_VALUES_EQUAL("", b.VisitUnflushedCachedRequests(1));
}

Y_UNIT_TEST(PinCachedData)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's add a test with multiple pins per 1 sequence id

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This scenario is already covered by the test
Added more comments inside

{
auto guard = Guard(QueuedOperations);

auto& nodeState = Nodes.GetOrCreateNodeState(nodeId);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

GetNodeState? and check that exist?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

asan Launch builds with address sanitizer along with regular build filestore Add this label to run only cloud/filestore build and tests on PR msan Launch builds with memory sanitizer along with regular build tsan Launch builds with thread sanitizer along with regular build ubsan Launch builds with undefined behaviour sanitizer along with regular build

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants