Skip to content

Conversation

@cachemeifyoucan
Copy link
Collaborator

Add mapped_file_region::sync(), equivalent to POSIX msync,
synchronizing written content to disk without unmapping the region.

This brings back D95494 originally authored by @dexonsmith.

Created using spr 1.3.6
@llvmbot
Copy link
Member

llvmbot commented Aug 14, 2025

@llvm/pr-subscribers-platform-windows

@llvm/pr-subscribers-llvm-support

Author: Steven Wu (cachemeifyoucan)

Changes

Add mapped_file_region::sync(), equivalent to POSIX msync,
synchronizing written content to disk without unmapping the region.

This brings back D95494 originally authored by @dexonsmith.


Full diff: https://github.com/llvm/llvm-project/pull/153632.diff

4 Files Affected:

  • (modified) llvm/include/llvm/Support/FileSystem.h (+3)
  • (modified) llvm/lib/Support/Unix/Path.inc (+6)
  • (modified) llvm/lib/Support/Windows/Path.inc (+6)
  • (modified) llvm/unittests/Support/Path.cpp (+26)
diff --git a/llvm/include/llvm/Support/FileSystem.h b/llvm/include/llvm/Support/FileSystem.h
index 31fedc37bf776..b66b3e66018ef 100644
--- a/llvm/include/llvm/Support/FileSystem.h
+++ b/llvm/include/llvm/Support/FileSystem.h
@@ -1342,6 +1342,9 @@ class mapped_file_region {
   LLVM_ABI size_t size() const;
   LLVM_ABI char *data() const;
 
+  /// Write changes to disk and synchronize. Equivalent to POSIX msync.
+  LLVM_ABI std::error_code sync() const;
+
   /// Get a const view of the data. Modifying this memory has undefined
   /// behavior.
   LLVM_ABI const char *const_data() const;
diff --git a/llvm/lib/Support/Unix/Path.inc b/llvm/lib/Support/Unix/Path.inc
index cc02cae40ec76..31fb1e8fe9b75 100644
--- a/llvm/lib/Support/Unix/Path.inc
+++ b/llvm/lib/Support/Unix/Path.inc
@@ -876,6 +876,12 @@ void mapped_file_region::unmapImpl() {
     ::munmap(Mapping, Size);
 }
 
+std::error_code mapped_file_region::sync() const {
+  if (int Res = ::msync(Mapping, Size, MS_SYNC))
+    return std::error_code(Res, std::generic_category());
+  return std::error_code();
+}
+
 void mapped_file_region::dontNeedImpl() {
   assert(Mode == mapped_file_region::readonly);
   if (!Mapping)
diff --git a/llvm/lib/Support/Windows/Path.inc b/llvm/lib/Support/Windows/Path.inc
index fdf9d540a6488..f3db7196ffdb8 100644
--- a/llvm/lib/Support/Windows/Path.inc
+++ b/llvm/lib/Support/Windows/Path.inc
@@ -1006,6 +1006,12 @@ void mapped_file_region::unmapImpl() {
 
 void mapped_file_region::dontNeedImpl() {}
 
+std::error_code mapped_file_region::sync() const {
+  if (::FlushViewOfFile(Mapping, 0))
+    return std::error_code();
+  return mapWindowsError(::GetLastError());
+}
+
 int mapped_file_region::alignment() {
   SYSTEM_INFO SysInfo;
   ::GetSystemInfo(&SysInfo);
diff --git a/llvm/unittests/Support/Path.cpp b/llvm/unittests/Support/Path.cpp
index 355aa6b9ade06..552c9f339166d 100644
--- a/llvm/unittests/Support/Path.cpp
+++ b/llvm/unittests/Support/Path.cpp
@@ -1471,6 +1471,32 @@ TEST_F(FileSystemTest, FileMapping) {
   ASSERT_NO_ERROR(fs::remove(TempPath));
 }
 
+TEST_F(FileSystemTest, FileMappingSync) {
+  // Create a temp file.
+  auto TempFileOrError = fs::TempFile::create(TestDirectory + "/test-%%%%");
+  ASSERT_TRUE((bool)TempFileOrError);
+  fs::TempFile File = std::move(*TempFileOrError);
+  StringRef Content("hello there");
+  ASSERT_NO_ERROR(
+      fs::resize_file_before_mapping_readwrite(File.FD, Content.size()));
+  {
+    // Map in the file and write some content.
+    std::error_code EC;
+    fs::mapped_file_region MFR(fs::convertFDToNativeFile(File.FD),
+                               fs::mapped_file_region::readwrite,
+                               Content.size(), 0, EC);
+    ASSERT_NO_ERROR(EC);
+    std::copy(Content.begin(), Content.end(), MFR.data());
+
+    // Synchronize and check the contents before unmapping.
+    MFR.sync();
+    auto Buffer = MemoryBuffer::getFile(File.TmpName);
+    ASSERT_TRUE((bool)Buffer);
+    ASSERT_EQ(Content, Buffer->get()->getBuffer());
+  }
+  ASSERT_FALSE((bool)File.discard());
+}
+
 TEST(Support, NormalizePath) {
   //                           Input,        Expected Win, Expected Posix
   using TestTuple = std::tuple<const char *, const char *, const char *>;

@cachemeifyoucan
Copy link
Collaborator Author

Link to old review: https://reviews.llvm.org/D95494

@cachemeifyoucan
Copy link
Collaborator Author

I did some simplification of the original PR. Will be good to have some windows expert to comment.

  • I am not sure if FlushFileBuffers is needed for sync
  • Update test to use TempFile

Some comments from origin PR:

(2) that the test be shown to fail without the msync call.
I don't think this is possible since there are OS will just flush in background regardless of explicit sync or not. The only thing that we care about is the file on disk is updated after sync (and the best we can do is to read through file I/O, but OS can still keep a cached version but that will be a bug on OS).

@compnerd
Copy link
Member

I did some simplification of the original PR. Will be good to have some windows expert to comment.

* I am not sure if `FlushFileBuffers` is needed for `sync`

* Update test to use `TempFile`

Some comments from origin PR:

(2) that the test be shown to fail without the msync call.
I don't think this is possible since there are OS will just flush in background regardless of explicit sync or not. The only thing that we care about is the file on disk is updated after sync (and the best we can do is to read through file I/O, but OS can still keep a cached version but that will be a bug on OS).

I think that it might be a good idea to call FlushFileBuffers. The FlushViewOfFile will flush the view to the page file, but it will not ensure that the content is written to disk. If we do a msync, that is roughly equivalent to FlushViewOfFile and FlushFileBuffers combined I believe.

@aganea
Copy link
Member

aganea commented Aug 14, 2025

I did some simplification of the original PR. Will be good to have some windows expert to comment.

* I am not sure if `FlushFileBuffers` is needed for `sync`

* Update test to use `TempFile`

Some comments from origin PR:

(2) that the test be shown to fail without the msync call.
I don't think this is possible since there are OS will just flush in background regardless of explicit sync or not. The only thing that we care about is the file on disk is updated after sync (and the best we can do is to read through file I/O, but OS can still keep a cached version but that will be a bug on OS).

I think that it might be a good idea to call FlushFileBuffers. The FlushViewOfFile will flush the view to the page file, but it will not ensure that the content is written to disk. If we do a msync, that is roughly equivalent to FlushViewOfFile and FlushFileBuffers combined I believe.

I agree with Saleem. If we absolutely want to mimic msync (if my understanding is correct), we must also call FlushFileBuffers. However that is a slow operation which is not needed in most cases, only very specific ones. I was gonna ask, what is purpose/usage for this new mapped_file_region::sync() API?

Created using spr 1.3.6
@aganea
Copy link
Member

aganea commented Aug 14, 2025

If we simply want the mmap to be flushed back to the file, which would be mostly likely to the OS cache in RAM, we can just use FlushViewOfFile. If we want to ensure that the data is physically written to the disk, which includes flushing the data present in the SSD internal buffers to the actual storage, we must also use FlushFileBuffers.

@cachemeifyoucan
Copy link
Collaborator Author

However that is a slow operation which is not needed in most cases, only very specific ones. I was gonna ask, what is purpose/usage for this new mapped_file_region::sync() API?

It is going to be used as part of CAS implementation I try to upstream (downstream example: https://github.com/swiftlang/llvm-project/blob/next/llvm/lib/CAS/MappedFileRegionBumpPtr.cpp#L232). It will not be used frequently, just to flush the file if all the users closes the database, to prevent data lost if there is a power lost afterwards. If we don't sync, it is more likely to get into situations that part of the file is up-to-date, but not the entire file.

@cachemeifyoucan
Copy link
Collaborator Author

If we simply want the mmap to be flushed back to the file, which would be mostly likely to the OS cache in RAM, we can just use FlushViewOfFile. If we want to ensure that the data is physically written to the disk, which includes flushing the data present in the SSD internal buffers to the actual storage, we must also use FlushFileBuffers.

I guess FlushFileBuffers is also needed here to match what msync does. Maybe I should document the API as slow and use with caution.

@cachemeifyoucan
Copy link
Collaborator Author

Hmm, the test is failing on Windows. Can you read the file through IO before memory is unmapped?

Created using spr 1.3.6
Created using spr 1.3.6
Created using spr 1.3.6
@aganea
Copy link
Member

aganea commented Aug 14, 2025

Hmm, the test is failing on Windows. Can you read the file through IO before memory is unmapped?

I think this is because of the fs::TempFile is open with a delete disposition (flag on the handle). If you do File.keep() that'll remove the "delete on close" flag, then you'll be able to MemoryBuffer::getFile afterwards (but you'd have to delete the file manually at the end).

@aganea
Copy link
Member

aganea commented Aug 14, 2025

The following passes the test:

TEST_F(FileSystemTest, FileMappingSync) {
  // Create a temp file.
  SmallString<0> TempPath(TestDirectory);
  sys::path::append(TempPath, "test-%%%%");
  auto TempFileOrError = fs::TempFile::create(TempPath);
  ASSERT_TRUE((bool)TempFileOrError);
  fs::TempFile File = std::move(*TempFileOrError);
  StringRef Content("hello there");
  ASSERT_NO_ERROR(
      fs::resize_file_before_mapping_readwrite(File.FD, Content.size()));
  std::string Name = File.TmpName;
  {
    // Map in the file and write some content.
    std::error_code EC;
    fs::mapped_file_region MFR(fs::convertFDToNativeFile(File.FD),
                               fs::mapped_file_region::readwrite,
                               Content.size(), 0, EC);
    ASSERT_NO_ERROR(EC);
    std::copy(Content.begin(), Content.end(), MFR.data());

    // Synchronize and check the contents before unmapping.
    MFR.sync();
    ASSERT_FALSE((bool)File.keep());
    auto Buffer = MemoryBuffer::getFile(Name);
    if (!Buffer)
      llvm::outs() << "failed to open \'" << Name
                   << "\': " << Buffer.getError().message() << "\n";
    ASSERT_TRUE((bool)Buffer);
    ASSERT_EQ(Content, Buffer->get()->getBuffer());
  }
  ASSERT_NO_ERROR(fs::remove(Name));
}

@cachemeifyoucan
Copy link
Collaborator Author

The following passes the test

Thanks. I don't know the deleteOnClose file will impact reading before the file is closed!

I will take your example and update the test with a tweak. I think the better check is to make sure file can be read with correct content before memory region is unmapped.

  SmallString<0> TempPath(TestDirectory);
  sys::path::append(TempPath, "test-%%%%");
  auto TempFileOrError = fs::TempFile::create(TempPath);
  ASSERT_TRUE((bool)TempFileOrError);
  fs::TempFile File = std::move(*TempFileOrError);
  StringRef Content("hello there");
  std::string FileName = File.TmpName;
  ASSERT_NO_ERROR(
      fs::resize_file_before_mapping_readwrite(File.FD, Content.size()));
  {
    // Map in the file and write some content.
    std::error_code EC;
    fs::mapped_file_region MFR(fs::convertFDToNativeFile(File.FD),
                               fs::mapped_file_region::readwrite,
                               Content.size(), 0, EC);

    // Keep the file so it can be read.
    ASSERT_FALSE((bool)File.keep());

    // Write content through mapped memory.
    ASSERT_NO_ERROR(EC);
    std::copy(Content.begin(), Content.end(), MFR.data());

    // Synchronize to file system.
    ASSERT_FALSE((bool)MFR.sync());

    // Check the file content using file IO APIs.
    auto Buffer = MemoryBuffer::getFile(FileName);
    ASSERT_TRUE((bool)Buffer);
    ASSERT_EQ(Content, Buffer->get()->getBuffer());
  }
  // Manually remove the test file.
  ASSERT_FALSE((bool)fs::remove(FileName));

@aganea
Copy link
Member

aganea commented Aug 14, 2025

The following passes the test

Thanks. I don't know the deleteOnClose file will impact reading before the file is closed!

I will take your example and update the test with a tweak. I think the better check is to make sure file can be read with correct content before memory region is unmapped.

  SmallString<0> TempPath(TestDirectory);
  sys::path::append(TempPath, "test-%%%%");
  auto TempFileOrError = fs::TempFile::create(TempPath);
  ASSERT_TRUE((bool)TempFileOrError);
  fs::TempFile File = std::move(*TempFileOrError);
  StringRef Content("hello there");
  std::string FileName = File.TmpName;
  ASSERT_NO_ERROR(
      fs::resize_file_before_mapping_readwrite(File.FD, Content.size()));
  {
    // Map in the file and write some content.
    std::error_code EC;
    fs::mapped_file_region MFR(fs::convertFDToNativeFile(File.FD),
                               fs::mapped_file_region::readwrite,
                               Content.size(), 0, EC);

    // Keep the file so it can be read.
    ASSERT_FALSE((bool)File.keep());

    // Write content through mapped memory.
    ASSERT_NO_ERROR(EC);
    std::copy(Content.begin(), Content.end(), MFR.data());

    // Synchronize to file system.
    ASSERT_FALSE((bool)MFR.sync());

    // Check the file content using file IO APIs.
    auto Buffer = MemoryBuffer::getFile(FileName);
    ASSERT_TRUE((bool)Buffer);
    ASSERT_EQ(Content, Buffer->get()->getBuffer());
  }
  // Manually remove the test file.
  ASSERT_FALSE((bool)fs::remove(FileName));

Yeah I did change my example after posting it! Your example passes the test.

Created using spr 1.3.6
@cachemeifyoucan
Copy link
Collaborator Author

Yeah I did change my example after posting it! Your example passes the test.

Thanks @aganea for the help! Updated.

@aganea
Copy link
Member

aganea commented Aug 14, 2025

Yeah I did change my example after posting it! Your example passes the test.

Thanks @aganea for the help! Updated.

You're very welcome! Let me know if you need further help for Windows.

@cachemeifyoucan cachemeifyoucan merged commit 7e46f5d into main Aug 15, 2025
9 checks passed
@cachemeifyoucan cachemeifyoucan deleted the users/cachemeifyoucan/spr/support-add-mapped_file_regionsync-equivalent-to-msync branch August 15, 2025 00:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants