-
Notifications
You must be signed in to change notification settings - Fork 2
linux optimization: frtuncate() while mapped #5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
WalkthroughThe changes update the Linux implementation of the Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant ResizableMappedFile
participant OS/FileSystem
User->>ResizableMappedFile: construct(path, maxSize)
ResizableMappedFile->>OS/FileSystem: open file, map full maxSize (MAP_SHARED)
ResizableMappedFile->>ResizableMappedFile: store current file size in m_size
User->>ResizableMappedFile: resize(newSize)
ResizableMappedFile->>OS/FileSystem: truncate file to newSize
ResizableMappedFile->>ResizableMappedFile: update m_size
User->>ResizableMappedFile: data(), size(), capacity()
ResizableMappedFile->>User: return pointer, m_size, mapped size
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~15–20 minutes Possibly related PRs
Poem
Note ⚡️ Unit Test Generation is now available in beta!Learn more here, or try it out under "Finishing Touches" below. 📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (3)
🧰 Additional context used🪛 LanguageToolREADME.md[grammar] ~96-~96: Use correct spacing (QB_NEW_EN_OTHER_ERROR_IDS_5) [grammar] ~97-~97: Use correct spacing (QB_NEW_EN_OTHER_ERROR_IDS_5) [grammar] ~98-~98: Use correct spacing (QB_NEW_EN_OTHER_ERROR_IDS_5) 🔇 Additional comments (9)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (2)
include/decodeless/detail/mappedfile_linux.hpp(1 hunks)test/src/mappedfile.cpp(5 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build (windows-latest, Release, cl)
🔇 Additional comments (6)
test/src/mappedfile.cpp (3)
157-176: Well-designed test for overmapping behaviorThis test effectively validates the optimization where files can be mapped larger than their current size and then grown with
ftruncate()without remapping. The test correctly verifies that writes to the grown region persist.
454-467: Good edge case test for empty file creationThis test properly verifies that resizable files can be created with zero initial size, which is an important edge case for the new implementation.
469-479: Effective test for clearing existing file contentThe test validates that resizing an existing file to zero properly truncates it, which is important behavior to verify.
include/decodeless/detail/mappedfile_linux.hpp (3)
212-212: Good null-safety for empty filesThe data() method correctly returns nullptr for empty files, preventing potential issues with zero-sized mappings.
215-218: Excellent optimization eliminating remap overheadThe simplified resize() implementation that only calls ftruncate() without remapping is the core optimization mentioned in the PR objectives. This approach leverages the fact that the entire range is already mapped, significantly improving performance.
235-237: Correct member ordering for RAIIThe member variable ordering ensures proper destruction sequence (m_mapped before m_file), which is critical for RAII correctness. The addition of m_size to track the actual file size separately from the mapped region size is a clean design choice.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (2)
include/decodeless/detail/mappedfile_linux.hpp (1)
223-230: Document SIGBUS behavior in class documentationThe optimization of mapping the entire range upfront is good, but users need to be aware of the SIGBUS behavior when accessing beyond the actual file size.
test/src/mappedfile.cpp (1)
172-172: Add spaces around the minus operator- ifile.seekg(overmapSize - sizeof(lastByte)); + ifile.seekg(overmapSize - sizeof(lastByte));
📜 Review details
Configuration used: CodeRabbit UI
Review profile: ASSERTIVE
Plan: Pro
📒 Files selected for processing (3)
README.md(1 hunks)include/decodeless/detail/mappedfile_linux.hpp(2 hunks)test/src/mappedfile.cpp(6 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (1)
test/src/mappedfile.cpp (1)
include/decodeless/detail/mappedfile_linux.hpp (10)
size(76-79)size(76-76)size(151-160)size(151-151)size(235-238)size(235-235)size(284-315)size(284-284)result(69-74)LastError(30-31)
🪛 LanguageTool
README.md
[grammar] ~96-~96: Use correct spacing
Context: ...s more than the file size and truncates without remapping. Simple and very fast, although not expl...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~97-~97: Use correct spacing
Context: ...lthough not explicitly supported in the man pages. Tests indicate the right thing still ha...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
[grammar] ~98-~98: Use correct spacing
Context: ...s. Tests indicate the right thing still happens. ## Contributing Issues and pull requests a...
(QB_NEW_EN_OTHER_ERROR_IDS_5)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build (windows-latest, Release, cl)
🔇 Additional comments (10)
include/decodeless/detail/mappedfile_linux.hpp (3)
131-140: LGTM!The addition of the
size()getter method and the existingsync()implementation are correct.
263-265: Member ordering is critical for correct destructionThe ordering of
m_filebeforem_mappedensures the file descriptor outlives the mapping during destruction. This is essential for correctness.
235-238: Efficient resize implementationThe simplified resize logic that only truncates the file without remapping is a significant performance optimization that aligns with the PR objectives.
test/src/mappedfile.cpp (6)
69-94: Good refactoring to reduce fixture overheadConverting these tests from
TEST_FtoTESTeliminates unnecessary fixture setup/teardown for tests that don't need the pre-created file.Also applies to: 96-139, 178-198, 210-272
282-301: Comprehensive test for truncation behaviorThis test effectively validates that truncating a mapped file releases the physical pages, which is crucial for the new optimization approach.
302-345: Well-structured memory decommit testThe test properly validates memory decommit behavior using
mprotectandmadvise(MADV_DONTNEED). The comment referencing the Go issue provides valuable context for the chosen approach.
549-562: Good edge case coverageTesting empty file creation ensures the implementation handles the zero-size case correctly.
564-573: Validates file truncation to zeroThis test ensures that resizing an existing file to zero works correctly, which is important for the new truncation-based approach.
575-595: Documentation example validationTesting the README example ensures the documentation stays accurate and the API works as advertised.
README.md (1)
96-98: Important documentation of Linux-specific behaviorThe added note clearly explains the Linux implementation strategy of overmapping and truncating without remapping. This transparency helps users understand the performance characteristics and potential edge cases.
|
https://stackoverflow.com/questions/6875771/mmap-what-happens-if-underlying-file-changes-shrinks The answer here sound like none of this should be working. Maybe it just works on Linux and not necessarily elsewhere. https://stackoverflow.com/questions/7587625/truncating-memory-mapped-file
Seems to be true in practice. I haven't found anything in the man pages about it though. |
The main point of this change is to avoid msync() when resizing the file. It turns out remapping is not even needed. Disclaimer: > If the size of the mapped file changes after the call to mmap() as a > result of some other operation on the mapped file, the effect of > references to portions of the mapped region that correspond to added > or removed portions of the file is unspecified. On linux this is a SIGBUS, but by contract of the API the user shouldn't be writing past the resized area anyway. An alternative could be to reserve with anonymous+private and mmap() just the grown region like ResizableMappedMemory. However, mmap is quite slow, even incrementally.
The main point of this change is to avoid msync() when resizing the
file. It turns out remapping is not even needed.
Disclaimer:
On linux this is a SIGBUS, but by contract of the API the user shouldn't
be writing past the resized area anyway.
An alternative could be to reserve with anonymous+private and mmap()
just the grown region like ResizableMappedMemory. However, mmap is quite
slow, even incrementally.
Summary by CodeRabbit