-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[Support] mmap when possible in getSTDIN. #162013
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Support] mmap when possible in getSTDIN. #162013
Conversation
@llvm/pr-subscribers-llvm-binary-utilities @llvm/pr-subscribers-llvm-support Author: None (aokblast) ChangesWhen stdin is executed like ./prog < file, we are able to mmap the buffer so that we can reduce the total memory usage if we try to open a large file. Full diff: https://github.com/llvm/llvm-project/pull/162013.diff 1 Files Affected:
diff --git a/llvm/lib/Support/MemoryBuffer.cpp b/llvm/lib/Support/MemoryBuffer.cpp
index 1c4645ad83641..97ea5f205b0ee 100644
--- a/llvm/lib/Support/MemoryBuffer.cpp
+++ b/llvm/lib/Support/MemoryBuffer.cpp
@@ -573,6 +573,27 @@ ErrorOr<std::unique_ptr<MemoryBuffer>> MemoryBuffer::getSTDIN() {
// FIXME: That isn't necessarily true, we should try to mmap stdin and
// fallback if it fails.
sys::ChangeStdinMode(sys::fs::OF_Text);
+ std::error_code EC;
+ sys::fs::file_type Type;
+ sys::fs::file_status Status;
+ EC = sys::fs::status(sys::fs::getStdinHandle(), Status);
+ if (EC)
+ return EC;
+
+ Type = Status.type();
+
+ /*
+ * If the FD is regular file or block file,
+ * we try to create a mmap buffer first.
+ * If failed, rollback to read and copy.
+ */
+ if (Type == sys::fs::file_type::regular_file ||
+ Type == sys::fs::file_type::block_file) {
+ std::unique_ptr<MemoryBuffer> Result(new MemoryBufferMMapFile<MemoryBuffer>(
+ false, sys::fs::getStdinHandle(), Status.getSize(), 0, EC));
+ if (!EC)
+ return std::move(Result);
+ }
return getMemoryBufferForStream(sys::fs::getStdinHandle(), "<stdin>");
}
|
One thing that I am not sure is if we need to overwrite operator new like new (NamedBufferAlloc(BufferName)). If so, what is the proper BufferName? stdin? |
7939f0b
to
fdf0d85
Compare
ccaa772
to
a9d7f6c
Compare
Currently, some programs rely *c = '\0' to find the end of a buffer instead of the provided size. |
Without knowing the context, it sounds to me like a tool that ignores the provided size and looks for a trailing null terminator has a bug in it (what if the input contains null bytes other than at the end of the data?), so the tool should be updated. What tools rely on this behaviour? |
I am checking that. When I add the trailing zero test to judge if we can mmap, it passes the test-suite. When I remove the test, CI failed. I think these programs just have some behavior like to find the end of a string in string sections. |
This is the CI result for removing trailing zero check. |
I plan to run the CI that turns off RequiresNullTermination by defualt again after sucess this time. |
3d53878
to
f15f5b4
Compare
When stdin is executed like ./prog < file, we are possible to mmap the buffer so that we can reduce the total memory usage if we try to open a large file. For example, programs like llvm-strings iterates the buffer until the end of given size.
f15f5b4
to
c56c44d
Compare
Oops, I am so sorry for the noise... I forget to rebase main before push to this branch. |
Emm, the problem only happens on Windows this time. I think l need to get a windows machine since it is definately a WIndows related bug unrelated with my patch. |
When stdin is executed like ./prog < file, we are able to mmap the buffer so that we can reduce the total memory usage if we try to open a large file.