-
Notifications
You must be signed in to change notification settings - Fork 15k
[MC] AsmLexer assert buffer is null-terminated at CurBuf.end() #154972
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
Need a llvm/test/MC/AsmParser test to confirm we actually have the out-of-bounds error and fix it. |
86367d2 to
3108689
Compare
|
Hey, @MaskRay I added a test case where invalid read is performed without the In order to detect the out of bounds access I modified the while loop condition to access the memory through CurBuf, which being a StringRef contains asserts. The test I added will be reported a failure on debug builds unless This is the simplest approach to creating the test that I came up with but there might be a better way. Regards, |
|
Ping |
|
@MaskRay Ping |
|
Ping |
3108689 to
b05aa62
Compare
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
llvm/unittests/MC/MCParser.cpp
Outdated
| if (Lexer.getTok().getKind() == AsmToken::Error) | ||
| Error = true; | ||
| } | ||
| ASSERT_TRUE(Error == false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanted to EXPECT_THROW here, unfortunately exceptions are disabled.
The test (and with the CI Checks) will fail in current state. Is there a good way to expect failure in unit tests?
This is meant to test if null-termination is at least checked for. A lit test cannot be created, because llvm-mc will always null-terminate whatever buffer it's provided.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
googletest supports https://google.github.io/googletest/reference/assertions.html#death but llvm-project doesn't use the feature. There is possibly portability issue, but if the CI is ok I think it's good to have it.
I think it's also fine not to have the coverage (i.e. drop this negative test), but thanks for thinking about it and adding a MCParser unittest
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unittest removed in 92a16a4
| // `Buf.end()`. It must be safe to dereference `Buf.end()`. | ||
| assert(*Buf.end() == '\0' && | ||
| "Buffer provided to AsmLexer lacks null terminator."); | ||
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent some time playing around with buffer allocation in order to ensure buffer allocated in by llvm-mc doesn't allocate any additional memory (WritableMemoryBuffer::getNewUninitMemBuffer adds BufAlign.value() to RealLen instead of BufAlign.value() - 1, which means the buffer will always have at least 1 extra byte allocated)
After a while I finally realized that AsmLexer always expects that the null terminator of the buffer is present and it is NOT included in the buffer's length. That means that CurBuf.end() will always point to \0 and CurBuf.end() is always valid memory.
Unfortunately I couldn't find this expectation documented anywhere.
It also doesn't seem to be enforced anywhere in AsmLexer.
In contrast, it seems that AsmParser with LLLexer (not MCAsmParser) does enforce null terminator.
I could patch AsmLexer to support non-null-terminated buffers, however is that a change we desire?
Should AsmLexer also enforce null terminator?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To additionally clarify, since I didn't mention it above, if a buffer IS null-terminated, however the null terminator resides at CurBuf.end() - 1 (which doesn't seem unreasonable), the problem with invalid reads will appear.
| bool EndStatementAtEOF) { | ||
| // Null terminator must be part of the actual buffer. It must reside at | ||
| // `Buf.end()`. It must be safe to dereference `Buf.end()`. | ||
| assert(*Buf.end() == '\0' && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not the best way to test for null-termination. Invalid read won't be detected by non-LLVM_USE_SANITIZER=Address builds.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's right to require a NUL terminator, like regular MemoryBuffer APIs provide (getFileOrSTDIN MemoryBuffer::getFile.
We should add a comment to the header file llvm/include/llvm/MC/MCParser/AsmLexer.h about this requirement.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Comments added in 8a0b79b
|
My confusion arose from the fact that it is very often the case with buffers (arrays, vectors), that the String is an exception, and in its case, the When receiving a buffer to be parsed, it seems reasonable not to require a null terminator and if the null terminator is present, to treat it as part of the buffer and include it in size passed to StringRef's constructor. |
c57126b to
18ecc9c
Compare
AsmLexer expects the buffer it's provided for lexing to be NULL-terminated, where the NULL terminator is pointed to by `CurBuf.end()`. However, this expectation isn't explicitly stated anywhere. This commit adds a couple of comments as well as an assert as means of documenting this expectation.
18ecc9c to
8a0b79b
Compare
AsmLexer expects the buffer it's provided for lexing to be
NULL-terminated, where the NULL terminator is pointed to by
CurBuf.end(). However, this expectation isn't explicitly statedanywhere.
This commit adds a couple of comments as well as an assert as means of
documenting this expectation.