-
Notifications
You must be signed in to change notification settings - Fork 15.4k
[clang][Sema] Add fortify warnings for unistd.h
#161737
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Define as builtin and check for overflows and over-reads in: * `read` * `pread`/`pread64` * `write` * `pwrite`/`pwrite64` * `getcwd` * `readlink` * `readlinkat` It also enables `off_t`, `off64_t` and `ssize_t` for use in builtin signatures. Signed-off-by: Colin Kinloch <[email protected]>
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
@llvm/pr-subscribers-clang-static-analyzer-1 @llvm/pr-subscribers-clang Author: Colin Kinloch (ColinKinloch) ChangesThis MR implements in clang some of the overflow and over-read checks implemented in bionics In order to get the checks working with the existing overflow checker this MR defines the following as builtin:
The MR also enables It also moves the comparison between source and destination sizes into each case of the switch block, this means that a given diagnostic can be triggered by buffer overflows or over-reads. QuestionsDo Is defining Are the names of Further WorkI plan to port more of the fortify checks from the bionic headers. I'd be interested to hear opinions on whether these belong alongside clangs
Full diff: https://github.com/llvm/llvm-project/pull/161737.diff 6 Files Affected:
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 468121f7d20ab..3c3a5a82a28a6 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -3490,6 +3490,8 @@ def StrnCaseCmp : GNULibBuiltin<"strings.h"> {
let RequiresUndef = 1;
}
+// POSIX unistd.h
+
def GNU_Exit : GNULibBuiltin<"unistd.h"> {
let Spellings = ["_exit"];
let Attributes = [NoReturn];
@@ -3502,6 +3504,69 @@ def VFork : LibBuiltin<"unistd.h"> {
let Prototype = "pid_t()";
}
+def Read : LibBuiltin<"unistd.h"> {
+ let Spellings = ["read"];
+ let Attributes = [NoThrow];
+ let Prototype = "ssize_t(int, void*, size_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
+def PRead : LibBuiltin<"unistd.h"> {
+ let Spellings = ["pread"];
+ let Attributes = [NoThrow];
+ let Prototype = "ssize_t(int, void*, size_t, off_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
+def PRead64 : LibBuiltin<"unistd.h"> {
+ let Spellings = ["pread64"];
+ let Attributes = [NoThrow];
+ let Prototype = "ssize_t(int, void*, size_t, off64_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
+def Write : LibBuiltin<"unistd.h"> {
+ let Spellings = ["write"];
+ let Attributes = [NoThrow];
+ let Prototype = "ssize_t(int, void const*, size_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
+def PWrite : LibBuiltin<"unistd.h"> {
+ let Spellings = ["pwrite"];
+ let Attributes = [NoThrow];
+ let Prototype = "ssize_t(int, void const*, size_t, off_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
+def PWrite64 : LibBuiltin<"unistd.h"> {
+ let Spellings = ["pwrite64"];
+ let Attributes = [NoThrow];
+ let Prototype = "ssize_t(int, void const*, size_t, off64_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
+def GetCWD : LibBuiltin<"unistd.h"> {
+ let Spellings = ["getcwd"];
+ let Attributes = [NoThrow];
+ let Prototype = "char*(char*, size_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
+def ReadLink : LibBuiltin<"unistd.h"> {
+ let Spellings = ["readlink"];
+ let Attributes = [NoThrow];
+ let Prototype = "ssize_t(char const* restrict, char* restrict, size_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
+def ReadLinkAt : LibBuiltin<"unistd.h"> {
+ let Spellings = ["readlinkat"];
+ let Attributes = [NoThrow];
+ let Prototype = "ssize_t(int, char const* restrict, char* restrict, size_t)";
+ let AddBuiltinPrefixedAlias = 1;
+}
+
// POSIX pthread.h
def PthreadCreate : GNULibBuiltin<"pthread.h"> {
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index b157cbb0b8069..f2206ad08414e 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -953,6 +953,10 @@ def warn_fortify_source_overflow
def warn_fortify_source_size_mismatch : Warning<
"'%0' size argument is too large; destination buffer has size %1,"
" but size argument is %2">, InGroup<FortifySource>;
+def warn_fortify_destination_size_mismatch
+ : Warning<"'%0' size argument is too large; source buffer has size %2,"
+ " but size argument is %1">,
+ InGroup<FortifySource>;
def warn_fortify_strlen_overflow: Warning<
"'%0' will always overflow; destination buffer has size %1,"
diff --git a/clang/lib/AST/ASTContext.cpp b/clang/lib/AST/ASTContext.cpp
index 056bfe36b2a0a..d0432de14dbb3 100644
--- a/clang/lib/AST/ASTContext.cpp
+++ b/clang/lib/AST/ASTContext.cpp
@@ -12528,9 +12528,12 @@ static QualType DecodeTypeFromStr(const char *&Str, const ASTContext &Context,
assert(HowLong == 0 && !Signed && !Unsigned && "Bad modifiers for 'b'!");
Type = Context.BoolTy;
break;
- case 'z': // size_t.
- assert(HowLong == 0 && !Signed && !Unsigned && "Bad modifiers for 'z'!");
- Type = Context.getSizeType();
+ case 'z': // size_t and ssize_t.
+ assert(HowLong == 0 && "Bad modifiers for 'z'!");
+ if (Signed)
+ Type = Context.getSignedSizeType();
+ else
+ Type = Context.getSizeType();
break;
case 'w': // wchar_t.
assert(HowLong == 0 && !Signed && !Unsigned && "Bad modifiers for 'w'!");
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index 7b37e0b8d5430..c7d6d0c142eee 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -1243,6 +1243,14 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
std::optional<llvm::APSInt> DestinationSize;
unsigned DiagID = 0;
bool IsChkVariant = false;
+ bool IsTriggered = false;
+
+ auto CompareSizeSourceToDest = [&]() {
+ return SourceSize && DestinationSize
+ ? std::optional<int>{llvm::APSInt::compareValues(
+ *SourceSize, *DestinationSize)}
+ : std::nullopt;
+ };
auto GetFunctionName = [&]() {
std::string FunctionNameStr =
@@ -1270,6 +1278,7 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
DiagID = diag::warn_fortify_strlen_overflow;
SourceSize = ComputeStrLenArgument(1);
DestinationSize = ComputeSizeArgument(0);
+ IsTriggered = CompareSizeSourceToDest() > 0;
break;
}
@@ -1279,6 +1288,7 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
SourceSize = ComputeStrLenArgument(1);
DestinationSize = ComputeExplicitObjectSizeArgument(2);
IsChkVariant = true;
+ IsTriggered = CompareSizeSourceToDest() > 0;
break;
}
@@ -1349,11 +1359,13 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
} else {
DestinationSize = ComputeSizeArgument(0);
}
+ IsTriggered = CompareSizeSourceToDest() > 0;
break;
}
}
return;
}
+
case Builtin::BI__builtin___memcpy_chk:
case Builtin::BI__builtin___memmove_chk:
case Builtin::BI__builtin___memset_chk:
@@ -1369,6 +1381,7 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
DestinationSize =
ComputeExplicitObjectSizeArgument(TheCall->getNumArgs() - 1);
IsChkVariant = true;
+ IsTriggered = CompareSizeSourceToDest() > 0;
break;
}
@@ -1378,6 +1391,7 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
SourceSize = ComputeExplicitObjectSizeArgument(1);
DestinationSize = ComputeExplicitObjectSizeArgument(3);
IsChkVariant = true;
+ IsTriggered = CompareSizeSourceToDest() > 0;
break;
}
@@ -1395,6 +1409,7 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
DiagID = diag::warn_fortify_source_size_mismatch;
SourceSize = ComputeExplicitObjectSizeArgument(TheCall->getNumArgs() - 1);
DestinationSize = ComputeSizeArgument(0);
+ IsTriggered = CompareSizeSourceToDest() > 0;
break;
}
@@ -1409,8 +1424,58 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
DiagID = diag::warn_fortify_source_overflow;
SourceSize = ComputeExplicitObjectSizeArgument(TheCall->getNumArgs() - 1);
DestinationSize = ComputeSizeArgument(0);
+ IsTriggered = CompareSizeSourceToDest() > 0;
+ break;
+ }
+
+ case Builtin::BIread:
+ case Builtin::BI__builtin_read:
+ case Builtin::BIreadlink:
+ case Builtin::BI__builtin_readlink:
+ case Builtin::BIreadlinkat:
+ case Builtin::BI__builtin_readlinkat:
+ case Builtin::BIgetcwd:
+ case Builtin::BI__builtin_getcwd: {
+ DiagID = diag::warn_fortify_source_size_mismatch;
+ SourceSize = ComputeExplicitObjectSizeArgument(TheCall->getNumArgs() - 1);
+ DestinationSize = ComputeSizeArgument(TheCall->getNumArgs() - 2);
+ IsTriggered = CompareSizeSourceToDest() > 0;
+ break;
+ }
+
+ case Builtin::BIpread:
+ case Builtin::BI__builtin_pread:
+ case Builtin::BIpread64:
+ case Builtin::BI__builtin_pread64: {
+ DiagID = diag::warn_fortify_source_size_mismatch;
+ SourceSize = ComputeExplicitObjectSizeArgument(TheCall->getNumArgs() - 2);
+ DestinationSize = ComputeSizeArgument(TheCall->getNumArgs() - 3);
+ IsTriggered = CompareSizeSourceToDest() > 0;
+ break;
+ }
+
+ case Builtin::BIwrite:
+ case Builtin::BI__builtin_write: {
+ DiagID = diag::warn_fortify_destination_size_mismatch;
+ SourceSize = ComputeSizeArgument(TheCall->getNumArgs() - 2);
+ DestinationSize =
+ ComputeExplicitObjectSizeArgument(TheCall->getNumArgs() - 1);
+ IsTriggered = CompareSizeSourceToDest() < 0;
break;
}
+
+ case Builtin::BIpwrite:
+ case Builtin::BI__builtin_pwrite:
+ case Builtin::BIpwrite64:
+ case Builtin::BI__builtin_pwrite64: {
+ DiagID = diag::warn_fortify_destination_size_mismatch;
+ SourceSize = ComputeSizeArgument(TheCall->getNumArgs() - 3);
+ DestinationSize =
+ ComputeExplicitObjectSizeArgument(TheCall->getNumArgs() - 2);
+ IsTriggered = CompareSizeSourceToDest() < 0;
+ break;
+ }
+
case Builtin::BIsnprintf:
case Builtin::BI__builtin_snprintf:
case Builtin::BIvsnprintf:
@@ -1446,11 +1511,12 @@ void Sema::checkFortifiedBuiltinMemoryFunction(FunctionDecl *FD,
}
}
DestinationSize = ComputeSizeArgument(0);
+ IsTriggered = CompareSizeSourceToDest() > 0;
+ break;
}
}
- if (!SourceSize || !DestinationSize ||
- llvm::APSInt::compareValues(*SourceSize, *DestinationSize) <= 0)
+ if (!IsTriggered)
return;
std::string FunctionName = GetFunctionName();
diff --git a/clang/test/Sema/warn-fortify-source.c b/clang/test/Sema/warn-fortify-source.c
index 216878c0836d8..5d4a8d81b0b3e 100644
--- a/clang/test/Sema/warn-fortify-source.c
+++ b/clang/test/Sema/warn-fortify-source.c
@@ -96,6 +96,61 @@ void call_memset(void) {
__builtin_memset(buf, 0xff, 11); // expected-warning {{'memset' will always overflow; destination buffer has size 10, but size argument is 11}}
}
+void call_read(void) {
+ char buf[10];
+ __builtin_read(0, buf, 10);
+ __builtin_read(0, buf, 20); // expected-warning {{'read' size argument is too large; destination buffer has size 10, but size argument is 20}}
+}
+
+void call_pread(void) {
+ char buf[10];
+ __builtin_pread(0, buf, 10, 0);
+ __builtin_pread(0, buf, 20, 0); // expected-warning {{'pread' size argument is too large; destination buffer has size 10, but size argument is 20}}
+}
+
+void call_pread64(void) {
+ char buf[10];
+ __builtin_pread64(0, buf, 10, 0);
+ __builtin_pread64(0, buf, 20, 0); // expected-warning {{'pread64' size argument is too large; destination buffer has size 10, but size argument is 20}}
+}
+
+void call_write(void) {
+ char buf[10];
+ __builtin_write(0, buf, 10);
+ __builtin_write(0, buf, 20); // expected-warning {{'write' size argument is too large; source buffer has size 10, but size argument is 20}}
+}
+
+void call_pwrite(void) {
+ char buf[10];
+ __builtin_pwrite(0, buf, 10, 0);
+ __builtin_pwrite(0, buf, 20, 0); // expected-warning {{'pwrite' size argument is too large; source buffer has size 10, but size argument is 20}}
+}
+
+void call_pwrite64(void) {
+ char buf[10];
+ __builtin_pwrite64(0, buf, 10, 0);
+ __builtin_pwrite64(0, buf, 20, 0); // expected-warning {{'pwrite64' size argument is too large; source buffer has size 10, but size argument is 20}}
+}
+
+void call_getcwd(void) {
+ char buf[10];
+ __builtin_getcwd(buf, 10);
+ __builtin_getcwd(buf, 20); // expected-warning {{'getcwd' size argument is too large; destination buffer has size 10, but size argument is 20}}
+}
+
+void call_readlink(void) {
+ char buf[10];
+ __builtin_readlink("path", buf, 10);
+ __builtin_readlink("path", buf, 20); // expected-warning {{'readlink' size argument is too large; destination buffer has size 10, but size argument is 20}}
+}
+
+void call_readlinkat(void) {
+ char buf[10];
+ __builtin_readlinkat(0, "path", buf, 10);
+ __builtin_readlinkat(0, "path", buf, 20); // expected-warning {{'readlinkat' size argument is too large; destination buffer has size 10, but size argument is 20}}
+}
+
+
void call_snprintf(double d, int n) {
char buf[10];
__builtin_snprintf(buf, 10, "merp");
diff --git a/clang/utils/TableGen/ClangBuiltinsEmitter.cpp b/clang/utils/TableGen/ClangBuiltinsEmitter.cpp
index 352765acf5338..30c7afd95a89a 100644
--- a/clang/utils/TableGen/ClangBuiltinsEmitter.cpp
+++ b/clang/utils/TableGen/ClangBuiltinsEmitter.cpp
@@ -347,12 +347,15 @@ class PrototypeParser {
.Case("msint32_t", "Ni")
.Case("msuint32_t", "UNi")
.Case("objc_super", "M")
+ .Case("off_t", "Li")
+ .Case("off64_t", "Wi")
.Case("pid_t", "p")
.Case("ptrdiff_t", "Y")
.Case("SEL", "H")
.Case("short", "s")
.Case("sigjmp_buf", "SJ")
.Case("size_t", "z")
+ .Case("ssize_t", "Sz")
.Case("ucontext_t", "K")
.Case("uint32_t", "UZi")
.Case("uint64_t", "UWi")
|
| Type = Context.getSizeType(); | ||
| case 'z': // size_t and ssize_t. | ||
| assert(HowLong == 0 && "Bad modifiers for 'z'!"); | ||
| if (Signed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are both branches covered in testing? Please specify which test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think so.
Could you give me some guidance as to what tests would be suitable for this? Or pointers to relevant existing code?
|
(I'm looking at this now) |
ojhunt
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think that this is the best approach, it requires a lot of duplicated checks, and those checks are significantly removed from the primary bound check.
I think a better approach would be to instead change this code to record:
- The true size of the destination object (i.e. ignore the explicitly stated size)
- The true size of the source object (ditto)
- The minimum requested operation size (e.g. size parameter in memcpy)
- The maximum operation size (size in strl/strncpy)
Then have a single set of tests at the end that verify the minimum size is <= to the source and destination buffers, and the destination size vs the stated maximum operation size, and when relevant the object sizes (e.g. char foo[5]; strcpy(foo, "too long");).
Independently of all of this it might be nice to warn on strncpy(buffer[5], "seven", 5) due to the truncation of the null termination - again completely unrelated to this PR.
As `readlink` is now defined as bultin it throws an error when redeclared with a different type: "incompatible redeclaration of library function function 'readlink'" As `sync` has the type `void(void)` it this issue should not happen in the future.
|
I've added a commit based on your suggestions. It attempts to select a pair of diagnostic strings based on whether it's certain to overflow or over-read. I decided to swap out I've also updated the tests and added a few more based on this. I'm a little uncertain on how strictly "fortify" is defined and how strongly clang intends to stick to that definition. |
This MR implements in clang some of the overflow and over-read checks implemented in bionics
FORTIFYheaders.In order to get the checks working with the existing overflow checker this MR defines the following as builtin:
readpread/pread64writepwrite/pwrite64getcwdreadlinkreadlinkatThe MR also enables
ssize_tin clangs AST and adds theoff_t,off64_tandssize_ttypes for use in builtin signatures. It also defines a new diagnosticwarn_fortify_destination_size_mismatchto warn when the called function may read beyond the bounds of the given source buffer.It also moves the comparison between source and destination sizes into each case of the switch block, this means that a given diagnostic can be triggered by buffer overflows or over-reads.
Tackles a few some items from #142230
Questions
Do
chkvariants of these functions e.g.__builtin___getcwd_chketc. makes sense to implement forfortify-source(LSB)?Is defining
off_tasunsigned longrobust enough?Are the names of
warn_fortify_destination_size_mismatchandwarn_fortify_source_size_mismatcheasy enough to distinguish?Should the
pread64/pwrite64builtin be marked asGNULibBuiltinas it is linux specific?Further Work
I plan to port more of the fortify checks from the bionic headers. I'd be interested to hear opinions on whether these belong alongside clangs
fortify-sourcediagnostics or would be better suited elsewhere in Sema. Specifically I plan to work on:fdsmatches thenfdsin a call topoll(bionic).SSIZE_MAXandPATH_MAXforunistd.hfunctions andrealpath.realpathisn't passed a nullpathstring (maybe [clang][sema] Add nonnull attribute to builtin format functions #160988 would be useful) (bionic).flagsandmodevalues passed toopenandopenatare meaningful (bionic).