-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[Sema] Suggest missing format attributes #166738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Implements the `-Wmissing-format-attribute` diagnostic. It suggests adding format attributes to function declarations that call other format functions and pass the format string and arguments to them. Co-authored-by: Budimir Arandjelovic <[email protected]>
|
@llvm/pr-subscribers-clang Author: Vladimir Vuksanovic (vvuksanovic) ChangesImplements the Patch is 27.90 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/166738.diff 7 Files Affected:
diff --git a/clang/docs/ReleaseNotes.rst b/clang/docs/ReleaseNotes.rst
index 32f669f8d70d8..be384b7db72a3 100644
--- a/clang/docs/ReleaseNotes.rst
+++ b/clang/docs/ReleaseNotes.rst
@@ -401,6 +401,8 @@ Improvements to Clang's diagnostics
or continue (#GH166013)
- Clang now emits a diagnostic in case `vector_size` or `ext_vector_type`
attributes are used with a negative size (#GH165463).
+- Clang now detects potential missing format attributes on function declarations
+ when calling format functions. (#GH60718)
Improvements to Clang's time-trace
----------------------------------
diff --git a/clang/include/clang/Basic/DiagnosticGroups.td b/clang/include/clang/Basic/DiagnosticGroups.td
index 1e0321de3f4b6..7b534d416fdd9 100644
--- a/clang/include/clang/Basic/DiagnosticGroups.td
+++ b/clang/include/clang/Basic/DiagnosticGroups.td
@@ -607,7 +607,6 @@ def MainReturnType : DiagGroup<"main-return-type">;
def MaxUnsignedZero : DiagGroup<"max-unsigned-zero">;
def MissingBraces : DiagGroup<"missing-braces">;
def MissingDeclarations: DiagGroup<"missing-declarations">;
-def : DiagGroup<"missing-format-attribute">;
def MissingIncludeDirs : DiagGroup<"missing-include-dirs">;
def MissingNoreturn : DiagGroup<"missing-noreturn">;
def MultiChar : DiagGroup<"multichar">;
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index fa509536bf021..78c5caf79eab7 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -3478,6 +3478,11 @@ def err_format_attribute_result_not : Error<"function does not return %0">;
def err_format_attribute_implicit_this_format_string : Error<
"format attribute cannot specify the implicit this argument as the format "
"string">;
+def warn_missing_format_attribute
+ : Warning<"diagnostic behavior may be improved by adding the '%0' format "
+ "attribute to the declaration of %1">,
+ InGroup<DiagGroup<"missing-format-attribute">>,
+ DefaultIgnore;
def err_callback_attribute_no_callee : Error<
"'callback' attribute specifies no callback callee">;
def err_callback_attribute_invalid_callee : Error<
diff --git a/clang/lib/Sema/SemaChecking.cpp b/clang/lib/Sema/SemaChecking.cpp
index ad2c2e4a97bb9..502e9fd099144 100644
--- a/clang/lib/Sema/SemaChecking.cpp
+++ b/clang/lib/Sema/SemaChecking.cpp
@@ -6517,7 +6517,8 @@ static StringLiteralCheckType checkFormatStringExpr(
unsigned format_idx, unsigned firstDataArg, FormatStringType Type,
VariadicCallType CallType, bool InFunctionCall,
llvm::SmallBitVector &CheckedVarArgs, UncoveredArgHandler &UncoveredArg,
- llvm::APSInt Offset, bool IgnoreStringsWithoutSpecifiers = false) {
+ llvm::APSInt Offset, std::optional<unsigned> *CallerParamIdx = nullptr,
+ bool IgnoreStringsWithoutSpecifiers = false) {
if (S.isConstantEvaluatedContext())
return SLCT_NotALiteral;
tryAgain:
@@ -6542,7 +6543,7 @@ static StringLiteralCheckType checkFormatStringExpr(
return checkFormatStringExpr(
S, ReferenceFormatString, SLE, Args, APK, format_idx, firstDataArg,
Type, CallType, /*InFunctionCall*/ false, CheckedVarArgs,
- UncoveredArg, Offset, IgnoreStringsWithoutSpecifiers);
+ UncoveredArg, Offset, CallerParamIdx, IgnoreStringsWithoutSpecifiers);
}
return SLCT_NotALiteral;
case Stmt::BinaryConditionalOperatorClass:
@@ -6577,7 +6578,7 @@ static StringLiteralCheckType checkFormatStringExpr(
Left = checkFormatStringExpr(
S, ReferenceFormatString, C->getTrueExpr(), Args, APK, format_idx,
firstDataArg, Type, CallType, InFunctionCall, CheckedVarArgs,
- UncoveredArg, Offset, IgnoreStringsWithoutSpecifiers);
+ UncoveredArg, Offset, CallerParamIdx, IgnoreStringsWithoutSpecifiers);
if (Left == SLCT_NotALiteral || !CheckRight) {
return Left;
}
@@ -6586,7 +6587,7 @@ static StringLiteralCheckType checkFormatStringExpr(
StringLiteralCheckType Right = checkFormatStringExpr(
S, ReferenceFormatString, C->getFalseExpr(), Args, APK, format_idx,
firstDataArg, Type, CallType, InFunctionCall, CheckedVarArgs,
- UncoveredArg, Offset, IgnoreStringsWithoutSpecifiers);
+ UncoveredArg, Offset, CallerParamIdx, IgnoreStringsWithoutSpecifiers);
return (CheckLeft && Left < Right) ? Left : Right;
}
@@ -6635,10 +6636,11 @@ static StringLiteralCheckType checkFormatStringExpr(
if (InitList->isStringLiteralInit())
Init = InitList->getInit(0)->IgnoreParenImpCasts();
}
- return checkFormatStringExpr(
- S, ReferenceFormatString, Init, Args, APK, format_idx,
- firstDataArg, Type, CallType,
- /*InFunctionCall*/ false, CheckedVarArgs, UncoveredArg, Offset);
+ return checkFormatStringExpr(S, ReferenceFormatString, Init, Args,
+ APK, format_idx, firstDataArg, Type,
+ CallType,
+ /*InFunctionCall*/ false, CheckedVarArgs,
+ UncoveredArg, Offset, CallerParamIdx);
}
}
@@ -6690,6 +6692,8 @@ static StringLiteralCheckType checkFormatStringExpr(
// format arguments, in all cases.
//
if (const auto *PV = dyn_cast<ParmVarDecl>(VD)) {
+ if (CallerParamIdx)
+ *CallerParamIdx = PV->getFunctionScopeIndex();
if (const auto *D = dyn_cast<Decl>(PV->getDeclContext())) {
for (const auto *PVFormatMatches :
D->specific_attrs<FormatMatchesAttr>()) {
@@ -6715,7 +6719,7 @@ static StringLiteralCheckType checkFormatStringExpr(
S, ReferenceFormatString, PVFormatMatches->getFormatString(),
Args, APK, format_idx, firstDataArg, Type, CallType,
/*InFunctionCall*/ false, CheckedVarArgs, UncoveredArg,
- Offset, IgnoreStringsWithoutSpecifiers);
+ Offset, CallerParamIdx, IgnoreStringsWithoutSpecifiers);
}
}
@@ -6770,7 +6774,7 @@ static StringLiteralCheckType checkFormatStringExpr(
StringLiteralCheckType Result = checkFormatStringExpr(
S, ReferenceFormatString, Arg, Args, APK, format_idx, firstDataArg,
Type, CallType, InFunctionCall, CheckedVarArgs, UncoveredArg,
- Offset, IgnoreStringsWithoutSpecifiers);
+ Offset, CallerParamIdx, IgnoreStringsWithoutSpecifiers);
if (IsFirst) {
CommonResult = Result;
IsFirst = false;
@@ -6784,10 +6788,11 @@ static StringLiteralCheckType checkFormatStringExpr(
if (BuiltinID == Builtin::BI__builtin___CFStringMakeConstantString ||
BuiltinID == Builtin::BI__builtin___NSStringMakeConstantString) {
const Expr *Arg = CE->getArg(0);
- return checkFormatStringExpr(
- S, ReferenceFormatString, Arg, Args, APK, format_idx,
- firstDataArg, Type, CallType, InFunctionCall, CheckedVarArgs,
- UncoveredArg, Offset, IgnoreStringsWithoutSpecifiers);
+ return checkFormatStringExpr(S, ReferenceFormatString, Arg, Args, APK,
+ format_idx, firstDataArg, Type, CallType,
+ InFunctionCall, CheckedVarArgs,
+ UncoveredArg, Offset, CallerParamIdx,
+ IgnoreStringsWithoutSpecifiers);
}
}
}
@@ -6795,7 +6800,7 @@ static StringLiteralCheckType checkFormatStringExpr(
return checkFormatStringExpr(
S, ReferenceFormatString, SLE, Args, APK, format_idx, firstDataArg,
Type, CallType, /*InFunctionCall*/ false, CheckedVarArgs,
- UncoveredArg, Offset, IgnoreStringsWithoutSpecifiers);
+ UncoveredArg, Offset, CallerParamIdx, IgnoreStringsWithoutSpecifiers);
return SLCT_NotALiteral;
}
case Stmt::ObjCMessageExprClass: {
@@ -6821,7 +6826,7 @@ static StringLiteralCheckType checkFormatStringExpr(
return checkFormatStringExpr(
S, ReferenceFormatString, Arg, Args, APK, format_idx, firstDataArg,
Type, CallType, InFunctionCall, CheckedVarArgs, UncoveredArg,
- Offset, IgnoreStringsWithoutSpecifiers);
+ Offset, CallerParamIdx, IgnoreStringsWithoutSpecifiers);
}
}
@@ -7001,6 +7006,77 @@ bool Sema::CheckFormatString(const FormatMatchesAttr *Format,
return false;
}
+static void CheckMissingFormatAttributes(Sema *S, FormatStringType FormatType,
+ unsigned FormatIdx, unsigned FirstArg,
+ ArrayRef<const Expr *> Args,
+ Sema::FormatArgumentPassingKind APK,
+ unsigned CallerParamIdx,
+ SourceLocation Loc) {
+ const FunctionDecl *Caller = S->getCurFunctionDecl();
+ if (!Caller)
+ return;
+
+ // Find the offset to convert between attribute and parameter indexes.
+ unsigned CallerArgumentIndexOffset =
+ hasImplicitObjectParameter(Caller) ? 2 : 1;
+
+ unsigned FirstArgumentIndex = -1;
+ switch (APK) {
+ case Sema::FormatArgumentPassingKind::FAPK_Fixed:
+ case Sema::FormatArgumentPassingKind::FAPK_Variadic: {
+ // As an extension, clang allows the format attribute on non-variadic
+ // functions.
+ // Caller must have fixed arguments to pass them to a fixed or variadic
+ // function. Try to match caller and callee arguments. If successful, then
+ // emit a diag with the caller idx, otherwise we can't determine the callee
+ // arguments.
+ unsigned NumCalleeArgs = Args.size() - FirstArg;
+ if (NumCalleeArgs == 0 || Caller->getNumParams() < NumCalleeArgs) {
+ // There aren't enough arugments in the caller to pass to callee.
+ return;
+ }
+ for (unsigned CalleeIdx = Args.size() - 1,
+ CallerIdx = Caller->getNumParams() - 1;
+ CalleeIdx >= FirstArg; --CalleeIdx, --CallerIdx) {
+ const auto *Arg =
+ dyn_cast<DeclRefExpr>(Args[CalleeIdx]->IgnoreParenCasts());
+ if (!Arg)
+ return;
+ const auto *Param = dyn_cast<ParmVarDecl>(Arg->getDecl());
+ if (!Param || Param->getFunctionScopeIndex() != CallerIdx)
+ return;
+ }
+ FirstArgumentIndex =
+ Caller->getNumParams() + CallerArgumentIndexOffset - NumCalleeArgs;
+ break;
+ }
+ case Sema::FormatArgumentPassingKind::FAPK_VAList:
+ // Caller arguments are either variadic or a va_list.
+ FirstArgumentIndex =
+ Caller->isVariadic()
+ ? (Caller->getNumParams() + CallerArgumentIndexOffset)
+ : 0;
+ break;
+ case Sema::FormatArgumentPassingKind::FAPK_Elsewhere:
+ // Args are not passed to the callee.
+ return;
+ }
+
+ // Emit the diagnostic and fixit.
+ unsigned FormatStringIndex = CallerParamIdx + CallerArgumentIndexOffset;
+ StringRef FormatTypeName = S->GetFormatStringTypeName(FormatType);
+ S->Diag(Loc, diag::warn_missing_format_attribute)
+ << FormatTypeName << Caller
+ << FixItHint::CreateInsertion(Caller->getFirstDecl()->getLocation(),
+ (llvm::Twine("__attribute__((format(") +
+ FormatTypeName + ", " +
+ llvm::Twine(FormatStringIndex) + ", " +
+ llvm::Twine(FirstArgumentIndex) + ")))")
+ .str());
+ S->Diag(Caller->getFirstDecl()->getLocation(), diag::note_callee_decl)
+ << Caller;
+}
+
bool Sema::CheckFormatArguments(ArrayRef<const Expr *> Args,
Sema::FormatArgumentPassingKind APK,
const StringLiteral *ReferenceFormatString,
@@ -7030,11 +7106,12 @@ bool Sema::CheckFormatArguments(ArrayRef<const Expr *> Args,
// ObjC string uses the same format specifiers as C string, so we can use
// the same format string checking logic for both ObjC and C strings.
UncoveredArgHandler UncoveredArg;
+ std::optional<unsigned> CallerParamIdx;
StringLiteralCheckType CT = checkFormatStringExpr(
*this, ReferenceFormatString, OrigFormatExpr, Args, APK, format_idx,
firstDataArg, Type, CallType,
/*IsFunctionCall*/ true, CheckedVarArgs, UncoveredArg,
- /*no string offset*/ llvm::APSInt(64, false) = 0);
+ /*no string offset*/ llvm::APSInt(64, false) = 0, &CallerParamIdx);
// Generate a diagnostic where an uncovered argument is detected.
if (UncoveredArg.hasUncoveredArg()) {
@@ -7047,11 +7124,6 @@ bool Sema::CheckFormatArguments(ArrayRef<const Expr *> Args,
// Literal format string found, check done!
return CT == SLCT_CheckedLiteral;
- // Strftime is particular as it always uses a single 'time' argument,
- // so it is safe to pass a non-literal string.
- if (Type == FormatStringType::Strftime)
- return false;
-
// Do not emit diag when the string param is a macro expansion and the
// format is either NSString or CFString. This is a hack to prevent
// diag when using the NSLocalizedString and CFCopyLocalizedString macros
@@ -7061,6 +7133,15 @@ bool Sema::CheckFormatArguments(ArrayRef<const Expr *> Args,
SourceMgr.isInSystemMacro(FormatLoc))
return false;
+ if (CallerParamIdx)
+ CheckMissingFormatAttributes(this, Type, format_idx, firstDataArg, Args,
+ APK, *CallerParamIdx, Loc);
+
+ // Strftime is particular as it always uses a single 'time' argument,
+ // so it is safe to pass a non-literal string.
+ if (Type == FormatStringType::Strftime)
+ return false;
+
// If there are no arguments specified, warn with -Wformat-security, otherwise
// warn only with -Wformat-nonliteral.
if (Args.size() == firstDataArg) {
diff --git a/clang/lib/Sema/SemaDeclAttr.cpp b/clang/lib/Sema/SemaDeclAttr.cpp
index a9e7b44ac9d73..9849e125889cf 100644
--- a/clang/lib/Sema/SemaDeclAttr.cpp
+++ b/clang/lib/Sema/SemaDeclAttr.cpp
@@ -3572,7 +3572,7 @@ static void handleEnumExtensibilityAttr(Sema &S, Decl *D,
}
/// Handle __attribute__((format_arg((idx)))) attribute based on
-/// http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
+/// https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
static void handleFormatArgAttr(Sema &S, Decl *D, const ParsedAttr &AL) {
const Expr *IdxExpr = AL.getArgAsExpr(0);
ParamIdx Idx;
@@ -3771,7 +3771,7 @@ struct FormatAttrCommon {
};
/// Handle __attribute__((format(type,idx,firstarg))) attributes based on
-/// http://gcc.gnu.org/onlinedocs/gcc/Function-Attributes.html
+/// https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html
static bool handleFormatAttrCommon(Sema &S, Decl *D, const ParsedAttr &AL,
FormatAttrCommon *Info) {
// Checks the first two arguments of the attribute; this is shared between
diff --git a/clang/test/Sema/attr-format-missing.c b/clang/test/Sema/attr-format-missing.c
new file mode 100644
index 0000000000000..7e5131153f661
--- /dev/null
+++ b/clang/test/Sema/attr-format-missing.c
@@ -0,0 +1,217 @@
+// RUN: %clang_cc1 -fsyntax-only -verify -Wmissing-format-attribute %s
+// RUN: %clang_cc1 -fsyntax-only -Wmissing-format-attribute -fdiagnostics-parseable-fixits %s 2>&1 | FileCheck %s
+
+typedef unsigned long size_t;
+typedef long ssize_t;
+typedef __builtin_va_list va_list;
+
+__attribute__((format(printf, 1, 2)))
+int printf(const char *, ...);
+
+__attribute__((format(scanf, 1, 2)))
+int scanf(const char *, ...);
+
+__attribute__((format(printf, 1, 0)))
+int vprintf(const char *, va_list);
+
+__attribute__((format(scanf, 1, 0)))
+int vscanf(const char *, va_list);
+
+__attribute__((format(printf, 2, 0)))
+int vsprintf(char *, const char *, va_list);
+
+struct tm { unsigned i; };
+__attribute__((format(strftime, 3, 0)))
+size_t strftime(char *, size_t, const char *, const struct tm *);
+
+__attribute__((format(strfmon, 3, 4)))
+ssize_t strfmon(char *, size_t, const char *, ...);
+
+// CHECK: fix-it:"{{.*}}":{[[@LINE+1]]:6-[[@LINE+1]]:6}:"__attribute__((format(printf, 1, 0)))"
+void f1(char *out, va_list args) // #f1
+{
+ vprintf(out, args); // expected-warning {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f1'}}
+ // expected-note@#f1 {{'f1' declared here}}
+}
+
+// CHECK: fix-it:"{{.*}}":{[[@LINE+1]]:6-[[@LINE+1]]:6}:"__attribute__((format(scanf, 1, 0)))"
+void f2(const char *out, va_list args) // #f2
+{
+ vscanf(out, args); // expected-warning {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f2'}}
+ // expected-note@#f2 {{'f2' declared here}}
+}
+
+// CHECK: fix-it:"{{.*}}":{[[@LINE+1]]:6-[[@LINE+1]]:6}:"__attribute__((format(printf, 1, 2)))"
+void f3(const char *out, ... /* args */) // #f3
+{
+ va_list args;
+ vprintf(out, args); // expected-warning {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f3'}}
+ // expected-note@#f3 {{'f3' declared here}}
+}
+
+// CHECK: fix-it:"{{.*}}":{[[@LINE+1]]:6-[[@LINE+1]]:6}:"__attribute__((format(scanf, 1, 2)))"
+void f4(const char *out, ... /* args */) // #f4
+{
+ va_list args;
+ vscanf(out, args); // expected-warning {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f4'}}
+ // expected-note@#f4 {{'f4' declared here}}
+}
+
+// CHECK: fix-it:"{{.*}}":{[[@LINE+2]]:6-[[@LINE+2]]:6}:"__attribute__((format(printf, 2, 3)))"
+__attribute__((format(printf, 1, 3)))
+void f5(char *out, const char *format, ... /* args */) // #f5
+{
+ va_list args;
+ vsprintf(out, format, args); // expected-warning {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f5'}}
+ // expected-note@#f5 {{'f5' declared here}}
+}
+
+__attribute__((format(scanf, 1, 3)))
+void f6(char *out, const char *format, ... /* args */) // #f6
+{
+ va_list args;
+ vsprintf(out, format, args); // expected-warning {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f6'}}
+ // expected-note@#f6 {{'f6' declared here}}
+}
+
+// Ok, out is not passed to print functions.
+void f7(char* out, ... /* args */)
+{
+ va_list args;
+
+ const char *ch = "format";
+ vprintf(ch, args);
+ vprintf("test", args);
+}
+
+// Ok, out is not passed to print functions.
+void f8(char *out, va_list args)
+{
+ const char *ch = "format";
+ vprintf(ch, args);
+ vprintf("test", args);
+}
+
+// Ok, out is not passed to scan functions.
+void f9(va_list args)
+{
+ const char *ch = "format";
+ vscanf(ch, args);
+ vscanf("test", args);
+}
+
+// CHECK: fix-it:"{{.*}}":{[[@LINE+2]]:6-[[@LINE+2]]:6}:"__attribute__((format(scanf, 1, 2)))"
+// CHECK: fix-it:"{{.*}}":{[[@LINE+1]]:6-[[@LINE+1]]:6}:"__attribute__((format(printf, 1, 2)))"
+void f10(const char *out, ... /* args */) // #f10
+{
+ va_list args;
+ vscanf(out, args); // expected-warning {{diagnostic behavior may be improved by adding the 'scanf' format attribute to the declaration of 'f10'}}
+ // expected-note@#f10 {{'f10' declared here}}
+ vprintf(out, args); // expected-warning {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f10'}}
+ // expected-note@#f10 {{'f10' declared here}}
+}
+
+// CHECK: fix-it:"{{.*}}":{[[@LINE+1]]:6-[[@LINE+1]]:6}:"__attribute__((format(printf, 1, 2)))"
+void f11(const char out[], ... /* args */) // #f11
+{
+ va_list args;
+ char ch[10] = "format";
+ vprintf(ch, args);
+ vsprintf(ch, out, args); // expected-warning {{diagnostic behavior may be improved by adding the 'printf' format attribute to the declaration of 'f11'}}
+ // expected-note@#f11 {{'f11' declared here}}
+}
+
+// CHECK: fix-it:"{{.*}}":{[[@LINE+1]]:6-[[@LINE+1]]:6}:"__attribute__((format(printf, 1, 0)))"
+void f12(char* out) // #f12
+{
+ va_list args;
+ const char *ch = "format";
+ vsprintf(ou...
[truncated]
|
| - Clang now emits a diagnostic in case `vector_size` or `ext_vector_type` | ||
| attributes are used with a negative size (#GH165463). | ||
| - Clang now detects potential missing format attributes on function declarations | ||
| when calling format functions. (#GH60718) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
#60718 seems to be closed already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sirraide
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One concern I have with this is that __attribute__(()) is non-standard syntax, so suggesting that in a fix-it is... not great for portability, because e.g. MSVC doesn’t support that syntax at all. We probably want to only emit this diagnostic (or perform the check at all really) if we’re in -std=gnuXY/-std=gnu++XY mode.
Also, in non-GNU C23/C++11 mode, we can just suggest |
Switched to emitting Another possible issue is that clang supports the format attribute on non-variadic functions, which GCC doesn't. This diagnostic can suggest adding the attribute to such functions, which would cause an error when compiling with GCC. Should I just disable that case? |
|
Pinging @aaronpuchert @apple-fcloutier as reviewers of #105479 which this is an update of. |
apple-fcloutier
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing the comments I had on the last iteration. I would mostly like us to resolve quality-of-life issues with -Wmissing-format-attribute and how it combines with other format warnings to proceed.
| unsigned NumCalleeArgs = Args.size() - FirstArg; | ||
| if (NumCalleeArgs == 0 || Caller->getNumParams() < NumCalleeArgs) { | ||
| // There aren't enough arugments in the caller to pass to callee. | ||
| return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's possible in this case that one could use __attribute__((format_matches)) instead, which I believe I landed between your last PR and this one. I haven't thought very deeply about whether you have all the information you need here to do it, though, so there's a decent likelihood that you would look at it and decide it's not actually possible.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd have to generate a format string for format_matches which I guess should be doable based on the argument types in the call, unless some non-scalar type is used.
| SourceLocation Loc) { | ||
| const FunctionDecl *Caller = S->getCurFunctionDecl(); | ||
| if (!Caller) | ||
| return; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will always fail in Objective-C because ObjCMethodDecl does not inherit from FunctionDecl. Would you add an Objective-C test to ensure this works? If you're not super familiar with it, the declaration/use syntax should be:
#include <stdarg.h>
@interface Foo
-(void)myprintf:(const char *)fmt, ... __attribute__((format(printf, 1, 2)));
-(void)myvprintf:(const char *)fmt list:(va_list)ap __attribute__((format(printf, 1, 0)));
@end
void bar(Foo *f) {
[f myprintf:"hello %i", 123];
}
void baz(Foo *f, const char *fmt, va_list ap) {
[f myvprintf:fmt list:ap];
}
Happy to answer other questions you may have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this also always fails for blocks, but it matters less because __attribute__((format)) is sugar so adding the attribute on the block type doesn't guarantee that the block you receive also has it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not familiar with objective-c and there doesn't seem to be much documentation regarding attributes, but I think I have a working implementation. Just one question: is there an implicit 'this' argument on obj-c methods?
I'll have to look into blocks to see how they fit in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Objective-C methods have two implicit arguments (self and _cmd), but IIRC index 1 corresponds to the first explicit argument (ie, not like how it's done for C++).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added support for Objective-C. Let me know if there are other cases I should test. Also, I wasn't sure which attribute syntax is supported, so I used the GNU extension.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems to cover it, thank you!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested where Objective-C tolerates attributes, and the fixit is incorrect currently:
#include <stdarg.h>
__attribute__((format(printf, 1, 0)))
int vprintf(const char *, va_list ap);
@interface Foo
-(void)printf:(const char *)fmt, ...;
@end
@implementation Foo
-(void)printf:(const char *)fmt, ... {
va_list ap;
va_start(ap, fmt);
vprintf(fmt, ap);
va_end(ap);
}
@endThe fixit will suggest __attribute__((format(printf, 1, 2))) -(void)printf:(const char *)fmt, ... {, which is a syntax error. Attributes go at the end of the declaration, so it should be -(void)printf:(const char *)fmt, ... __attribute__((...)).
To make it as effective as possible, the fixit should also go to the canonical declaration, as returned by getCanonicalDecl(), which seems to do the right thing for both FunctionDecl (returns the first decl) and ObjCMethodDecl (finds the decl in the relevant @interface block).
Here's a patch for your change which I think does the trick (but that I didn't test extensively): objc-placement.patch
clang/lib/Sema/SemaChecking.cpp
Outdated
| : 0; | ||
| break; | ||
| case Sema::FormatArgumentPassingKind::FAPK_Elsewhere: | ||
| // Args are not passed to the callee. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should be able to suggest a format_matches attribute here, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking further into this, I can suggest format_matches in a diagnostic message, but there are some problematic cases for creating the actual format string parameter to format_matches from the call arguments. From the attribute docs:
While checks for the format attribute tolerate sone size mismatches that standard argument promotion renders immaterial (such as formatting an int with %hhd, which specifies a char-sized integer), checks for format_matches require specified argument sizes to match exactly.
In this case the actual format string might be "%d", but pass a char argument. That is correct because of arg promotion, but based on the arg type, I would deduce this as "%c".
Format strings expecting a variable modifier (such as %*s) are incompatible with format strings that would itemize the variable modifiers (such as %i %s), even if the two specify ABI-compatible argument lists.
This is a bigger problem as it can't be deduced only based on the arguments.
All pointer specifiers, modifiers aside, are mutually incompatible. For instance, %s is not compatible with %p, and %p is not compatible with %n, and %hhn is incompatible with %s, even if the pointers are ABI-compatible or identical on the selected platform.
Similarly, some pointer type specifiers can't be determined based on the argument type. For example const char * could be printed as '%s' or '%p'.
Should I just emit a diagnostic without specific attribute parameters in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are two similar issues which I think we may be mixing up. There's the issue of suggesting a format_matches attribute out of thin air, which I attribute to this comment, and this convinces me that we can't have a fixit.
The issue I wanted to point to here is FAPK_Elsewhere indicates the callee has the format_matches attribute, so in this case there is a format attribute that we can just grab out of it instead of having to make one up.
| return false; | ||
|
|
||
| const LangOptions &LO = getLangOpts(); | ||
| if (CallerParamIdx && (LO.GNUMode || LO.C23 || LO.CPlusPlus11)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should warn in all cases and use the language mode only to decide whether to offer a fixit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The issue I see w/ that is that for language modes that don’t support [[]]-style attributes, there is no way to actually make the warning go away in a portable manner (you can use a #pragma but that’s a bit sad and also not portable).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should warn in all cases and use the language mode only to decide whether to offer a fixit.
I'm not sure if the warning that suggests adding an attribute makes sense if there isn't a portable way to actually add the attribute.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are portable ways to annotate your functions, it's just that you can't offer them as fixits for a variety of reasons (for instance, create a macro that adds the format attribute based on a __has_attribute test); in fact, even when a language-defined syntax is available, it might not be the preferred way to add the attribute in any given project.
But even then, I disagree that the only value (or even the main value) of any diagnostic is the fixit. Clang doesn't only diagnose situations that it knows how to fix. For a codebase that truly can't do anything about missing format attributes, the solution is to not take the intentional step of enabling the diagnostic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, the user can figure out a way to write the attribute in a suitable way. I have changed the diagnostic message to include the attribute parameters as well like:
diagnostic behavior may be improved by adding the 'format(printf, X, Y)' attribute to the declaration of 'foo'
so even if there is no fixit, the full suggested attribute is still shown.
clang/lib/Sema/SemaChecking.cpp
Outdated
| const LangOptions &LO = getLangOpts(); | ||
| if (CallerParamIdx && (LO.GNUMode || LO.C23 || LO.CPlusPlus11)) | ||
| CheckMissingFormatAttributes(this, Type, format_idx, firstDataArg, Args, | ||
| APK, *CallerParamIdx, Loc); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whenever -Wmissing-format-attribute hits, you will also get -Wformat-nonliteral from line 7173, which I think we need to think about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is an exception for strftime calls which don't emit -Wformat-nonliteral because of the early return on line 7153.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, what about the other format families? I haven't built your change to play with it, but my suspicion is that if you enable -Wformat-nonliteral and -Wmissing-format-attribute, then you usually will trip both:
#include <stdarg.h>
__attribute__((format(printf, 1, 0))) void foo(const char *, va_list);
void bar(const char *fmt, ...) {
va_list ap;
va_start(ap, fmt);
foo(fmt, ap); // <----
va_end(ap);
}
If that's not the case (or actually, even if that's the case), I think you should have a test that shows we emit only one or the other.
| return; | ||
| } | ||
|
|
||
| // Emit the diagnostic and fixit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a recovery action, should we also add an implicit format attribute to Caller? It might catch issues in the rest of the file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That changes some behavior:
void foo(const char *fmt, va_list args) {
vprintf(fmt, args);
vscanf(fmt, args);
}Previously, this would emit two diagnostics to suggest a printf and scanf format attribute. If we add an implicit attr, then when looking at vscanf we would get passing 'printf' format string where 'scanf' format string is expected. Also, this would eliminate duplicate diagnostics created by multiple format function calls of the same type.
I'll see if there are any other differences before updating the PR, but these changes make sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These changes also make sense to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, one thing to mention is that we should only add the attribute if the diagnostic is enabled at the call site, otherwise people will start seeing inexplicable format warnings that cannot be silenced. This would probably be a net positive, but the respectful way to do this would be to enable the diagnostic by default.
clang/lib/Sema/SemaChecking.cpp
Outdated
| AttrSuffix) | ||
| .str()); | ||
| S->Diag(Caller->getFirstDecl()->getLocation(), diag::note_callee_decl) | ||
| << Caller; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When it's not implicit, you can also show the source location of the format attribute as it's often a macro and it might inspire users to use it instead of raw attribute syntax.
| def warn_missing_format_attribute : Warning< | ||
| "diagnostic behavior may be improved by adding the '%0' format " | ||
| "attribute to the declaration of %1">, | ||
| InGroup<DiagGroup<"missing-format-attribute">>, DefaultIgnore; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you help me put in context how -Wmissing-format-attribute works with GCC and its relationship with -Wformat, -Wformat=2 and -Wformat-nonliteral? It's kind of a least favorite way forward to add new off-by-default diagnostics.
Here's my thinking (before knowing how GCC deals with this):
- Could -Wmissing-format-attribute be enabled by default and/or be part of -Wformat? (Do we need an adult to confirm the bar for adding on-by-default diagnostics? I think this should be OK given that this shouldn't have false positives at all.)
- If -Wmissing-format-attribute cannot be enabled by default, what are our options for splitting the difference between it and -Wformat-nonliteral? Should -Wmissing-format-attribute be part of -Wformat-nonliteral so that users of -Wformat-nonliteral get it when possible?
- In all cases, -Wformat-nonliteral always hitting when -Wmissing-format-attribute does is an overlook that we should address.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In GCC -Wmissing-format-attribute or -Wsuggest-attribute=format is an off-by-default warning unrelated to all other -Wformat* options. I think it makes sense to group this in -Wformat-nonliteral or -Wformat, though it is more of a suggestion to improve diagnostics than an actual warning like -Wformat-nonliteral.
In all cases, -Wformat-nonliteral always hitting when -Wmissing-format-attribute does is an overlook that we should address.
I mentioned in another comment, there is an exception with strftime calls. It does not trigger -Wformat-nonliteral since it always has one argument, but it still makes sense to add the attribute to the caller. Note that GCC actually doesn't emit -Wmissing-format-attribute in that case because the callee is not variadic.
That’s a good point; we probably shouldn’t issue a fixit that when applied causes GCC to error... we might still be able to add an implicit format attribute as suggested by #166738 (comment) though. |
|
As far as I'm concerned, This does mean code that was previously clean with (Or, alternatively, change my mind!) |
erichkeane
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy when @apple-fcloutier is here, I don't intend to review this again unless separately pinged, consider my feedback to be theirs/vice-versa.
- Check now works or Objective-C - Add attribute arguments to diagnostic message - Rename CallerParamIdx to CallerFormatParamIdx - Fix typo
Let me make sure I understand. The idea is to emit |
Yes, exactly (in addition to making -Wmissing-format-attribute a subgroup of -Wformat-nonliteral so that users of -Wformat-nonliteral get it too, and writing a test to ensure only one fires). |
|
Thank you for addressing my comments; I think now we either agree on the changes to make or have made them. The last thing I forgot before is to mention -Wmissing-format-attributes in clang/docs/ReleaseNotes.rst, under "Improvements to Clang's diagnostics". |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the format_matches support (and reviewing again, I now see that ReleaseNotes.rst was updated very early on). The last bits that I'm still tracking:
- -Wmissing-format-attribute should be a subgroup of -Wformat-nonliteral (should have a test that with both enabled, you only get -Wmissing-format-attribute).
- We should not add the implicit attribute if diag::warn_missing_format_attribute is ignored, since it would create downstream diagnostics that cannot be disabled in any reasonable way. We should have a test that with it enabled, you do get diagnostics caused by the implicit attribute, and with it disabled, you do not.
I expect these are the last changes I'll request, everything else looks good.
Example for point 1:
// clang -fsyntax-only -Wformat-nonliteral -Wmissing-format-attribute test.c
#include <stdarg.h>
__attribute__((format(printf, 1, 0)))
int vprintf(const char *, va_list);
void foo(const char *fmt, ...) {
va_list ap;
va_start(ap, fmt);
vprintf(fmt, ap);
// warning: diagnostic behavior may be improved by adding the 'format(printf, 1, 2)' attribute to the declaration of 'foo' [-Wmissing-format-attribute]
// warning: format string is not a string literal [-Wformat-nonliteral]
}The expected result is that we have just the -Wmissing-format-attribute since they are both complaining about the same thing but -Wmissing-format-attribute knows how to fix it. Enabling -Wformat-nonliteral should enable -Wmissing-format-attribute as well since it is now the better diagnostic for that situation.
Example for point 2:
// clang -fsyntax-only -Wformat
#include <stdarg.h>
__attribute__((format(printf, 1, 0)))
int vprintf(const char *, va_list);
void foo(const char *fmt, ...) {
va_list ap;
va_start(ap, fmt);
vprintf(fmt, ap);
}
void bar(void) {
foo("%g\n", 123);
// warning: format specifies type 'double' but the argument has type 'int' [-Wformat]
}foo is not attributed in source and there is no clear explanation for what's going on; if there's ever a false positive, there is no way to prevent clang from adding the format attribute implicitly to foo. The expected result is that if -Wmissing-format-attribute is disabled, then we should not implicitly add the format attribute.
| SourceLocation Loc) { | ||
| std::string escapeFormatString(StringRef Input) { | ||
| std::string Result; | ||
| llvm::raw_string_ostream OS(Result); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but OS.write_escaped does something pretty similar to this loop.
| } while (false); | ||
| S->Diag(Caller->getLocation(), diag::note_entity_declared_at) << Caller; | ||
|
|
||
| if (APK != Sema::FormatArgumentPassingKind::FAPK_Elsewhere) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should only do this if diag::warn_missing_format_attribute is enabled because it can cause new diagnostics to cascade, which users would have no way to disable. You can check with !Diag.isIgnored(diag::warn_missing_format_attribute, Loc).
Implements the
-Wmissing-format-attributediagnostic. It suggests adding format attributes to function declarations that call other format functions and pass the format string and arguments to them.