-
Notifications
You must be signed in to change notification settings - Fork 15.2k
[Sema] Suggest missing format attributes #166738
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 3 commits
aebddd9
c62ec64
a0a4ca6
ba5e3d8
c1b001a
948c282
4fd960b
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -41,6 +41,7 @@ | |
| #include "clang/AST/TypeLoc.h" | ||
| #include "clang/AST/UnresolvedSet.h" | ||
| #include "clang/Basic/AddressSpaces.h" | ||
| #include "clang/Basic/CharInfo.h" | ||
| #include "clang/Basic/Diagnostic.h" | ||
| #include "clang/Basic/DiagnosticSema.h" | ||
| #include "clang/Basic/IdentifierTable.h" | ||
|
|
@@ -7012,15 +7013,28 @@ bool Sema::CheckFormatString(const FormatMatchesAttr *Format, | |
| return false; | ||
| } | ||
|
|
||
| static void CheckMissingFormatAttributes(Sema *S, FormatStringType FormatType, | ||
| unsigned FormatIdx, unsigned FirstArg, | ||
| ArrayRef<const Expr *> Args, | ||
| Sema::FormatArgumentPassingKind APK, | ||
| unsigned CallerParamIdx, | ||
| SourceLocation Loc) { | ||
| std::string escapeFormatString(StringRef Input) { | ||
| std::string Result; | ||
| llvm::raw_string_ostream OS(Result); | ||
| for (char C : Input) { | ||
| StringRef Escaped = escapeCStyle<EscapeChar::Double>(C); | ||
| if (Escaped.empty()) | ||
| OS << C; | ||
| else | ||
| OS << Escaped; | ||
| } | ||
| return Result; | ||
| } | ||
|
|
||
| static void CheckMissingFormatAttributes( | ||
| Sema *S, ArrayRef<const Expr *> Args, Sema::FormatArgumentPassingKind APK, | ||
| StringLiteral *ReferenceFormatString, unsigned FormatIdx, | ||
| unsigned FirstDataArg, FormatStringType FormatType, unsigned CallerParamIdx, | ||
| SourceLocation Loc) { | ||
| NamedDecl *Caller = S->getCurFunctionOrMethodDecl(); | ||
| if (!Caller) | ||
| return; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This will always fail in Objective-C because ObjCMethodDecl does not inherit from FunctionDecl. Would you add an Objective-C test to ensure this works? If you're not super familiar with it, the declaration/use syntax should be: Happy to answer other questions you may have.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this also always fails for blocks, but it matters less because
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not familiar with objective-c and there doesn't seem to be much documentation regarding attributes, but I think I have a working implementation. Just one question: is there an implicit 'this' argument on obj-c methods? I'll have to look into blocks to see how they fit in here.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Objective-C methods have two implicit arguments (
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Added support for Objective-C. Let me know if there are other cases I should test. Also, I wasn't sure which attribute syntax is supported, so I used the GNU extension.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Seems to cover it, thank you!
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tested where Objective-C tolerates attributes, and the fixit is incorrect currently: #include <stdarg.h>
__attribute__((format(printf, 1, 0)))
int vprintf(const char *, va_list ap);
@interface Foo
-(void)printf:(const char *)fmt, ...;
@end
@implementation Foo
-(void)printf:(const char *)fmt, ... {
va_list ap;
va_start(ap, fmt);
vprintf(fmt, ap);
va_end(ap);
}
@endThe fixit will suggest To make it as effective as possible, the fixit should also go to the canonical declaration, as returned by Here's a patch for your change which I think does the trick (but that I didn't test extensively): objc-placement.patch |
||
| Caller = dyn_cast<NamedDecl>(Caller->getCanonicalDecl()); | ||
|
|
||
| unsigned NumCallerParams = getFunctionOrMethodNumParams(Caller); | ||
|
|
||
|
|
@@ -7038,13 +7052,13 @@ static void CheckMissingFormatAttributes(Sema *S, FormatStringType FormatType, | |
| // function. Try to match caller and callee arguments. If successful, then | ||
| // emit a diag with the caller idx, otherwise we can't determine the callee | ||
| // arguments. | ||
| unsigned NumCalleeArgs = Args.size() - FirstArg; | ||
| unsigned NumCalleeArgs = Args.size() - FirstDataArg; | ||
| if (NumCalleeArgs == 0 || NumCallerParams < NumCalleeArgs) { | ||
| // There aren't enough arguments in the caller to pass to callee. | ||
| return; | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's possible in this case that one could use
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'd have to generate a format string for |
||
| } | ||
| for (unsigned CalleeIdx = Args.size() - 1, CallerIdx = NumCallerParams - 1; | ||
| CalleeIdx >= FirstArg; --CalleeIdx, --CallerIdx) { | ||
| CalleeIdx >= FirstDataArg; --CalleeIdx, --CallerIdx) { | ||
| const auto *Arg = | ||
| dyn_cast<DeclRefExpr>(Args[CalleeIdx]->IgnoreParenCasts()); | ||
| if (!Arg) | ||
|
|
@@ -7064,40 +7078,68 @@ static void CheckMissingFormatAttributes(Sema *S, FormatStringType FormatType, | |
| : 0; | ||
| break; | ||
| case Sema::FormatArgumentPassingKind::FAPK_Elsewhere: | ||
| // Args are not passed to the callee. | ||
| return; | ||
| // The callee has a format_matches attribute. We will emit that instead. | ||
| if (!ReferenceFormatString) | ||
| return; | ||
| break; | ||
| } | ||
|
|
||
| // Emit the diagnostic and fixit. | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. As a recovery action, should we also add an implicit format attribute to
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. That changes some behavior: void foo(const char *fmt, va_list args) {
vprintf(fmt, args);
vscanf(fmt, args);
}Previously, this would emit two diagnostics to suggest a printf and scanf format attribute. If we add an implicit attr, then when looking at I'll see if there are any other differences before updating the PR, but these changes make sense to me.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. These changes also make sense to me.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, one thing to mention is that we should only add the attribute if the diagnostic is enabled at the call site, otherwise people will start seeing inexplicable format warnings that cannot be silenced. This would probably be a net positive, but the respectful way to do this would be to enable the diagnostic by default. |
||
| unsigned FormatStringIndex = CallerParamIdx + CallerArgumentIndexOffset; | ||
| StringRef FormatTypeName = S->GetFormatStringTypeName(FormatType); | ||
| { | ||
| std::string Attr = | ||
| ("format(" + FormatTypeName + ", " + llvm::Twine(FormatStringIndex) + | ||
| ", " + llvm::Twine(FirstArgumentIndex) + ")") | ||
| .str(); | ||
| do { | ||
| std::string Attr, Fixit; | ||
| if (APK != Sema::FormatArgumentPassingKind::FAPK_Elsewhere) | ||
| llvm::raw_string_ostream(Attr) | ||
| << "format(" << FormatTypeName << ", " << FormatStringIndex << ", " | ||
| << FirstArgumentIndex << ")"; | ||
| else | ||
| llvm::raw_string_ostream(Attr) | ||
| << "format_matches(" << FormatTypeName << ", " << FormatStringIndex | ||
| << ", \"" << escapeFormatString(ReferenceFormatString->getString()) | ||
| << "\")"; | ||
| auto DB = S->Diag(Loc, diag::warn_missing_format_attribute) | ||
| << Attr << Caller; | ||
| const LangOptions &LO = S->getLangOpts(); | ||
| StringRef AttrPrefix, AttrSuffix; | ||
| if (LO.C23 || LO.CPlusPlus11) { | ||
| AttrPrefix = "[[gnu::"; | ||
| AttrSuffix = "]] "; | ||
| } else if (LO.ObjC || LO.GNUMode) { | ||
| AttrPrefix = "__attribute__(("; | ||
| AttrSuffix = ")) "; | ||
|
|
||
| SourceLocation SL; | ||
| llvm::raw_string_ostream IS(Fixit); | ||
| // The attribute goes at the start of the declaration in C/C++ functions | ||
| // and methods, but after the declaration for Objective-C methods. | ||
| if (isa<ObjCMethodDecl>(Caller)) { | ||
| IS << ' '; | ||
| SL = Caller->getEndLoc(); | ||
| } | ||
| if (!AttrPrefix.empty()) { | ||
| DB << FixItHint::CreateInsertion(Caller->getBeginLoc(), | ||
| (AttrPrefix + Attr + AttrSuffix).str()); | ||
| const LangOptions &LO = S->getLangOpts(); | ||
| if (LO.C23 || LO.CPlusPlus11) | ||
| IS << "[[gnu::" << Attr << "]]"; | ||
| else if (LO.ObjC || LO.GNUMode) | ||
| IS << "__attribute__((" << Attr << "))"; | ||
| else | ||
| break; | ||
| if (!isa<ObjCMethodDecl>(Caller)) { | ||
| IS << ' '; | ||
| SL = Caller->getBeginLoc(); | ||
| } | ||
| } | ||
| IS.flush(); | ||
|
|
||
| DB << FixItHint::CreateInsertion(SL, Fixit); | ||
| } while (false); | ||
| S->Diag(Caller->getLocation(), diag::note_entity_declared_at) << Caller; | ||
|
|
||
| if (APK != Sema::FormatArgumentPassingKind::FAPK_Elsewhere) { | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. We should only do this if |
||
| Caller->addAttr(FormatAttr::CreateImplicit( | ||
| S->getASTContext(), &S->getASTContext().Idents.get(FormatTypeName), | ||
| FormatStringIndex, FirstArgumentIndex)); | ||
| } else { | ||
| Caller->addAttr(FormatMatchesAttr::CreateImplicit( | ||
| S->getASTContext(), &S->getASTContext().Idents.get(FormatTypeName), | ||
| FormatStringIndex, ReferenceFormatString)); | ||
| } | ||
| } | ||
|
|
||
| bool Sema::CheckFormatArguments(ArrayRef<const Expr *> Args, | ||
| Sema::FormatArgumentPassingKind APK, | ||
| const StringLiteral *ReferenceFormatString, | ||
| StringLiteral *ReferenceFormatString, | ||
| unsigned format_idx, unsigned firstDataArg, | ||
| FormatStringType Type, | ||
| VariadicCallType CallType, SourceLocation Loc, | ||
|
|
@@ -7153,8 +7195,9 @@ bool Sema::CheckFormatArguments(ArrayRef<const Expr *> Args, | |
|
|
||
| const LangOptions &LO = getLangOpts(); | ||
| if (CallerParamIdx && (LO.GNUMode || LO.C23 || LO.CPlusPlus11)) | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. You should warn in all cases and use the language mode only to decide whether to offer a fixit.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The issue I see w/ that is that for language modes that don’t support
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the warning that suggests adding an attribute makes sense if there isn't a portable way to actually add the attribute.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There are portable ways to annotate your functions, it's just that you can't offer them as fixits for a variety of reasons (for instance, create a macro that adds the format attribute based on a But even then, I disagree that the only value (or even the main value) of any diagnostic is the fixit. Clang doesn't only diagnose situations that it knows how to fix. For a codebase that truly can't do anything about missing format attributes, the solution is to not take the intentional step of enabling the diagnostic.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, the user can figure out a way to write the attribute in a suitable way. I have changed the diagnostic message to include the attribute parameters as well like: so even if there is no fixit, the full suggested attribute is still shown. |
||
| CheckMissingFormatAttributes(this, Type, format_idx, firstDataArg, Args, | ||
| APK, *CallerParamIdx, Loc); | ||
| CheckMissingFormatAttributes(this, Args, APK, ReferenceFormatString, | ||
| format_idx, firstDataArg, Type, | ||
| *CallerParamIdx, Loc); | ||
|
|
||
| // Strftime is particular as it always uses a single 'time' argument, | ||
| // so it is safe to pass a non-literal string. | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is fine, but OS.write_escaped does something pretty similar to this loop.