[clang][sema] Add nonnull attribute to builtin format functions #160988

bozicrHT · 2025-09-27T07:50:49Z

Annotate printf/scanf and related builtins with the nonnull attribute on their format string parameters. This enables diagnostics when NULL is passed, matching GCC behavior. Updated existing Sema tests and added new ones for coverage. Closes issue #33923

llvmbot · 2025-09-27T07:51:24Z

@llvm/pr-subscribers-clang-static-analyzer-1

@llvm/pr-subscribers-clang

Author: Radovan Božić (bozicrHT)

Changes

Annotate printf/scanf and related builtins with the nonnull attribute on their format string parameters. This enables diagnostics when NULL is passed, matching GCC behavior. Updated existing Sema tests and added new one for coverage. Closes issue #33923

Full diff: https://github.com/llvm/llvm-project/pull/160988.diff

10 Files Affected:

(modified) clang/include/clang/Basic/Builtins.h (+4)
(modified) clang/include/clang/Basic/Builtins.td (+16-15)
(modified) clang/include/clang/Basic/BuiltinsBase.td (+1-1)
(modified) clang/lib/Basic/Builtins.cpp (+33-15)
(modified) clang/lib/Sema/SemaDecl.cpp (+9)
(added) clang/test/Sema/format-strings-nonnull.c (+74)
(added) clang/test/Sema/format-strings-nonnull.cpp (+40)
(modified) clang/test/Sema/format-strings.c (+2-4)
(modified) clang/test/SemaCXX/format-strings-0x.cpp (+2)
(modified) clang/test/SemaObjC/format-strings-objc.m (+2-1)

diff --git a/clang/include/clang/Basic/Builtins.h b/clang/include/clang/Basic/Builtins.h
index 3a5e31de2bc50..6d26a7b92a0fd 100644
--- a/clang/include/clang/Basic/Builtins.h
+++ b/clang/include/clang/Basic/Builtins.h
@@ -392,6 +392,10 @@ class Context {
   bool performsCallback(unsigned ID,
                         llvm::SmallVectorImpl<int> &Encoding) const;
 
+  /// Return true if this builtin has parameters that must be non-null.
+  /// The parameter indices are appended into 'Indxs'.
+  bool isNonNull(unsigned ID, llvm::SmallVectorImpl<int> &Indxs) const;
+
   /// Return true if this function has no side effects and doesn't
   /// read memory, except for possibly errno or raising FP exceptions.
   ///
diff --git a/clang/include/clang/Basic/Builtins.td b/clang/include/clang/Basic/Builtins.td
index 9bc70ea5e5858..5f9843f482276 100644
--- a/clang/include/clang/Basic/Builtins.td
+++ b/clang/include/clang/Basic/Builtins.td
@@ -3095,104 +3095,105 @@ def StrLen : LibBuiltin<"string.h"> {
 // FIXME: This list is incomplete.
 def Printf : LibBuiltin<"stdio.h"> {
   let Spellings = ["printf"];
-  let Attributes = [PrintfFormat<0>];
+  let Attributes = [PrintfFormat<0>, NonNull<[0]>];
   let Prototype = "int(char const*, ...)";
 }
 
 // FIXME: The builtin and library function should have the same signature.
 def BuiltinPrintf : Builtin {
   let Spellings = ["__builtin_printf"];
-  let Attributes = [NoThrow, PrintfFormat<0>, FunctionWithBuiltinPrefix];
+  let Attributes = [NoThrow, PrintfFormat<0>, FunctionWithBuiltinPrefix,
+                    NonNull<[0]>];
   let Prototype = "int(char const* restrict, ...)";
 }
 
 def FPrintf : LibBuiltin<"stdio.h"> {
   let Spellings = ["fprintf"];
-  let Attributes = [NoThrow, PrintfFormat<1>];
+  let Attributes = [NoThrow, PrintfFormat<1>, NonNull<[1]>];
   let Prototype = "int(FILE* restrict, char const* restrict, ...)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def SnPrintf : LibBuiltin<"stdio.h"> {
   let Spellings = ["snprintf"];
-  let Attributes = [NoThrow, PrintfFormat<2>];
+  let Attributes = [NoThrow, PrintfFormat<2>, NonNull<[2]>];
   let Prototype = "int(char* restrict, size_t, char const* restrict, ...)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def SPrintf : LibBuiltin<"stdio.h"> {
   let Spellings = ["sprintf"];
-  let Attributes = [NoThrow, PrintfFormat<1>];
+  let Attributes = [NoThrow, PrintfFormat<1>, NonNull<[1]>];
   let Prototype = "int(char* restrict, char const* restrict, ...)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def VPrintf : LibBuiltin<"stdio.h"> {
   let Spellings = ["vprintf"];
-  let Attributes = [NoThrow, VPrintfFormat<0>];
+  let Attributes = [NoThrow, VPrintfFormat<0>, NonNull<[0]>];
   let Prototype = "int(char const* restrict, __builtin_va_list)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def VfPrintf : LibBuiltin<"stdio.h"> {
   let Spellings = ["vfprintf"];
-  let Attributes = [NoThrow, VPrintfFormat<1>];
+  let Attributes = [NoThrow, VPrintfFormat<1>, NonNull<[1]>];
   let Prototype = "int(FILE* restrict, char const* restrict, __builtin_va_list)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def VsnPrintf : LibBuiltin<"stdio.h"> {
   let Spellings = ["vsnprintf"];
-  let Attributes = [NoThrow, VPrintfFormat<2>];
+  let Attributes = [NoThrow, VPrintfFormat<2>, NonNull<[2]>];
   let Prototype = "int(char* restrict, size_t, char const* restrict, __builtin_va_list)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def VsPrintf : LibBuiltin<"stdio.h"> {
   let Spellings = ["vsprintf"];
-  let Attributes = [NoThrow, VPrintfFormat<1>];
+  let Attributes = [NoThrow, VPrintfFormat<1>, NonNull<[1]>];
   let Prototype = "int(char* restrict, char const* restrict, __builtin_va_list)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def Scanf : LibBuiltin<"stdio.h"> {
   let Spellings = ["scanf"];
-  let Attributes = [ScanfFormat<0>];
+  let Attributes = [ScanfFormat<0>, NonNull<[0]>];
   let Prototype = "int(char const* restrict, ...)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def FScanf : LibBuiltin<"stdio.h"> {
   let Spellings = ["fscanf"];
-  let Attributes = [ScanfFormat<1>];
+  let Attributes = [ScanfFormat<1>, NonNull<[1]>];
   let Prototype = "int(FILE* restrict, char const* restrict, ...)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def SScanf : LibBuiltin<"stdio.h"> {
   let Spellings = ["sscanf"];
-  let Attributes = [ScanfFormat<1>];
+  let Attributes = [ScanfFormat<1>, NonNull<[1]>];
   let Prototype = "int(char const* restrict, char const* restrict, ...)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def VScanf : LibBuiltin<"stdio.h"> {
   let Spellings = ["vscanf"];
-  let Attributes = [VScanfFormat<0>];
+  let Attributes = [VScanfFormat<0>, NonNull<[0]>];
   let Prototype = "int(char const* restrict, __builtin_va_list)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def VFScanf : LibBuiltin<"stdio.h"> {
   let Spellings = ["vfscanf"];
-  let Attributes = [VScanfFormat<1>];
+  let Attributes = [VScanfFormat<1>, NonNull<[1]>];
   let Prototype = "int(FILE* restrict, char const* restrict, __builtin_va_list)";
   let AddBuiltinPrefixedAlias = 1;
 }
 
 def VSScanf : LibBuiltin<"stdio.h"> {
   let Spellings = ["vsscanf"];
-  let Attributes = [VScanfFormat<1>];
+  let Attributes = [VScanfFormat<1>, NonNull<[1]>];
   let Prototype = "int(char const* restrict, char const* restrict, __builtin_va_list)";
   let AddBuiltinPrefixedAlias = 1;
 }
diff --git a/clang/include/clang/Basic/BuiltinsBase.td b/clang/include/clang/Basic/BuiltinsBase.td
index 09bc9f89059fe..73918ab167b8d 100644
--- a/clang/include/clang/Basic/BuiltinsBase.td
+++ b/clang/include/clang/Basic/BuiltinsBase.td
@@ -32,7 +32,6 @@ def Const : Attribute<"c">;
 def NoThrow : Attribute<"n">;
 def Pure : Attribute<"U">;
 def ReturnsTwice : Attribute<"j">;
-//  FIXME: gcc has nonnull
 
 // builtin-specific attributes
 // ---------------------------
@@ -85,6 +84,7 @@ def Consteval : Attribute<"EG">;
 // Callback behavior: the first index argument is called with the arguments
 // indicated by the remaining indices.
 class Callback<list<int> ArgIndices> : MultiIndexAttribute<"C", ArgIndices>;
+class NonNull<list<int> ArgIndices> : MultiIndexAttribute<"N", ArgIndices>;
 
 // Prefixes
 // ========
diff --git a/clang/lib/Basic/Builtins.cpp b/clang/lib/Basic/Builtins.cpp
index acd98fe84adf5..a813726b7b848 100644
--- a/clang/lib/Basic/Builtins.cpp
+++ b/clang/lib/Basic/Builtins.cpp
@@ -293,30 +293,48 @@ bool Builtin::Context::isScanfLike(unsigned ID, unsigned &FormatIdx,
   return isLike(ID, FormatIdx, HasVAListArg, "sS");
 }
 
-bool Builtin::Context::performsCallback(unsigned ID,
-                                        SmallVectorImpl<int> &Encoding) const {
-  const char *CalleePos = ::strchr(getAttributesString(ID), 'C');
-  if (!CalleePos)
-    return false;
-
-  ++CalleePos;
-  assert(*CalleePos == '<' &&
-         "Callback callee specifier must be followed by a '<'");
-  ++CalleePos;
+static void parseCommaSeparatedIndices(const char *CurrPos,
+                                       llvm::SmallVectorImpl<int> &Indxs) {
+  assert(*CurrPos == '<' && "Expected '<' to start index list");
+  ++CurrPos;
 
   char *EndPos;
-  int CalleeIdx = ::strtol(CalleePos, &EndPos, 10);
-  assert(CalleeIdx >= 0 && "Callee index is supposed to be positive!");
-  Encoding.push_back(CalleeIdx);
+  int PosIdx = ::strtol(CurrPos, &EndPos, 10);
+  assert(PosIdx >= 0 && "Index is supposed to be positive!");
+  Indxs.push_back(PosIdx);
 
   while (*EndPos == ',') {
     const char *PayloadPos = EndPos + 1;
 
     int PayloadIdx = ::strtol(PayloadPos, &EndPos, 10);
-    Encoding.push_back(PayloadIdx);
+    Indxs.push_back(PayloadIdx);
   }
 
-  assert(*EndPos == '>' && "Callback callee specifier must end with a '>'");
+  assert(*EndPos == '>' && "Index list must end with '>'");
+}
+
+bool Builtin::Context::isNonNull(unsigned ID,
+                                 llvm::SmallVectorImpl<int> &Indxs) const {
+
+  const char *AttrPos = ::strchr(getAttributesString(ID), 'N');
+  if (!AttrPos)
+    return false;
+
+  ++AttrPos;
+  parseCommaSeparatedIndices(AttrPos, Indxs);
+
+  return true;
+}
+
+bool Builtin::Context::performsCallback(unsigned ID,
+                                        SmallVectorImpl<int> &Encoding) const {
+  const char *CalleePos = ::strchr(getAttributesString(ID), 'C');
+  if (!CalleePos)
+    return false;
+
+  ++CalleePos;
+  parseCommaSeparatedIndices(CalleePos, Encoding);
+
   return true;
 }
 
diff --git a/clang/lib/Sema/SemaDecl.cpp b/clang/lib/Sema/SemaDecl.cpp
index 9ef7a2698913d..d9df60af114fb 100644
--- a/clang/lib/Sema/SemaDecl.cpp
+++ b/clang/lib/Sema/SemaDecl.cpp
@@ -17141,6 +17141,15 @@ void Sema::AddKnownFunctionAttributes(FunctionDecl *FD) {
       }
     }
 
+    SmallVector<int, 4> Indxs;
+    if (Context.BuiltinInfo.isNonNull(BuiltinID, Indxs) &&
+        !FD->hasAttr<NonNullAttr>()) {
+      llvm::SmallVector<ParamIdx, 4> ParamIndxs;
+      for (int I : Indxs)
+        ParamIndxs.push_back(ParamIdx(I + 1, FD));
+      FD->addAttr(NonNullAttr::CreateImplicit(Context, ParamIndxs.data(),
+                                              ParamIndxs.size()));
+    }
     if (Context.BuiltinInfo.isReturnsTwice(BuiltinID) &&
         !FD->hasAttr<ReturnsTwiceAttr>())
       FD->addAttr(ReturnsTwiceAttr::CreateImplicit(Context,
diff --git a/clang/test/Sema/format-strings-nonnull.c b/clang/test/Sema/format-strings-nonnull.c
new file mode 100644
index 0000000000000..86ce24a27ca11
--- /dev/null
+++ b/clang/test/Sema/format-strings-nonnull.c
@@ -0,0 +1,74 @@
+// RUN: %clang_cc1 -fsyntax-only --std=c23 -verify -Wnonnull -Wno-format-security %s
+
+#define NULL  (void*)0
+
+typedef struct _FILE FILE;
+typedef __SIZE_TYPE__ size_t;
+typedef __builtin_va_list va_list;
+int printf(char const* restrict, ...);
+int __builtin_printf(char const* restrict, ...);
+int fprintf(FILE* restrict, char const* restrict, ...);
+int snprintf(char* restrict, size_t, char const* restrict, ...);
+int sprintf(char* restrict, char const* restrict, ...);
+int vprintf(char const* restrict, __builtin_va_list);
+int vfprintf(FILE* restrict, char const* restrict, __builtin_va_list);
+int vsnprintf(char* restrict, size_t, char const* restrict, __builtin_va_list);
+int vsprintf(char* restrict, char const* restrict, __builtin_va_list);
+
+int scanf(char const* restrict, ...);
+int fscanf(FILE* restrict, char const* restrict, ...);
+int sscanf(char const* restrict, char const* restrict, ...);
+int vscanf(char const* restrict, __builtin_va_list);
+int vfscanf(FILE* restrict, char const* restrict, __builtin_va_list);
+int vsscanf(char const* restrict, char const* restrict, __builtin_va_list);
+
+
+void check_format_string(FILE *fp, va_list ap) {
+    char buf[256];
+    char* const fmt = NULL;
+
+    printf(fmt);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    __builtin_printf(NULL, "xxd");
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    fprintf(fp, NULL, 25);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    sprintf(buf, NULL, 42);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    snprintf(buf, 10, 0, 42);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    vprintf(fmt, ap);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    vfprintf(fp, 0, ap);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    vsprintf(buf, nullptr, ap);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    vsnprintf(buf, 10, fmt, ap);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    scanf(NULL);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    fscanf(fp, nullptr);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    sscanf(buf, fmt);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    vscanf(NULL, ap);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    vfscanf(fp, fmt, ap);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+
+    vsscanf(buf, NULL, ap);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+}
\ No newline at end of file
diff --git a/clang/test/Sema/format-strings-nonnull.cpp b/clang/test/Sema/format-strings-nonnull.cpp
new file mode 100644
index 0000000000000..55a3ed1788c79
--- /dev/null
+++ b/clang/test/Sema/format-strings-nonnull.cpp
@@ -0,0 +1,40 @@
+// RUN: %clang_cc1 -fsyntax-only -verify -Wnonnull %s
+
+#ifdef __cplusplus
+# define EXTERN_C extern "C"
+#else
+# define EXTERN_C extern
+#endif
+
+typedef struct _FILE FILE;
+typedef __SIZE_TYPE__ size_t;
+typedef __builtin_va_list va_list;
+
+EXTERN_C int printf(const char *, ...);
+EXTERN_C int fprintf(FILE *, const char *restrict, ...);
+EXTERN_C int sprintf(char* restrict, char const* res, ...);
+
+EXTERN_C int scanf(char const *restrict, ...);
+EXTERN_C int fscanf(FILE* restrict, char const* res, ...);
+
+void test(FILE *fp) {
+  char buf[256];
+
+  __builtin_printf(__null, "x");
+  // expected-warning@-1 {{null passed to a callee that requires a non-null argument}}
+
+  printf(__null, "xxd");
+  // expected-warning@-1 {{null passed to a callee that requires a non-null argument}}
+
+  fprintf(fp, __null, 42);
+  // expected-warning@-1 {{null passed to a callee that requires a non-null argument}}
+
+  sprintf(buf, __null);
+  // expected-warning@-1 {{null passed to a callee that requires a non-null argument}}
+
+  scanf(__null);
+  // expected-warning@-1 {{null passed to a callee that requires a non-null argument}}
+
+  fscanf(fp, __null);
+  // expected-warning@-1 {{null passed to a callee that requires a non-null argument}}
+}
diff --git a/clang/test/Sema/format-strings.c b/clang/test/Sema/format-strings.c
index 103dd8ab5a85c..164db45fa2053 100644
--- a/clang/test/Sema/format-strings.c
+++ b/clang/test/Sema/format-strings.c
@@ -480,11 +480,9 @@ void pr7981(wint_t c, wchar_t c2) {
 #endif
 }
 
-// -Wformat-security says NULL is not a string literal
 void rdar8269537(void) {
-  // This is likely to crash in most cases, but -Wformat-nonliteral technically
-  // doesn't warn in this case.
-  printf(0); // no-warning
+  printf(0);
+  // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
 }
 
 // Handle functions with multiple format attributes.
diff --git a/clang/test/SemaCXX/format-strings-0x.cpp b/clang/test/SemaCXX/format-strings-0x.cpp
index 7d37f8276f29f..e0ca7a270c993 100644
--- a/clang/test/SemaCXX/format-strings-0x.cpp
+++ b/clang/test/SemaCXX/format-strings-0x.cpp
@@ -14,6 +14,7 @@ void f(char **sp, float *fp) {
   printf("%a", 1.0);
   scanf("%afoobar", fp);
   printf(nullptr);
+  // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
   printf(*sp); // expected-warning {{not a string literal}}
   // expected-note@-1{{treat the string as an argument to avoid this}}
 
@@ -32,4 +33,5 @@ void f(char **sp, float *fp) {
   printf("init list: %d", { 0 }); // expected-error {{cannot pass initializer list to variadic function; expected type from format string was 'int'}}
   printf("void: %d", f(sp, fp)); // expected-error {{cannot pass expression of type 'void' to variadic function; expected type from format string was 'int'}}
   printf(0, { 0 }); // expected-error {{cannot pass initializer list to variadic function}}
+  // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
 }
diff --git a/clang/test/SemaObjC/format-strings-objc.m b/clang/test/SemaObjC/format-strings-objc.m
index 40c1d31b1fd4c..babbb40394267 100644
--- a/clang/test/SemaObjC/format-strings-objc.m
+++ b/clang/test/SemaObjC/format-strings-objc.m
@@ -130,7 +130,7 @@ void rdar10743758(id x) {
   printf(s2); // expected-warning {{more '%' conversions than data arguments}}
 
   const char * const s3 = (const char *)0;
-  printf(s3); // no-warning (NULL is a valid format string)
+  printf(s3); // expected-warning {{null passed to a callee that requires a non-null argument}}
 
   NSString * const ns1 = @"constant string %s"; // expected-note {{format string is defined here}}
   NSLog(ns1); // expected-warning {{more '%' conversions than data arguments}}
@@ -259,6 +259,7 @@ void testByValueObjectInFormat(Foo *obj) {
   printf("%d %d %d", 1L, *obj, 1L); // expected-error {{cannot pass object with interface type 'Foo' by value to variadic function; expected type from format string was 'int'}} expected-warning 2 {{format specifies type 'int' but the argument has type 'long'}}
   printf("%!", *obj); // expected-error {{cannot pass object with interface type 'Foo' by value through variadic function}} expected-warning {{invalid conversion specifier}}
   printf(0, *obj); // expected-error {{cannot pass object with interface type 'Foo' by value through variadic function}}
+  // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
 
   [Bar log2:@"%d", *obj]; // expected-error {{cannot pass object with interface type 'Foo' by value to variadic method; expected type from format string was 'int'}}
 }
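
The diff above stores the non-null parameter indices in the builtin's attribute string as a tag letter followed by a `<i,j,...>` list, which `parseCommaSeparatedIndices` then decodes. As a rough illustration of that decoding outside of Clang, here is a minimal C sketch. The function name `parse_index_list` and the sample attribute string `"nN<0,2>"` are hypothetical and are not part of Clang; unlike the patch, the sketch reports malformed input through its return value rather than asserting on it.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not Clang's API: finds `tag` in an attribute string
   and copies the indices from the "<i,j,...>" list that follows it into
   `out`. Returns the number of indices found, or -1 if the tag is absent
   or the list is malformed. */
int parse_index_list(const char *attrs, char tag, int *out, int max_out) {
  assert(attrs && out && max_out > 0);

  const char *pos = strchr(attrs, tag);
  if (!pos || pos[1] != '<')
    return -1;
  pos += 2; /* skip the tag letter and the opening '<' */

  int count = 0;
  for (;;) {
    char *end;
    long idx = strtol(pos, &end, 10);
    if (end == pos || idx < 0 || count == max_out)
      return -1; /* no digits, negative index, or output buffer full */
    out[count++] = (int)idx;
    if (*end == '>')
      return count; /* closing bracket ends the list */
    if (*end != ',')
      return -1;
    pos = end + 1; /* continue after the comma */
  }
}
```

Called on the hypothetical string `"nN<0,2>"` with tag `'N'`, the sketch yields the two indices 0 and 2; a string without the tag yields -1, which corresponds to `isNonNull` returning false in the patch.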

bozicrHT · 2025-09-30T12:54:45Z

@Sirraide @erichkeane

erichkeane

This is very much in @AaronBallman's range here, particularly with the C stuff. I don't see anything questionable, so it is just a matter of whether he thinks this is a good idea. He's supposed to come back any day now (maybe next week?), so please ping him in another week+ if you don't hear from him.

philnik777 · 2025-09-30T13:17:22Z

For reference, this was #158626 before.

ojhunt · 2025-10-09T05:16:19Z

clang/test/Sema/format-strings-nonnull.c

+
+    vsscanf(buf, NULL, ap);
+    // expected-warning@-1{{null passed to a callee that requires a non-null argument}}
+}


smallest nit: newline

ojhunt · 2025-10-09T05:21:40Z

clang/lib/Sema/SemaDecl.cpp


+    SmallVector<int, 4> Indxs;
+    if (Context.BuiltinInfo.isNonNull(BuiltinID, Indxs) &&
+        !FD->hasAttr<NonNullAttr>()) {


Shouldn't the hasAttr<NonNullAttr>() check be per parameter?

This reads as if

__attribute__((nonnull)) void printf(const char*, ...)

Would not get the attribute annotation, and similarly

void printf(const char* __attribute__((nonnull)), ...)

would have the attribute added again?

Yes, ideally it could check at the parameter level, but this works equally well because nonnull can be applied at the function level (e.g. __attribute__((nonnull(idx1, idx2))) int printf(...);). Since I didn’t find a parameter-specific equivalent to AddKnownFunctionAttributes, I went with this approach. I could alternatively iterate over the ParmVarDecls and attach attributes individually if needed.
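
For readers less familiar with the two spellings discussed in this thread, the following C sketch shows them side by side. The function names are hypothetical, and the parameter-level form is guarded by a macro because it is assumed here to be a Clang extension. Note that the attribute's own indices are 1-based, which is why the patch converts the 0-based `Builtins.td` indices with `I + 1`.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Parameter-level spelling is guarded because it is assumed here to be a
   Clang extension; other compilers see an empty macro. */
#if defined(__clang__)
#define PARAM_NONNULL __attribute__((nonnull))
#else
#define PARAM_NONNULL
#endif

/* Function-level spelling: attribute indices are 1-based, so nonnull(1)
   marks the first parameter. */
__attribute__((nonnull(1)))
size_t fmt_length_fn(const char *fmt) { return strlen(fmt); }

/* Parameter-level spelling: the attribute is written on the parameter. */
size_t fmt_length_param(const char *fmt PARAM_NONNULL) { return strlen(fmt); }

/* Both spellings accept the same valid calls. */
void check_spellings(void) {
  assert(fmt_length_fn("%d\n") == fmt_length_param("%d\n"));
}
```

With `-Wnonnull` enabled, passing a literal null pointer to either function is expected to produce the "null passed to a callee that requires a non-null argument" warning exercised by the tests in this PR.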

AaronBallman

Thank you for working on this! I kind of wonder if we want a different approach though.

In C, any invalid argument is UB unless otherwise specified, and null is an invalid argument. 7.1.4p1:

Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow:
— If an argument to a function has an invalid value (such as a value outside the domain of
the function, or a pointer outside the address space of the program, or a null pointer, or a
pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after default argument promotion) not expected by a variadic function, the behavior is undefined.

So I almost wonder if a better approach is to go the opposite way and require marking arguments which can be null? (We don't have GCC's nonnull_if_nonzero attribute which would also be very helpful if we go this direction; there are a number of functions where null is fine only so long as some count variable is zero.)

AaronBallman · 2025-10-10T12:31:27Z

clang/include/clang/Basic/Builtins.td

 def FPrintf : LibBuiltin<"stdio.h"> {
  let Spellings = ["fprintf"];
-  let Attributes = [NoThrow, PrintfFormat<1>];
+  let Attributes = [NoThrow, PrintfFormat<1>, NonNull<[1]>];


Both the FILE * and the const char * must be nonnull.

AaronBallman · 2025-10-10T12:33:38Z

clang/include/clang/Basic/Builtins.td

 def SPrintf : LibBuiltin<"stdio.h"> {
  let Spellings = ["sprintf"];
-  let Attributes = [NoThrow, PrintfFormat<1>];
+  let Attributes = [NoThrow, PrintfFormat<1>, NonNull<[1]>];


Both the char * and the const char * must be nonnull.

AaronBallman · 2025-10-10T12:33:54Z

clang/include/clang/Basic/Builtins.td

 def VfPrintf : LibBuiltin<"stdio.h"> {
  let Spellings = ["vfprintf"];
-  let Attributes = [NoThrow, VPrintfFormat<1>];
+  let Attributes = [NoThrow, VPrintfFormat<1>, NonNull<[1]>];


FILE * as well.

AaronBallman · 2025-10-10T12:34:09Z

clang/include/clang/Basic/Builtins.td

 def VsPrintf : LibBuiltin<"stdio.h"> {
  let Spellings = ["vsprintf"];
-  let Attributes = [NoThrow, VPrintfFormat<1>];
+  let Attributes = [NoThrow, VPrintfFormat<1>, NonNull<[1]>];


char * as well.

AaronBallman · 2025-10-10T12:34:20Z

clang/include/clang/Basic/Builtins.td

 def FScanf : LibBuiltin<"stdio.h"> {
  let Spellings = ["fscanf"];
-  let Attributes = [ScanfFormat<1>];
+  let Attributes = [ScanfFormat<1>, NonNull<[1]>];


FILE * as well.

I'll stop commenting on this; you should take a pass through the rest and fix up accordingly.

bozicrHT · 2025-10-14T10:00:00Z

Thank you for working on this! I kind of wonder if we want a different approach though.

In C, any invalid argument is UB unless otherwise specified, and null is an invalid argument. 7.1.4p1:

Each of the following statements applies unless explicitly stated otherwise in the detailed descriptions that follow: — If an argument to a function has an invalid value (such as a value outside the domain of the function, or a pointer outside the address space of the program, or a null pointer, or a pointer to non-modifiable storage when the corresponding parameter is not const-qualified) or a type (after default argument promotion) not expected by a variadic function, the behavior is undefined.

So I almost wonder if a better approach is to go the opposite way and require marking arguments which can be null? (We don't have GCC's nonnull_if_nonzero attribute which would also be very helpful if we go this direction; there are a number of functions where null is fine only so long as some count variable is zero.)

Thank you for the thoughtful feedback! I was wondering - are you suggesting introducing a new attribute (e.g. nullable) to explicitly mark arguments that can accept null? Wouldn’t that approach potentially break existing code and tests by triggering new warnings, since every unannotated pointer would be treated as non-null by default?

It also seems like a larger-scope change and would diverge from GCC’s current behavior, which only checks for null arguments when marked with nonnull.

Extend nonnull coverage to include not only format string parameters, but also FILE* and char* arguments (e.g. for sscanf, fprintf, etc.).

AaronBallman · 2025-10-14T16:18:20Z

Thank you for the thoughtful feedback! I was wondering - are you suggesting introducing a new attribute (e.g. nullable) to explicitly mark arguments that can accept null? Wouldn’t that approach potentially break existing code and tests by triggering new warnings, since every unannotated pointer would be treated as non-null by default?

No, but yes. :-D

I think most pointers in standard library functions cannot be null. So it might be easier to maintain for our tablegen to assume all pointers are nonnull unless we add an annotation to say which parameters can be null. Then when generating the headers from tablegen, we'd add __attribute__((nonnull)) to the parameters which must be nonnull.

But the attribute we should add, but not as part of this PR, is one from GCC: nonnull_if_nonzero. We need that for a number of the APIs where you can pass a null pointer but only when the size argument is zero.

AaronBallman · 2025-10-14T16:23:00Z

clang/include/clang/Basic/Builtins.td

 def SnPrintf : LibBuiltin<"stdio.h"> {
  let Spellings = ["snprintf"];
-  let Attributes = [NoThrow, PrintfFormat<2>];
+  let Attributes = [NoThrow, PrintfFormat<2>, NonNull<[0, 2]>];


I think this one is wrong (as are a few others); the first argument can be null if the second argument is 0. This is a case where we'd want nonnull_if_nonzero.

You should verify these annotations against the latest C standard draft: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3685.pdf

According to the C standard, the first argument to snprintf and vsnprintf may be null if the size argument is zero.
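
The reason this exception matters in practice is the common size-query idiom, where `snprintf` is first called with a null buffer and a size of zero to learn how many characters are needed. A minimal C sketch of that idiom follows; the helper name `format_int` is hypothetical.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a malloc'ed decimal string for `value`,
   or NULL on failure. The caller frees the result. */
char *format_int(int value) {
  /* Size query: with a size of zero nothing is written, so the buffer
     argument may be a null pointer. */
  int needed = snprintf(NULL, 0, "%d", value);
  if (needed < 0)
    return NULL;

  char *buf = malloc((size_t)needed + 1);
  if (!buf)
    return NULL;

  /* Second call writes the text plus the terminating null character. */
  int written = snprintf(buf, (size_t)needed + 1, "%d", value);
  assert(written == needed);
  return buf;
}
```

An unconditional `nonnull` on the first parameter would warn on the first call in this sketch even though it is valid C, which is the case a `nonnull_if_nonzero`-style annotation would be meant to cover.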

bozicrHT · 2025-10-15T13:39:30Z

I think most pointers in standard library functions cannot be null. So it might be easier to maintain for our tablegen to assume all pointers are nonnull unless we add an annotation to say which parameters can be null. Then when generating the headers from tablegen, we'd add __attribute__((nonnull)) to the parameters which must be nonnull.

But the attribute we should add, but not as part of this PR, is one from GCC: nonnull_if_nonzero. We need that for a number of the APIs where you can pass a null pointer but only when the size argument is zero.

Ah, I see your point now. That makes sense. The idea is to treat pointer parameters as nonnull by default, and only allow null explicitly in cases like snprintf/vsnprintf, where the buffer can be null if the size argument is 0.

llvmbot added the clang and clang:frontend labels Sep 27, 2025

erichkeane reviewed Sep 30, 2025

ColinKinloch mentioned this pull request Oct 2, 2025

[clang][Sema] Add fortify warnings for unistd.h #161737

Open

ojhunt reviewed Oct 9, 2025

Sirraide requested a review from AaronBallman October 9, 2025 15:18

AaronBallman reviewed Oct 10, 2025

bozicrHT added 2 commits October 14, 2025 12:07

Add newline at the end of file

c7a6ed9

Add additional nonnull attributes for printf/scanf family functions

775fcf1

Extend nonnull coverage to include not only format string parameters, but also FILE* and char* arguments (e.g. for sscanf, fprintf, etc.).

llvmbot added the clang:static analyzer label Oct 14, 2025

AaronBallman reviewed Oct 14, 2025

Remove unnecessary nonnull attribute for snprintf and vsnprintf

265c9a6

According to the C standard, the first argument to snprintf and vsnprintf may be null if the size argument is zero.
