Skip to content

Conversation

@antoniofrighetto
Copy link
Contributor

@antoniofrighetto antoniofrighetto commented Jan 28, 2025

Floating-point libcalls are currently conservatively marked as may write any memory. Restrict these to clobber only errno.

@llvmbot
Copy link
Member

llvmbot commented Jan 28, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-mlir-llvm
@llvm/pr-subscribers-llvm-ir
@llvm/pr-subscribers-llvm-support
@llvm/pr-subscribers-llvm-transforms

@llvm/pr-subscribers-mlir

Author: Antonio Frighetto (antoniofrighetto)

Changes

Mark C standard library functions that set errno as such, as included in the POSIX specification. Likewise, while deducing function attributes, determine whether a memory access affects errno memory (this possibly should leverage Loc.TBAAErrno in upcoming PR).

Rebased on: #120783.


Patch is 109.86 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/124742.diff

29 Files Affected:

  • (modified) clang/test/CodeGen/sanitize-metadata-nosanitize.c (+9-9)
  • (modified) clang/test/CodeGenOpenCL/convergent.cl (+1-1)
  • (modified) llvm/include/llvm/AsmParser/LLToken.h (+4)
  • (modified) llvm/include/llvm/Bitcode/LLVMBitCodes.h (+3)
  • (modified) llvm/include/llvm/IR/Function.h (+15)
  • (modified) llvm/include/llvm/IR/InstrTypes.h (+15)
  • (modified) llvm/include/llvm/Support/ModRef.h (+56-1)
  • (modified) llvm/lib/AsmParser/LLLexer.cpp (+4)
  • (modified) llvm/lib/AsmParser/LLParser.cpp (+12-1)
  • (modified) llvm/lib/Bitcode/Reader/BitcodeReader.cpp (+28)
  • (modified) llvm/lib/IR/Attributes.cpp (+3)
  • (modified) llvm/lib/IR/Function.cpp (+29)
  • (modified) llvm/lib/IR/Instructions.cpp (+29)
  • (modified) llvm/lib/Support/ModRef.cpp (+3)
  • (modified) llvm/lib/Transforms/IPO/FunctionAttrs.cpp (+12)
  • (modified) llvm/lib/Transforms/Utils/BuildLibCalls.cpp (+82-16)
  • (modified) llvm/test/Assembler/memory-attribute-errors.ll (+3-3)
  • (modified) llvm/test/Assembler/memory-attribute.ll (+12)
  • (modified) llvm/test/Transforms/FunctionAttrs/argmemonly.ll (+11-11)
  • (modified) llvm/test/Transforms/FunctionAttrs/nocapture.ll (+15-15)
  • (modified) llvm/test/Transforms/FunctionAttrs/read-write-scc.ll (+2-2)
  • (modified) llvm/test/Transforms/FunctionAttrs/readattrs.ll (+1-1)
  • (modified) llvm/test/Transforms/FunctionAttrs/writeonly.ll (+2-2)
  • (modified) llvm/test/Transforms/InferFunctionAttrs/annotate.ll (+96-93)
  • (modified) llvm/test/Transforms/InferFunctionAttrs/norecurse_debug.ll (+1-1)
  • (modified) llvm/test/Transforms/LowerTypeTests/cfi-nounwind-direct-call.ll (+1-1)
  • (modified) llvm/test/Transforms/SCCP/ipscp-drop-argmemonly.ll (+6-6)
  • (modified) llvm/unittests/Support/ModRefTest.cpp (+2-2)
  • (modified) mlir/test/Target/LLVMIR/llvmir.mlir (+5-5)
diff --git a/clang/test/CodeGen/sanitize-metadata-nosanitize.c b/clang/test/CodeGen/sanitize-metadata-nosanitize.c
index fd2fdce31b52fb..77dc7d9d492ecc 100644
--- a/clang/test/CodeGen/sanitize-metadata-nosanitize.c
+++ b/clang/test/CodeGen/sanitize-metadata-nosanitize.c
@@ -10,7 +10,7 @@
 // CHECK: @llvm.global_ctors = appending global [2 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 2, ptr @__sanitizer_metadata_covered2.module_ctor, ptr @__sanitizer_metadata_covered2.module_ctor }, { i32, ptr, ptr } { i32 2, ptr @__sanitizer_metadata_atomics2.module_ctor, ptr @__sanitizer_metadata_atomics2.module_ctor }]
 // CHECK: @llvm.global_dtors = appending global [2 x { i32, ptr, ptr }] [{ i32, ptr, ptr } { i32 2, ptr @__sanitizer_metadata_covered2.module_dtor, ptr @__sanitizer_metadata_covered2.module_dtor }, { i32, ptr, ptr } { i32 2, ptr @__sanitizer_metadata_atomics2.module_dtor, ptr @__sanitizer_metadata_atomics2.module_dtor }]
 //.
-// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none)
+// CHECK: Function Attrs: mustprogress nofree noinline norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none, errnomem: none)
 // CHECK-LABEL: define dso_local void @escape
 // CHECK-SAME: (ptr noundef [[P:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] !pcsections [[META2:![0-9]+]] {
 // CHECK-NEXT:  entry:
@@ -21,7 +21,7 @@ __attribute__((noinline, not_tail_called)) void escape(const volatile void *p) {
   sink = p;
 }
 
-// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none)
+// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none, errnomem: none)
 // CHECK-LABEL: define dso_local i32 @normal_function
 // CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] !pcsections [[META4:![0-9]+]] {
 // CHECK-NEXT:  entry:
@@ -38,7 +38,7 @@ int normal_function(int *x, int *y) {
   return *y;
 }
 
-// CHECK: Function Attrs: disable_sanitizer_instrumentation mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none)
+// CHECK: Function Attrs: disable_sanitizer_instrumentation mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none, errnomem: none)
 // CHECK-LABEL: define dso_local i32 @test_disable_sanitize_instrumentation
 // CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR2:[0-9]+]] {
 // CHECK-NEXT:  entry:
@@ -55,7 +55,7 @@ __attribute__((disable_sanitizer_instrumentation)) int test_disable_sanitize_ins
   return *y;
 }
 
-// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none)
+// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none, errnomem: none)
 // CHECK-LABEL: define dso_local i32 @test_no_sanitize_thread
 // CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR3:[0-9]+]] !pcsections [[META14:![0-9]+]] {
 // CHECK-NEXT:  entry:
@@ -72,7 +72,7 @@ __attribute__((no_sanitize("thread"))) int test_no_sanitize_thread(int *x, int *
   return *y;
 }
 
-// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none)
+// CHECK: Function Attrs: mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none, errnomem: none)
 // CHECK-LABEL: define dso_local i32 @test_no_sanitize_all
 // CHECK-SAME: (ptr noundef [[X:%.*]], ptr nocapture noundef readonly [[Y:%.*]]) local_unnamed_addr #[[ATTR3]] !pcsections [[META14]] {
 // CHECK-NEXT:  entry:
@@ -89,10 +89,10 @@ __attribute__((no_sanitize("all"))) int test_no_sanitize_all(int *x, int *y) {
   return *y;
 }
 //.
-// CHECK: attributes #[[ATTR0]] = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
-// CHECK: attributes #[[ATTR1]] = { mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
-// CHECK: attributes #[[ATTR2]] = { disable_sanitizer_instrumentation mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
-// CHECK: attributes #[[ATTR3]] = { mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "no_sanitize_thread" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
+// CHECK: attributes #[[ATTR0]] = { mustprogress nofree noinline norecurse nosync nounwind willreturn memory(write, argmem: none, inaccessiblemem: none, errnomem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
+// CHECK: attributes #[[ATTR1]] = { mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none, errnomem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
+// CHECK: attributes #[[ATTR2]] = { disable_sanitizer_instrumentation mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none, errnomem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
+// CHECK: attributes #[[ATTR3]] = { mustprogress nofree norecurse nounwind willreturn memory(write, argmem: readwrite, inaccessiblemem: none, errnomem: none) "min-legal-vector-width"="0" "no-trapping-math"="true" "no_sanitize_thread" "stack-protector-buffer-size"="8" "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
 // CHECK: attributes #[[ATTR4:[0-9]+]] = { nounwind "target-features"="+cx8,+mmx,+sse,+sse2,+x87" }
 //.
 // CHECK: [[META0:![0-9]+]] = !{i32 1, !"wchar_size", i32 4}
diff --git a/clang/test/CodeGenOpenCL/convergent.cl b/clang/test/CodeGenOpenCL/convergent.cl
index 123adba7b40d2c..d64915205aabf6 100644
--- a/clang/test/CodeGenOpenCL/convergent.cl
+++ b/clang/test/CodeGenOpenCL/convergent.cl
@@ -133,7 +133,7 @@ kernel void assume_convergent_asm()
   __asm__ volatile("s_barrier");
 }
 
-// CHECK: attributes #0 = { nofree noinline norecurse nounwind "
+// CHECK: attributes #0 = { nofree noinline norecurse nounwind memory(readwrite, errnomem: none) "
 // CHECK: attributes #1 = { {{[^}]*}}convergent{{[^}]*}} }
 // CHECK: attributes #2 = { {{[^}]*}}convergent{{[^}]*}} }
 // CHECK: attributes #3 = { {{[^}]*}}convergent noduplicate{{[^}]*}} }
diff --git a/llvm/include/llvm/AsmParser/LLToken.h b/llvm/include/llvm/AsmParser/LLToken.h
index 7b47bc88ddb25f..e36ebcb0f13b08 100644
--- a/llvm/include/llvm/AsmParser/LLToken.h
+++ b/llvm/include/llvm/AsmParser/LLToken.h
@@ -201,11 +201,15 @@ enum Kind {
   kw_readwrite,
   kw_argmem,
   kw_inaccessiblemem,
+  kw_errnomem,
 
   // Legacy memory attributes:
   kw_argmemonly,
   kw_inaccessiblememonly,
   kw_inaccessiblemem_or_argmemonly,
+  kw_inaccessiblemem_or_errnomemonly,
+  kw_inaccessiblemem_or_argmem_or_errnomemonly,
+  kw_errnomemonly,
 
   // Captures attribute:
   kw_address,
diff --git a/llvm/include/llvm/Bitcode/LLVMBitCodes.h b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
index 9eb38c3e448291..4d4911932fb181 100644
--- a/llvm/include/llvm/Bitcode/LLVMBitCodes.h
+++ b/llvm/include/llvm/Bitcode/LLVMBitCodes.h
@@ -789,6 +789,9 @@ enum AttributeKindCodes {
   ATTR_KIND_NO_DIVERGENCE_SOURCE = 100,
   ATTR_KIND_SANITIZE_TYPE = 101,
   ATTR_KIND_CAPTURES = 102,
+  ATTR_KIND_ERRNOMEMONLY = 103,
+  ATTR_KIND_INACCESSIBLEMEM_OR_ERRNOMEMONLY = 104,
+  ATTR_KIND_INACCESSIBLEMEM_OR_ARGMEM_OR_ERRNOMEMONLY = 105,
 };
 
 enum ComdatSelectionKindCodes {
diff --git a/llvm/include/llvm/IR/Function.h b/llvm/include/llvm/IR/Function.h
index e7afcbd31420c1..847b8367743234 100644
--- a/llvm/include/llvm/IR/Function.h
+++ b/llvm/include/llvm/IR/Function.h
@@ -575,11 +575,26 @@ class LLVM_ABI Function : public GlobalObject, public ilist_node<Function> {
   bool onlyAccessesInaccessibleMemory() const;
   void setOnlyAccessesInaccessibleMemory();
 
+  /// Determine if the function may only access errno memory.
+  bool onlyAccessesErrnoMemory() const;
+  void setOnlyAccessesErrnoMemory();
+
   /// Determine if the function may only access memory that is
   ///  either inaccessible from the IR or pointed to by its arguments.
   bool onlyAccessesInaccessibleMemOrArgMem() const;
   void setOnlyAccessesInaccessibleMemOrArgMem();
 
+  /// Determine if the function may only access memory that is
+  ///  either inaccessible from the IR or errno memory.
+  bool onlyAccessesInaccessibleMemOrErrnoMem() const;
+  void setOnlyAccessesInaccessibleMemOrErrnoMem();
+
+  /// Determine if the function may only access memory that is
+  ///  either inaccessible from the IR, pointed to by its arguments, or errno
+  ///  memory.
+  bool onlyAccessesInaccessibleMemOrArgMemOrErrnoMem() const;
+  void setOnlyAccessesInaccessibleMemOrArgMemOrErrnoMem();
+
   /// Determine if the function cannot return.
   bool doesNotReturn() const {
     return hasFnAttribute(Attribute::NoReturn);
diff --git a/llvm/include/llvm/IR/InstrTypes.h b/llvm/include/llvm/IR/InstrTypes.h
index 47ddc7555594c5..5ef72e09292f3a 100644
--- a/llvm/include/llvm/IR/InstrTypes.h
+++ b/llvm/include/llvm/IR/InstrTypes.h
@@ -1909,11 +1909,26 @@ class CallBase : public Instruction {
   bool onlyAccessesInaccessibleMemory() const;
   void setOnlyAccessesInaccessibleMemory();
 
+  /// Determine if the function may only access errno memory.
+  bool onlyAccessesErrnoMemory() const;
+  void setOnlyAccessesErrnoMemory();
+
   /// Determine if the function may only access memory that is
   /// either inaccessible from the IR or pointed to by its arguments.
   bool onlyAccessesInaccessibleMemOrArgMem() const;
   void setOnlyAccessesInaccessibleMemOrArgMem();
 
+  /// Determine if the function may only access memory that is
+  ///  either inaccessible from the IR or errno memory.
+  bool onlyAccessesInaccessibleMemOrErrnoMem() const;
+  void setOnlyAccessesInaccessibleMemOrErrnoMem();
+
+  /// Determine if the function may only access memory that is
+  ///  either inaccessible from the IR, pointed to by its arguments, or errno
+  ///  memory.
+  bool onlyAccessesInaccessibleMemOrArgMemOrErrnoMem() const;
+  void setOnlyAccessesInaccessibleMemOrArgMemOrErrnoMem();
+
   /// Determine if the call cannot return.
   bool doesNotReturn() const { return hasFnAttr(Attribute::NoReturn); }
   void setDoesNotReturn() { addFnAttr(Attribute::NoReturn); }
diff --git a/llvm/include/llvm/Support/ModRef.h b/llvm/include/llvm/Support/ModRef.h
index 9ecdab71ec8ca6..aa4ce8295c77a7 100644
--- a/llvm/include/llvm/Support/ModRef.h
+++ b/llvm/include/llvm/Support/ModRef.h
@@ -61,8 +61,10 @@ enum class IRMemLocation {
   ArgMem = 0,
   /// Memory that is inaccessible via LLVM IR.
   InaccessibleMem = 1,
+  /// Errno memory.
+  ErrnoMem = 2,
   /// Any other memory.
-  Other = 2,
+  Other = 3,
 
   /// Helpers to iterate all locations in the MemoryEffectsBase class.
   First = ArgMem,
@@ -139,6 +141,11 @@ template <typename LocationEnum> class MemoryEffectsBase {
     return MemoryEffectsBase(Location::InaccessibleMem, MR);
   }
 
+  /// Create MemoryEffectsBase that can only access errno memory.
+  static MemoryEffectsBase errnoMemOnly(ModRefInfo MR = ModRefInfo::ModRef) {
+    return MemoryEffectsBase(Location::ErrnoMem, MR);
+  }
+
   /// Create MemoryEffectsBase that can only access inaccessible or argument
   /// memory.
   static MemoryEffectsBase
@@ -149,6 +156,27 @@ template <typename LocationEnum> class MemoryEffectsBase {
     return FRMB;
   }
 
+  /// Create MemoryEffectsBase that can only access inaccessible or errno
+  /// memory.
+  static MemoryEffectsBase
+  inaccessibleOrErrnoMemOnly(ModRefInfo MR = ModRefInfo::ModRef) {
+    MemoryEffectsBase FRMB = none();
+    FRMB.setModRef(Location::ErrnoMem, MR);
+    FRMB.setModRef(Location::InaccessibleMem, MR);
+    return FRMB;
+  }
+
+  /// Create MemoryEffectsBase that can only access inaccessible, argument or
+  /// errno memory.
+  static MemoryEffectsBase
+  inaccessibleOrArgOrErrnoMemOnly(ModRefInfo MR = ModRefInfo::ModRef) {
+    MemoryEffectsBase FRMB = none();
+    FRMB.setModRef(Location::ArgMem, MR);
+    FRMB.setModRef(Location::ErrnoMem, MR);
+    FRMB.setModRef(Location::InaccessibleMem, MR);
+    return FRMB;
+  }
+
   /// Create MemoryEffectsBase from an encoded integer value (used by memory
   /// attribute).
   static MemoryEffectsBase createFromIntValue(uint32_t Data) {
@@ -207,11 +235,21 @@ template <typename LocationEnum> class MemoryEffectsBase {
     return isModOrRefSet(getModRef(Location::ArgMem));
   }
 
+  /// Whether this function may access errno memory.
+  bool doesAccessErrnoMem() const {
+    return isModOrRefSet(getModRef(Location::ErrnoMem));
+  }
+
   /// Whether this function only (at most) accesses inaccessible memory.
   bool onlyAccessesInaccessibleMem() const {
     return getWithoutLoc(Location::InaccessibleMem).doesNotAccessMemory();
   }
 
+  /// Whether this function only (at most) accesses errno memory.
+  bool onlyAccessesErrnoMem() const {
+    return getWithoutLoc(Location::ErrnoMem).doesNotAccessMemory();
+  }
+
   /// Whether this function only (at most) accesses argument and inaccessible
   /// memory.
   bool onlyAccessesInaccessibleOrArgMem() const {
@@ -220,6 +258,23 @@ template <typename LocationEnum> class MemoryEffectsBase {
         .doesNotAccessMemory();
   }
 
+  /// Whether this function only (at most) accesses inaccessible and errno
+  /// memory.
+  bool onlyAccessesInaccessibleOrErrnoMem() const {
+    return getWithoutLoc(Location::InaccessibleMem)
+        .getWithoutLoc(Location::ErrnoMem)
+        .doesNotAccessMemory();
+  }
+
+  /// Whether this function only (at most) accesses inaccessible, argument and
+  /// errno memory.
+  bool onlyAccessesInaccessibleOrArgOrErrnoMem() const {
+    return getWithoutLoc(Location::InaccessibleMem)
+        .getWithoutLoc(Location::ArgMem)
+        .getWithoutLoc(Location::ErrnoMem)
+        .doesNotAccessMemory();
+  }
+
   /// Intersect with other MemoryEffectsBase.
   MemoryEffectsBase operator&(MemoryEffectsBase Other) const {
     return MemoryEffectsBase(Data & Other.Data);
diff --git a/llvm/lib/AsmParser/LLLexer.cpp b/llvm/lib/AsmParser/LLLexer.cpp
index 5ea507c009bdc6..7fcdd5d6f0df9c 100644
--- a/llvm/lib/AsmParser/LLLexer.cpp
+++ b/llvm/lib/AsmParser/LLLexer.cpp
@@ -701,9 +701,13 @@ lltok::Kind LLLexer::LexIdentifier() {
   KEYWORD(readwrite);
   KEYWORD(argmem);
   KEYWORD(inaccessiblemem);
+  KEYWORD(errnomem);
   KEYWORD(argmemonly);
   KEYWORD(inaccessiblememonly);
+  KEYWORD(errnomemonly);
   KEYWORD(inaccessiblemem_or_argmemonly);
+  KEYWORD(inaccessiblemem_or_errnomemonly);
+  KEYWORD(inaccessiblemem_or_argmem_or_errnomemonly);
   KEYWORD(address_is_null);
   KEYWORD(address);
   KEYWORD(provenance);
diff --git a/llvm/lib/AsmParser/LLParser.cpp b/llvm/lib/AsmParser/LLParser.cpp
index fa0079bac435c1..55b352a44d8838 100644
--- a/llvm/lib/AsmParser/LLParser.cpp
+++ b/llvm/lib/AsmParser/LLParser.cpp
@@ -1670,9 +1670,18 @@ static bool upgradeMemoryAttr(MemoryEffects &ME, lltok::Kind Kind) {
   case lltok::kw_inaccessiblememonly:
     ME &= MemoryEffects::inaccessibleMemOnly();
     return true;
+  case lltok::kw_errnomemonly:
+    ME &= MemoryEffects::errnoMemOnly();
+    return true;
   case lltok::kw_inaccessiblemem_or_argmemonly:
     ME &= MemoryEffects::inaccessibleOrArgMemOnly();
     return true;
+  case lltok::kw_inaccessiblemem_or_errnomemonly:
+    ME &= MemoryEffects::inaccessibleOrErrnoMemOnly();
+    return true;
+  case lltok::kw_inaccessiblemem_or_argmem_or_errnomemonly:
+    ME &= MemoryEffects::inaccessibleOrArgOrErrnoMemOnly();
+    return true;
   default:
     return false;
   }
@@ -2490,6 +2499,8 @@ static std::optional<MemoryEffects::Location> keywordToLoc(lltok::Kind Tok) {
     return IRMemLocation::ArgMem;
   case lltok::kw_inaccessiblemem:
     return IRMemLocation::InaccessibleMem;
+  case lltok::kw_errnomem:
+    return IRMemLocation::ErrnoMem;
   default:
     return std::nullopt;
   }
@@ -2538,7 +2549,7 @@ std::optional<MemoryEffects> LLParser::parseMemoryAttr() {
     std::optional<ModRefInfo> MR = keywordToModRef(Lex.getKind());
     if (!MR) {
       if (!Loc)
-        tokError("expected memory location (argmem, inaccessiblemem) "
+        tokError("expected memory location (argmem, inaccessiblemem, errnomem) "
                  "or access kind (none, read, write, readwrite)");
       else
         tokError("expected access kind (none, read, write, readwrite)");
diff --git a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
index 551dfd4af88bb2..8165002e9d7204 100644
--- a/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
+++ b/llvm/lib/Bitcode/Reader/BitcodeReader.cpp
@@ -1910,6 +1910,10 @@ static uint64_t getRawAttributeMask(Attribute::AttrKind Val) {
     return 1ULL << 62;
   case Attribute::NoFree:
     return 1ULL << 63;
+  // 7ULL << 36 is ErrnoMemOnly, which is upgraded separately.
+  // 7ULL << 37 is InaccessibleMemOrErrnoMemOnly, which is upgraded separately.
+  // 7ULL << 38 is InaccessibleMemOrArgOrErrnoMemOnly, which is upgraded
+  // separately.
   default:
     // Other attributes are not supported in the raw format,
     // as we ran out of space.
@@ -1982,6 +1986,21 @@ static void decodeLLVMAttributesForBitcode(AttrBuilder &B,
       Attrs &= ~(1ULL << 53);
       ME &= MemoryEffects::writeOnly();
     }
+    if (Attrs & (7ULL << 36)) {
+      // ErrnoMemOnly
+      Attrs &= ~(7ULL << 36);
+      ME &= MemoryEffects::errnoMemOnly();
+    }
+    if (Attrs & (7ULL << 37)) {
+      // InaccessibleMemOrErrnoMemOnly
+      Attrs &= ~(7ULL << 37);
+      ME &= MemoryEffects::inaccessibleOrErrnoMemOnly();
+    }
+    if (Attrs & (7ULL << 38)) {
+      // InaccessibleMemOrArgOrErrnoMemOnly
+      Attrs &= ~(7ULL << 38);
+      ME &= MemoryEffects::inaccessibleOrArgOrErrnoMemOnly();
+    }
     if (ME != MemoryEffects::unknown())
       B.addMemoryAttr(ME);
   }
@@ -2289,9 +2308,18 @@ static bool upgradeOldMemoryAttribute(MemoryEffects &ME, uint64_t EncodedKind) {
   case bitc::ATTR_KIND_INACCESSIBLEMEM_ONLY:
     ME &= MemoryEffects::inaccessibleMemOnly();
     return true;
+  case bitc::ATTR_KIND_ERRNOMEMONLY:
+    ME &= MemoryEffects::errnoMemOnly();
+    return true;
   case bitc::ATTR_KIND_INACCESSIBLEMEM_OR_ARGMEMONLY:
     ME &= MemoryEffects::inaccessibleOrArgMemOnly();
     return true;
+  case bitc::ATTR_KIND_INACCESSIBLEMEM_OR_ERRNOMEMONLY:
+    ME &= MemoryEffects::inaccessibleOrErrnoMemOnly();
+    return true;
+  case bitc::ATTR_KIND_INACCESSIBLEMEM_OR_ARGMEM_OR_ERRNOMEMONLY:
+    ME &= MemoryEffects::inaccessibleOrArgOrErrnoMemOnly();
+    return true;
   default:
     return false;
   }
diff --git a/llvm/lib/IR/Attributes.cpp b/llvm/lib/IR/Attributes.cpp
index ceb31856283c9a..0ead5a26ce5bd3 100644
--- a/llvm/lib/IR/Attributes.cpp
+++ b/llvm/lib/IR/Attributes.cpp
@@ -643,6 +643,9 @@ std::string Attribute::getAsString(bool InAttrGrp) const {
       case IRMemLocation::InaccessibleMem:
         OS << "inaccessiblemem: ";
         break;
+      case IRMemLocation::ErrnoMem:
+        OS << "errnomem: ";
+        break;
       case IRMemLocation::Other:
         llvm_unreachable("This is represented as the default access kind");
       }
diff --git a/llvm/lib/IR/Function.cpp b/llvm/lib/IR/Function.cpp
index 9c5dd5aeb92e97..e4c64bcf9bc5d0 100644
--- a/llvm/lib/IR/Function.cpp
+++ b/llvm/lib/IR/Function.cpp
@@ -922,6 +922,...
[truncated]

Copy link
Member

@dtcxzyw dtcxzyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you double-check libcalls that also access the argument memory?

Changed |= setDoesNotThrow(F);
Changed |= setDoesNotCapture(F, 0);
Changed |= setOnlyReadsMemory(F, 0);
Changed |= setOnlyAccessesErrnoMemory(F);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC it should be memory(argmem: read, errnomem: write).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, shouldn’t marking all pointer arguments as readonly be semantically equivalent to having ArgMem: Ref?

// May throw; "read" is a valid pthread cancellation point.
Changed |= setRetAndArgsNoUndef(F);
Changed |= setDoesNotCapture(F, 1);
Changed |= setOnlyAccessesErrnoMemory(F);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be memory(argmem: write, errnomem: write).

// memory, rather than manually inspecting the underlying object.
if (isa<CallInst>(UO)) {
auto *Callee = cast<CallInst>(UO)->getCalledFunction();
if (Callee && Callee->getName() == "__errno_location") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use _errno on Windows: https://godbolt.org/z/T6oanEKKa
See also

// Names of functions that return a location of the "errno" value.
// FIXME: Are there other similar function names?
CallDescriptionSet ErrnoLocationCalls{
{CDM::CLibrary, {"__errno_location"}, 0, 0},
{CDM::CLibrary, {"___errno"}, 0, 0},
{CDM::CLibrary, {"__errno"}, 0, 0},
{CDM::CLibrary, {"_errno"}, 0, 0},
{CDM::CLibrary, {"__error"}, 0, 0}};
.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, clang on Windows links against UCRT, which declares _errno. This has to be taken into account.

@dtcxzyw
Copy link
Member

dtcxzyw commented Jan 29, 2025

Can you double-check libcalls that also access the argument memory?

We may need a new helper setOnlyAccessesArgMemOrErrnoMem.

Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please take a second look at your markings: in particular, syscalls with side-effects should be treated as modifying inaccessible memory.

Changed |= setDoesNotThrow(F);
Changed |= setDoesNotCapture(F, 0);
Changed |= setOnlyReadsMemory(F, 0);
Changed |= setOnlyAccessesErrnoMemory(F);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setOnlyAccessesInaccessibleMemOrErrnoMem()?

Also, why is this setOnlyReadsMemory?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does not modify the content of the path string in input, so it should make sense to have the pointer argument readonly? Agree that all the other functions have OS/file-system side-effects, so they do access inaccessible memory and errno.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the argument should be readonly. Might need to reformulate the setter functions to make clear what this is doing...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed, thanks.

Changed |= setRetAndArgsNoUndef(F);
Changed |= setDoesNotThrow(F);
Changed |= setDoesNotCapture(F, 0);
Changed |= setOnlyAccessesErrnoMemory(F);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setOnlyAccessesInaccessibleMemOrErrnoMem()?

Changed |= setDoesNotCapture(F, 1);
Changed |= setOnlyReadsMemory(F, 0);
Changed |= setOnlyReadsMemory(F, 1);
Changed |= setOnlyAccessesErrnoMemory(F);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setOnlyAccessesInaccessibleMemOrErrnoMem()?

Changed |= setRetDoesNotAlias(F);
Changed |= setDoesNotCapture(F, 1);
Changed |= setOnlyReadsMemory(F, 1);
Changed |= setOnlyAccessesErrnoMemory(F);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setOnlyAccessesInaccessibleMemOrErrnoMem()?

@antoniofrighetto antoniofrighetto force-pushed the feature/infer-errnomem-libcalls branch from 8a71390 to 7c51af5 Compare February 12, 2025 20:07
@antoniofrighetto
Copy link
Contributor Author

antoniofrighetto commented Feb 12, 2025

Changes overview from The Open Group Base Specifications:

  1. malloc, memalign, calloc, tmpfile: setOnlyAccessesInaccessibleMemOrErrnoMem
  2. Do access inaccessible memory / may have side-effects / access filesystem / FILE stream memory / access locale: strdup, system, mkdir, mktime, realloc, open (and variants), gettimeofday, chmod, chown, fscanf, fprintf, fread, fgetpos, unlink, unsetenv, utime, getpwnam, opendir, rename, rmdir, pclose, write, strtol (and variants), strxfrm, strcoll, scanf (and variants), printf/sprintf (and variants), stat (and variants), setitimer, getitimer, read, realpath, fseek, ftell, uname, perror: setOnlyAccessesInaccessibleMemOrArgMemOrErrnoMem

@github-actions
Copy link

github-actions bot commented Feb 12, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@antoniofrighetto antoniofrighetto force-pushed the feature/infer-errnomem-libcalls branch from 7c51af5 to 5c8d7bc Compare February 12, 2025 20:11
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I stopped halfway; please make a second pass once you fix the noted issues.

Changed |= setDoesNotCapture(F, 0);
Changed |= setDoesNotCapture(F, 1);
Changed |= setOnlyReadsMemory(F, 1);
Changed |= setOnlyAccessesArgMemOrErrnoMem(F);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

strxfrm/strcoll/sprintf/sscanf/etc. access locale; is that inaccessiblemem, or do we need to drop this marking completely?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, adding locale-sensitive routines as well. IIUC this should be set as inaccessiblemem: read as they do not change locale.

Changed |= setDoesNotThrow(F);
Changed |= setDoesNotCapture(F, 0);
Changed |= setOnlyReadsMemory(F, 0);
Changed |= setOnlyAccessesArgMemOrErrnoMem(F, ModRefInfo::Ref);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Umm, are you forgetting about stdin?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, should be inaccessiblemem: read.

// May throw; "read" is a valid pthread cancellation point.
Changed |= setRetAndArgsNoUndef(F);
Changed |= setDoesNotCapture(F, 1);
Changed |= setOnlyAccessesArgMemOrErrnoMem(F);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read also modifies the file descriptor

Changed |= setDoesNotThrow(F);
Changed |= setDoesNotCapture(F, 0);
Changed |= setOnlyReadsMemory(F, 0);
Changed |= setOnlyAccessesArgMemOrErrnoMem(F);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

realpath accesses the filesystem?

Changed |= setDoesNotCapture(F, 0);
Changed |= setDoesNotCapture(F, 1);
Changed |= setOnlyReadsMemory(F, 0);
Changed |= setOnlyAccessesArgMemOrErrnoMem(F);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

readlink accesses the filesystem?

Changed |= setDoesNotThrow(F);
Changed |= setDoesNotCapture(F, 0);
Changed |= setOnlyReadsMemory(F, 0);
Changed |= setOnlyAccessesArgMemOrErrnoMem(F, ModRefInfo::Ref);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

access() accesses the filesystem?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, this is being added as inaccessiblemem: read as it only reads the filesystem.

@antoniofrighetto antoniofrighetto force-pushed the feature/infer-errnomem-libcalls branch from 5c8d7bc to 661daa2 Compare February 17, 2025 19:26
@antoniofrighetto
Copy link
Contributor Author

Took a whole second iteration and updated the overview above.

Changed |= setDoesNotCapture(F, 0);
Changed |= setOnlyReadsMemory(F, 0);
Changed |=
setOnlyAccessesInaccessibleOrArgOrErrnoMem(F, MR::ModRef, MR::Ref);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the existence of environ, we certainly can't model the environment as inaccessible memory.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, good point.

Changed |= setDoesNotThrow(F);
Changed |= setWillReturn(F);
Changed |= setDoesNotCapture(F, 0);
Changed |= setOnlyAccessesInaccessibleOrArgOrErrnoMem(F);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mktime also interacts with the tzname, timezone and daylight globals.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty apprehensive about the heavy lifting "inaccessible memory" is doing in this patch -- things like locales, files etc are now modelled as inaccessible memory.

  • I'm not very confident there are no non-libcall ways to access all of these even within the defined API.
  • We also have to worry about accesses of implementation details in headers. If any library calls are implemented as inline functions in headers, they may access memory that is logically inaccessible.
  • This further increases the LTO risk, where this inaccessible memory may become accessible. Currently this problem is mostly limited to allocators, but this expands it to all of libc.

TBH I think that all this patch should be doing is change this one line to restrict to errnomem:

Changed |= setOnlyWritesMemory(F);

That's the main case we're interested in handling. (Well, that and malloc, but I think that should only be added after the AA support is there.)

@antoniofrighetto antoniofrighetto force-pushed the feature/infer-errnomem-libcalls branch from 661daa2 to c13ebc1 Compare February 24, 2025 17:17
@antoniofrighetto
Copy link
Contributor Author

We also have to worry about accesses of implementation details in headers. If any library calls are implemented as inline functions in headers, they may access memory that is logically inaccessible.

This is in part an already existing issue, isn't it? Agree that, during LTO, chances of hitting this would be higher with the patch though.

I'm not very confident there are no non-libcall ways to access all of these even within the defined API.

This is what I was kind of worried about too initially, that such APIs may access some (memory) state that could be still program-visible. Might not be the case for locale, but definitely is for unsetenv, as you noted. Restricted libc functions back to errnomem only; this can be relieved further if needed.

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Restricted libc functions back to errnomem only; this can be relieved further if needed.

It looks like you still have lots of functions that access locale (like strtol, strxfrm) listed as "argmem or errnomem only"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Accesses locale?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had deduced from your previous marking that locale should not be encompassed as inaccessible memory. Have I mistaken it?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It shouldn't be inaccessible memory -- but if you say this is argmem + errnomem only you're saying that locale doesn't exist at all. We just shouldn't add memory attributes for things that access locale at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, that’s very true. Thanks for the clarification!

Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To start with, please only add errnomem to the FP libcalls, and nothing else.

@antoniofrighetto antoniofrighetto force-pushed the feature/infer-errnomem-libcalls branch from e638af7 to a870604 Compare February 28, 2025 17:46
@antoniofrighetto antoniofrighetto changed the title [InferAttrs] Mark errnomem-setting libcalls as such [InferAttrs] Mark floating-point libcalls as errno-writing Feb 28, 2025
@antoniofrighetto
Copy link
Contributor Author

To start with, please only add errnomem to the FP libcalls, and nothing else.

Agree, let's begin with a reduced set of functions first off. Updated, thanks!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isascii and isdigit access locale. They shouldn't be in this list.

Copy link
Contributor Author

@antoniofrighetto antoniofrighetto Mar 4, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, missed the twos. isascii is mentioned as being defined on all integer values, and «cannot be used portably in a localized application», OTOH all ctype.h functions should be locale-dependant, so I'm removing isascii too.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For all othe other ones, these should be setDoesNotAccessMemory, right? As they don't access ernno or anything else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's not right, sincos has out parameters. This should be argmem: write + errno.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Admittedly, thought marking out parameters as writeonly sufficed to carry this information. Exposing modref only for ArgMem, as errno is only written here, per below.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these functions only write errno, not read it. We should preserve that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to be careful with this fallthrough as well, remquo has an out param.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, moving remquo together with sincos, except the latter has an extra out parameter.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like you dropped some of the calls that don't set errno, but not all of them? E.g. floor and nearbyint are still here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ugh, was looking at a older version of POSIX. Indeed those do not set errno anymore in recent versions. Updated, thanks!

@antoniofrighetto antoniofrighetto force-pushed the feature/infer-errnomem-libcalls branch from 3b8a259 to 535719a Compare March 4, 2025 14:33
@antoniofrighetto antoniofrighetto requested a review from nikic March 12, 2025 07:38
}
void Function::setOnlyAccessesErrnoMemory() {
setMemoryEffects(getMemoryEffects() &
MemoryEffects::errnoMemOnly(ModRefInfo::Mod));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think hardcoding this Mod under an API with this name is okay. Please rebase over a502c65 and remove these Function APIs, keeping everything local to BuildLibCalls.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely, thanks for looking into it.

case LibFunc_ldexp:
case LibFunc_ldexpf:
case LibFunc_ldexpl:
Changed |= setOnlyAccessesErrnoMemory(F);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also rename this one to setOnlyWritesErrnoMemory(F) for clarity.

@antoniofrighetto antoniofrighetto force-pushed the feature/infer-errnomem-libcalls branch from 535719a to 9d20102 Compare March 12, 2025 18:55
Copy link
Contributor

@nikic nikic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

break;
case LibFunc_ldexp:
case LibFunc_ldexpf:
case LibFunc_ldexpl:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can merge ldexp with the rest of the FP libcall list? Should also be able to set doesNotThrow and setDoesNotFreeMemory on it...

Floating-point libcalls are currently conservatively marked as
may write any memory. Restrict these to clobber only `errno`.
@antoniofrighetto antoniofrighetto force-pushed the feature/infer-errnomem-libcalls branch from 9d20102 to 139add5 Compare March 13, 2025 09:41
@antoniofrighetto antoniofrighetto merged commit 139add5 into llvm:main Mar 13, 2025
5 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

clang Clang issues not falling into any other category llvm:ir llvm:support llvm:transforms mlir:llvm mlir

Projects

None yet

Development

Successfully merging this pull request may close these issues.