Skip to content

Conversation

@jurahul
Copy link
Contributor

@jurahul jurahul commented Oct 14, 2024

Also change LLVMIntrinsicCopyOverloadedName and LLVMIntrinsicCopyOverloadedName2 to return char * instead of const char* since the returned memory is owned by the caller and we expect that pointer to be passed to free to deallocate it.

@jurahul
Copy link
Contributor Author

jurahul commented Oct 14, 2024

Failures look like existing regressions. I'll rebase to get clean checks again, but wanted to start review.

@jurahul jurahul marked this pull request as ready for review October 14, 2024 14:39
@jurahul jurahul requested a review from nikic as a code owner October 14, 2024 14:39
@jurahul jurahul requested a review from nhaehnle October 14, 2024 14:39
@llvmbot
Copy link
Member

llvmbot commented Oct 14, 2024

@llvm/pr-subscribers-llvm-ir

Author: Rahul Joshi (jurahul)

Changes

Also change LLVMIntrinsicCopyOverloadedName and LLVMIntrinsicCopyOverloadedName2 to return char * instead of const char* since the returned memory is owned by the caller and we expect that pointer to be passed to free to deallocate it.


Full diff: https://github.com/llvm/llvm-project/pull/112197.diff

4 Files Affected:

  • (modified) llvm/docs/ReleaseNotes.md (+4)
  • (modified) llvm/include/llvm-c/Core.h (+13-8)
  • (modified) llvm/lib/IR/Core.cpp (+12-8)
  • (modified) llvm/unittests/IR/IntrinsicsTest.cpp (+45)
diff --git a/llvm/docs/ReleaseNotes.md b/llvm/docs/ReleaseNotes.md
index dcdd7a25c7fbee..f9857c45a31dfd 100644
--- a/llvm/docs/ReleaseNotes.md
+++ b/llvm/docs/ReleaseNotes.md
@@ -242,6 +242,10 @@ Changes to the C API
 
 * Added `LLVMAtomicRMWBinOpUSubCond` and `LLVMAtomicRMWBinOpUSubSat` to `LLVMAtomicRMWBinOp` enum for AtomicRMW instructions.
 
+* Added `LLVMIntrinsicCopyName` and changed `LLVMIntrinsicCopyOverloadedName`
+  and `LLVMIntrinsicCopyOverloadedName2` to return `const char *` instead of 
+  `const char *`.
+
 Changes to the CodeGen infrastructure
 -------------------------------------
 
diff --git a/llvm/include/llvm-c/Core.h b/llvm/include/llvm-c/Core.h
index 55649d89a6b8f4..a84a247c57200e 100644
--- a/llvm/include/llvm-c/Core.h
+++ b/llvm/include/llvm-c/Core.h
@@ -2833,11 +2833,17 @@ LLVMTypeRef LLVMIntrinsicGetType(LLVMContextRef Ctx, unsigned ID,
  */
 const char *LLVMIntrinsicGetName(unsigned ID, size_t *NameLength);
 
+/**
+ * Copies the name of an intrinsic. The caller is responsible for freeing the
+ * returned string.
+ *
+ * @see llvm::Intrinsic::getName()
+ */
+char *LLVMIntrinsicCopyName(unsigned ID, size_t *NameLength);
+
 /** Deprecated: Use LLVMIntrinsicCopyOverloadedName2 instead. */
-const char *LLVMIntrinsicCopyOverloadedName(unsigned ID,
-                                            LLVMTypeRef *ParamTypes,
-                                            size_t ParamCount,
-                                            size_t *NameLength);
+char *LLVMIntrinsicCopyOverloadedName(unsigned ID, LLVMTypeRef *ParamTypes,
+                                      size_t ParamCount, size_t *NameLength);
 
 /**
  * Copies the name of an overloaded intrinsic identified by a given list of
@@ -2850,10 +2856,9 @@ const char *LLVMIntrinsicCopyOverloadedName(unsigned ID,
  *
  * @see llvm::Intrinsic::getName()
  */
-const char *LLVMIntrinsicCopyOverloadedName2(LLVMModuleRef Mod, unsigned ID,
-                                             LLVMTypeRef *ParamTypes,
-                                             size_t ParamCount,
-                                             size_t *NameLength);
+char *LLVMIntrinsicCopyOverloadedName2(LLVMModuleRef Mod, unsigned ID,
+                                       LLVMTypeRef *ParamTypes,
+                                       size_t ParamCount, size_t *NameLength);
 
 /**
  * Obtain if the intrinsic identified by the given ID is overloaded.
diff --git a/llvm/lib/IR/Core.cpp b/llvm/lib/IR/Core.cpp
index 1cf998c6850068..65572116e48ca5 100644
--- a/llvm/lib/IR/Core.cpp
+++ b/llvm/lib/IR/Core.cpp
@@ -2478,6 +2478,13 @@ const char *LLVMIntrinsicGetName(unsigned ID, size_t *NameLength) {
   return Str.data();
 }
 
+char *LLVMIntrinsicCopyName(unsigned ID, size_t *NameLength) {
+  auto IID = llvm_map_to_intrinsic_id(ID);
+  auto Str = llvm::Intrinsic::getName(IID);
+  *NameLength = Str.size();
+  return strdup(Str.data());
+}
+
 LLVMTypeRef LLVMIntrinsicGetType(LLVMContextRef Ctx, unsigned ID,
                                  LLVMTypeRef *ParamTypes, size_t ParamCount) {
   auto IID = llvm_map_to_intrinsic_id(ID);
@@ -2485,10 +2492,8 @@ LLVMTypeRef LLVMIntrinsicGetType(LLVMContextRef Ctx, unsigned ID,
   return wrap(llvm::Intrinsic::getType(*unwrap(Ctx), IID, Tys));
 }
 
-const char *LLVMIntrinsicCopyOverloadedName(unsigned ID,
-                                            LLVMTypeRef *ParamTypes,
-                                            size_t ParamCount,
-                                            size_t *NameLength) {
+char *LLVMIntrinsicCopyOverloadedName(unsigned ID, LLVMTypeRef *ParamTypes,
+                                      size_t ParamCount, size_t *NameLength) {
   auto IID = llvm_map_to_intrinsic_id(ID);
   ArrayRef<Type*> Tys(unwrap(ParamTypes), ParamCount);
   auto Str = llvm::Intrinsic::getNameNoUnnamedTypes(IID, Tys);
@@ -2496,10 +2501,9 @@ const char *LLVMIntrinsicCopyOverloadedName(unsigned ID,
   return strdup(Str.c_str());
 }
 
-const char *LLVMIntrinsicCopyOverloadedName2(LLVMModuleRef Mod, unsigned ID,
-                                             LLVMTypeRef *ParamTypes,
-                                             size_t ParamCount,
-                                             size_t *NameLength) {
+char *LLVMIntrinsicCopyOverloadedName2(LLVMModuleRef Mod, unsigned ID,
+                                       LLVMTypeRef *ParamTypes,
+                                       size_t ParamCount, size_t *NameLength) {
   auto IID = llvm_map_to_intrinsic_id(ID);
   ArrayRef<Type *> Tys(unwrap(ParamTypes), ParamCount);
   auto Str = llvm::Intrinsic::getName(IID, Tys, unwrap(Mod));
diff --git a/llvm/unittests/IR/IntrinsicsTest.cpp b/llvm/unittests/IR/IntrinsicsTest.cpp
index 7fe0bd79b80a60..c3c283169f7fea 100644
--- a/llvm/unittests/IR/IntrinsicsTest.cpp
+++ b/llvm/unittests/IR/IntrinsicsTest.cpp
@@ -7,6 +7,7 @@
 //===----------------------------------------------------------------------===//
 
 #include "llvm/IR/Intrinsics.h"
+#include "llvm-c/Core.h"
 #include "llvm/ADT/SmallVector.h"
 #include "llvm/IR/Constant.h"
 #include "llvm/IR/IRBuilder.h"
@@ -127,6 +128,50 @@ TEST(IntrinsicNameLookup, MSBuiltinLookup) {
     EXPECT_EQ(ID, getIntrinsicForMSBuiltin(Target, Builtin));
 }
 
+// Test C API to get/copy LLVM intrinsic name.
+TEST(IntrinsicNameLookup, LLVMIntrinsicGetCopyNameSimple) {
+  static constexpr struct {
+    Intrinsic::ID ID;
+    const char *Name;
+  } Tests[] = {{Intrinsic::not_intrinsic, "not_intrinsic"},
+               {Intrinsic::assume, "llvm.assume"},
+               {Intrinsic::coro_free, "llvm.coro.free"},
+               {Intrinsic::aarch64_break, "llvm.aarch64.break"},
+               {Intrinsic::x86_int, "llvm.x86.int"}};
+
+  for (auto [ID, ExpectedName] : Tests) {
+    size_t NameSize = 0;
+    const char *CName = LLVMIntrinsicGetName(ID, &NameSize);
+    StringRef Name(CName, NameSize);
+
+    // Verify we get correct name.
+    EXPECT_EQ(Name, ExpectedName);
+    const char *CName1 = LLVMIntrinsicGetName(ID, &NameSize);
+
+    // Verify we get the same pointer and length the second time.
+    EXPECT_EQ(CName, CName1);
+    EXPECT_EQ(NameSize, Name.size());
+  }
+
+  // Now test the copy API.
+  for (auto [ID, ExpectedName] : Tests) {
+    size_t NameSize = 0;
+    char *CName = LLVMIntrinsicCopyName(ID, &NameSize);
+    StringRef Name(CName, NameSize);
+
+    // Verify we get correct name.
+    EXPECT_EQ(Name, ExpectedName);
+
+    // Verify we get the different pointer and same length the second time.
+    char *CName1 = LLVMIntrinsicCopyName(ID, &NameSize);
+    EXPECT_NE(CName, CName1);
+    EXPECT_EQ(NameSize, Name.size());
+
+    free(CName);
+    free(CName1);
+  }
+}
+
 TEST_F(IntrinsicsTest, InstrProfInheritance) {
   auto isInstrProfInstBase = [](const Instruction &I) {
     return isa<InstrProfInstBase>(I);

@jurahul jurahul force-pushed the llvm_intrinsic_copy_name branch from 22d6cf0 to 4ded4f1 Compare October 14, 2024 18:26
@jurahul jurahul force-pushed the llvm_intrinsic_copy_name branch from 4ded4f1 to 53efb5b Compare October 22, 2024 18:13
Also change `LLVMIntrinsicCopyOverloadedName` and
`LLVMIntrinsicCopyOverloadedName2` to return `char *` instead of
`const char*` since the returned memory is owned by the caller and
we expect that pointer to be passed to `free` to deallocate it.
@jurahul jurahul force-pushed the llvm_intrinsic_copy_name branch from 53efb5b to b48aa0c Compare October 22, 2024 21:38
@jurahul
Copy link
Contributor Author

jurahul commented Oct 23, 2024

@nhaehnle gentle ping

@jurahul jurahul requested a review from arsenm October 30, 2024 14:14
LLVMTypeRef *ParamTypes,
size_t ParamCount,
size_t *NameLength);
char *LLVMIntrinsicCopyOverloadedName(unsigned ID, LLVMTypeRef *ParamTypes,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think requiring clients to now free this is a pretty bad breaking change. As far as I understand this is the only way for the C API to interact with intrinsics, so the allocation is an additional cost to them. I'm still skeptical of this plan to start allocating intrinsic names

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an existing API, but just changing the return type to char * instead of const char * to indicate that the ownership of the allocated memory is not LLVM but the client. Maybe I should separate this into its own change (i.e., one to change return type of current API and another to add the new API)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For overloaded intrinsics, we always need to construct a new string as the mangling suffix is not in the table

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still skeptical of this plan to start allocating intrinsic names

Yeah, I guess may be the !/$ is not as much in trying to compress name table by dropping suffixes (given the amount of complexity). Additionally, trimming target specific intrinsics (which I am, unsuccessfully yet, trying to get closure on w.r.t Intrinsic:ID ABI/stability guarantees) is much bigger !/$, so may make sense to focus on that. Let me do that for now, and just sent a fix to change the return types of the 2 existing API to char *.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Started #114334 for just the existing API fixup

return Str.data();
}

char *LLVMIntrinsicCopyName(unsigned ID, size_t *NameLength) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the new API that @nhaehnle proposed on the discourse thread

Copy link
Collaborator

@nhaehnle nhaehnle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't been able to follow the overall discussion about intrinsic names. I think the code change is fine in terms of quality, but whether it should go ahead depends on where the overall discussion is going.

@jurahul
Copy link
Contributor Author

jurahul commented Nov 4, 2024

I haven't been able to follow the overall discussion about intrinsic names. I think the code change is fine in terms of quality, but whether it should go ahead depends on where the overall discussion is going.

My latest post summarizes the status of trimming target specific intrinsics here: https://discourse.llvm.org/t/rfc-compress-intrinsic-name-table/82412/49

Its either illegal or has high logistical cost. And since code size savings seems to be not significant from that change, this one (whose motivation was an even smaller data size savings) seems even less significant to pursue. So I am planning to close this and other reviews once the thread settles down. I have minor turly NFC improvement, which I'll implement sometime soon.

@jurahul jurahul marked this pull request as draft November 5, 2024 15:59
@jurahul jurahul removed the request for review from nikic November 5, 2024 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants