Skip to content

Conversation

@elhewaty
Copy link
Member

@elhewaty elhewaty commented Dec 9, 2024

  • [Attributor] Add pre-commit tests
  • [Attributor] Pack out arguments into a struct

PR Summary:
Transforms functions with write-only pointer arguments into functions that return a struct containing the original return value and pointer-written values, improving GPU performance, and maybe x86 too.

Example:
we need to convert this:

__device__ bool foo(int *dst) {
    *dst = threadIdx.x;
    return true;
}

into

struct Out { bool ret; int dst; };

__device__ Out foo() {
    Out result;
    result.dst = threadIdx.x;
    result.ret = true;
    return result;
}

@llvmbot
Copy link
Member

llvmbot commented Dec 9, 2024

@llvm/pr-subscribers-llvm-transforms

Author: None (elhewaty)

Changes
  • [Attributor] Add pre-commit tests
  • [Attributor] Pack out arguments into a struct

Full diff: https://github.com/llvm/llvm-project/pull/119267.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/IPO/Attributor.cpp (+113)
  • (added) llvm/test/Transforms/Attributor/remove_out_args.ll (+16)
diff --git a/llvm/lib/Transforms/IPO/Attributor.cpp b/llvm/lib/Transforms/IPO/Attributor.cpp
index 116f419129a239..90eecf45892ee6 100644
--- a/llvm/lib/Transforms/IPO/Attributor.cpp
+++ b/llvm/lib/Transforms/IPO/Attributor.cpp
@@ -2963,6 +2963,119 @@ bool Attributor::shouldSeedAttribute(AbstractAttribute &AA) {
   return Result;
 }
 
+// For now: argument can be put in the struct if it's write only and
+// has no aliases.
+static bool canBeComapctedInAStruct(const Argument &Arg, Attributor &A, const AbstractAttribute &QueryingAA) {
+  IRPosition ArgPosition = IRPosition::argument(Arg);
+  // Check if Arg has no alias.
+  auto *AAliasInfo = A.getAAFor<AANoAlias>(QueryingAA, ArgPosition, DepClassTy::NONE);
+  if (!AAliasInfo || !AAliasInfo->isKnownNoAlias())
+    return false;
+
+  // Check if Arg is write-only.
+  const auto *MemBehaviorAA =
+      A.getAAFor<AAMemoryBehavior>(QueryingAA, ArgPosition, DepClassTy::NONE);
+  if (!MemBehaviorAA || !MemBehaviorAA->isKnownWriteOnly())
+    return false;
+
+  return true;
+}
+
+static void replaceArgRetWithStructRetCalls(Function &OldFunction, Function &NewFunction) {
+  for (auto UseItr = OldFunction.use_begin(); UseItr != OldFunction.use_end(); ++UseItr) {
+    CallBase *Call = dyn_cast<CallBase>(UseItr->getUser());
+    if (!Call)
+      continue;
+
+    IRBuilder<> Builder(Call);
+    SmallVector<Value *, 8> NewArgs;
+    for (unsigned ArgIdx = 0; ArgIdx < Call->arg_size(); ++ArgIdx)
+      if (std::find_if(OldFunction.arg_begin(), OldFunction.arg_end(),
+          [&](Argument &Arg) { return &Arg == Call->getArgOperand(ArgIdx); }) == OldFunction.arg_end())
+        NewArgs.push_back(Call->getArgOperand(ArgIdx));
+
+    CallInst *NewCall = Builder.CreateCall(&NewFunction, NewArgs);
+    Call->replaceAllUsesWith(NewCall);
+    Call->eraseFromParent();
+  }
+}
+
+static bool convertOutArgsToRetStruct(Function &F, Attributor &A, AbstractAttribute &QueryingAA) {
+  // Get valid ptr args.
+  DenseMap<Argument *, Type *>  PtrToType;
+  for (unsigned ArgIdx = 0; ArgIdx < F.arg_size(); ++ArgIdx) {
+    Argument *Arg = F.getArg(ArgIdx);
+    if (Arg->getType()->isPointerTy() && canBeComapctedInAStruct(*Arg, A, QueryingAA)) {
+      // Get the the type of the pointer through its users
+      for (auto UseItr = Arg->use_begin(); UseItr != Arg->use_end(); ++UseItr) {
+        auto *Store = dyn_cast<StoreInst>(UseItr->getUser());
+        if (Store)
+          PtrToType[Arg] = Store->getValueOperand()->getType();
+      }
+    }
+  }
+
+  // If there is no valid candidates then return false.
+  if (PtrToType.empty())
+    return false;
+
+  // Create the new struct return type.
+  SmallVector<Type *, 4> OutStructElements;
+  if (auto *OriginalFuncTy = F.getReturnType(); !OriginalFuncTy->isVoidTy())
+    OutStructElements.push_back(OriginalFuncTy);
+
+  for (const auto &[Arg, Type] : PtrToType)
+    OutStructElements.push_back(Type);
+
+  auto *ReturnStructType = StructType::create(F.getContext(), OutStructElements, (F.getName() + "Out").str());
+
+  // Get the new Args.
+  SmallVector<Type *, 4> NewParamTypes;
+  for (unsigned ArgIdx = 0; ArgIdx < F.arg_size(); ++ArgIdx)
+    if (!PtrToType.count(F.getArg(ArgIdx)))
+      NewParamTypes.push_back(F.getArg(ArgIdx)->getType());
+
+  auto *NewFunctionType = FunctionType::get(ReturnStructType, NewParamTypes, F.isVarArg());
+  auto *NewFunction = Function::Create(NewFunctionType, F.getLinkage(), F.getAddressSpace(), F.getName());
+
+  // Map old args to new args.
+  ValueToValueMapTy VMap;
+  auto *NewArgIt = NewFunction->arg_begin();
+  for (Argument &OldArg : F.args())
+    if (!PtrToType.count(F.getArg(OldArg.getArgNo())))
+      VMap[&OldArg] = &(*NewArgIt++);
+
+  // Clone the old function into the new one.
+  SmallVector<ReturnInst *, 8> Returns;
+  CloneFunctionInto(NewFunction, &F, VMap, CloneFunctionChangeType::LocalChangesOnly, Returns);
+
+  // Update the return values (make it struct).
+  for (ReturnInst *Ret : Returns) {
+    IRBuilder<> Builder(Ret);
+    SmallVector<Value *, 4> StructValues;
+    // Include original return type, if any
+    if (auto *OriginalFuncTy = F.getReturnType(); !OriginalFuncTy->isVoidTy())
+      StructValues.push_back(Ret->getReturnValue());
+
+    // Create a load instruction to fill the struct element.
+    for (const auto &[Arg, Ty] : PtrToType) {
+      Value *OutVal = Builder.CreateLoad(Ty, VMap[Arg]);
+      StructValues.push_back(OutVal);
+    }
+
+    // Build the return struct incrementally.
+    Value *StructRetVal = UndefValue::get(ReturnStructType);
+    for (unsigned i = 0; i < StructValues.size(); ++i)
+      StructRetVal = Builder.CreateInsertValue(StructRetVal, StructValues[i], i);
+
+    Builder.CreateRet(StructRetVal);
+    Ret->eraseFromParent();
+  }
+
+  replaceArgRetWithStructRetCalls(F, *NewFunction);
+  F.eraseFromParent();
+}
+
 ChangeStatus Attributor::rewriteFunctionSignatures(
     SmallSetVector<Function *, 8> &ModifiedFns) {
   ChangeStatus Changed = ChangeStatus::UNCHANGED;
diff --git a/llvm/test/Transforms/Attributor/remove_out_args.ll b/llvm/test/Transforms/Attributor/remove_out_args.ll
new file mode 100644
index 00000000000000..bd52bf5d80656c
--- /dev/null
+++ b/llvm/test/Transforms/Attributor/remove_out_args.ll
@@ -0,0 +1,16 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes=attributor < %s | FileCheck %s
+
+
+
+define i1 @foo(ptr %dst) {
+; CHECK-LABEL: define noundef i1 @foo(
+; CHECK-SAME: ptr nocapture nofree noundef nonnull writeonly align 4 dereferenceable(4) [[DST:%.*]]) #[[ATTR0:[0-9]+]] {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    store i32 42, ptr [[DST]], align 4
+; CHECK-NEXT:    ret i1 true
+;
+entry:
+  store i32 42, ptr %dst
+  ret i1 1
+}

@github-actions
Copy link

github-actions bot commented Dec 9, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@elhewaty elhewaty force-pushed the attributor-remove-arg-ptrs branch from 1ab8d0c to 8e88711 Compare December 9, 2024 20:46
@elhewaty
Copy link
Member Author

elhewaty commented Dec 9, 2024

I know it's not complete. but I wonder if I am on the right track as I am confused.
another thing is where to call convertOutArgsToRetStruct 😖?

@elhewaty elhewaty changed the title Pack out arguments into a struct [Attributor] Pack out arguments into a struct Dec 9, 2024
@elhewaty elhewaty requested a review from nikic December 9, 2024 20:58
@nikic nikic removed their request for review December 9, 2024 21:12
; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
; RUN: opt -S -passes=attributor < %s | FileCheck %s


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs about 100 more tests llvm/test/CodeGen/AMDGPU/rewrite-out-arguments.ll and llvm/test/CodeGen/AMDGPU/rewrite-out-arguments-address-space.ll would be better starting points

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will. BTW, I don't have an AMD GPU, I have a nvidia one, should J add the tests in NVPTX?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't matter what GPU you have, it doesn't matter when working on the compiler. NVPTX will be less useful for IR testing since it doesn't use a non-0 address space for stack objects

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need all of these tests, using AMDGPU

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arsenm, I will test using AMDGPU, but I don't know why my changes don't change this simple test? Do I need to call the some function somewhere? the optimizer dosn't see my changes or do my changes don't affect the input?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is an externally visible function, so it's not allowed to change the signature. It would need to be be internal or similar linkage, with a call use

@elhewaty elhewaty force-pushed the attributor-remove-arg-ptrs branch from 8e88711 to 461834b Compare December 21, 2024 21:40
@elhewaty elhewaty marked this pull request as draft December 21, 2024 21:41
@github-actions
Copy link

github-actions bot commented Dec 21, 2024

⚠️ undef deprecator found issues in your code. ⚠️

You can test this locally with the following command:
git diff -U0 --pickaxe-regex -S '([^a-zA-Z0-9#_-]undef[^a-zA-Z0-9_-]|UndefValue::get)' e8e75e08c9214fe25b56535fc26f5435a875a137 479165962e72bb5a9e5ac31154f092f2f08bbd4e llvm/test/Transforms/Attributor/remove_out_args.ll llvm/include/llvm/Transforms/IPO/Attributor.h llvm/lib/Transforms/IPO/Attributor.cpp llvm/lib/Transforms/IPO/AttributorAttributes.cpp

The following files introduce new uses of undef:

  • llvm/lib/Transforms/IPO/AttributorAttributes.cpp

Undef is now deprecated and should only be used in the rare cases where no replacement is possible. For example, a load of uninitialized memory yields undef. You should use poison values for placeholders instead.

In tests, avoid using undef and having tests that trigger undefined behavior. If you need an operand with some unimportant value, you can add a new argument to the function and use that instead.

For example, this is considered a bad practice:

define void @fn() {
  ...
  br i1 undef, ...
}

Please use the following instead:

define void @fn(i1 %cond) {
  ...
  br i1 %cond, ...
}

Please refer to the Undefined Behavior Manual for more information.

@elhewaty
Copy link
Member Author

@arsenm @shiltian @vidsinghal @jdoerfert @jhuber6
I am sorry but this is my first time with the attributor. I keep getting compilation errors like:

error: invalid new-expression of abstract class type ‘{anonymous}::AAConvertOutArgumentFunction’
13207 |     AA = new (A.Allocator) CLASS##SUFFIX(IRP, A);

But I need you to share your opinions so far, I will write tests to cover all possible cases, but at least I need this to work with the simple test function written before.

What did I mess up while adding the attributor?
I think I need to change the whole thing, but this is my best for now!

@arsenm
Copy link
Contributor

arsenm commented Dec 23, 2024

@arsenm @shiltian @vidsinghal @jdoerfert @jhuber6 I am sorry but this is my first time with the attributor. I keep getting compilation errors like:

error: invalid new-expression of abstract class type ‘{anonymous}::AAConvertOutArgumentFunction’
13207 |     AA = new (A.Allocator) CLASS##SUFFIX(IRP, A);

You didn't implement a virtual method of the base class. If you look further along in the error message it should tell you which ones

@elhewaty elhewaty force-pushed the attributor-remove-arg-ptrs branch from 461834b to 4faf50a Compare December 23, 2024 10:08
@elhewaty
Copy link
Member Author

@arsenm @shiltian @vidsinghal @jdoerfert @jhuber6 I am sorry but this is my first time with the attributor. I keep getting compilation errors like:

error: invalid new-expression of abstract class type ‘{anonymous}::AAConvertOutArgumentFunction’
13207 |     AA = new (A.Allocator) CLASS##SUFFIX(IRP, A);

You didn't implement a virtual method of the base class. If you look further along in the error message it should tell you which ones

@arsenm, Thanks.
Would you take a look? how to use AAConvertOutArgumentFunction It's not changing the sample function in the tests?

@elhewaty
Copy link
Member Author

elhewaty commented Dec 24, 2024

@arsenm, Thanks. Would you take a look? how to use AAConvertOutArgumentFunction It's not changing the sample function in the tests?

@shiltian can you take a look?

const IRPosition &ArgPos = IRPosition::argument(Arg);
auto *AAMem = A.getAAFor<AAMemoryBehavior>(AA, ArgPos, DepClassTy::NONE);

return Arg.hasNoAliasAttr() && AAMem && AAMem->isKnownWriteOnly() &&
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be querying from AANoAlias instead of directly checking the attribute is present on the argument?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arsenm, in the following example does ptr %dts have any aliases in @foo? as AANoAlias->isKnownNoAlias() returns false.

define internal i1 @foo(ptr %dst) {
entry:
  store i32 42, ptr %dst
  ret i1 true
}


define i1 @fee(i32 %x, i32 %y) {
  %ptr = alloca i32
  %a = call i1 @foo(ptr %ptr, i32 %y)
  %b = load i32, ptr %ptr
  %c = icmp sle i32 %b, %x
  %xor = xor i1 %a, %c
  ret i1 %xor
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This callsite has the wrong type, so this test is UB and it probably doesn't bother trying to optimize it correctly

// Every function can track active assumptions.
getOrCreateAAFor<AAAssumptionInfo>(FPos);

// Every function can have out arguments.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only functions with pointer arguments, not sure if it's worth checking if there are any


bool hasCandidateArg = false;
for (const Argument &Arg : F->args())
if (Arg.getType()->isPointerTy() && isEligibleArgument(Arg, A, *this))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going to introduce isEligibleArgumentType, use it consistently. You also need to guard against sret arguments

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't hasPointeeInMemoryValueAttr() do this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes

Argument *Arg = F.getArg(argIdx);
if (isEligibleArgument(*Arg, A, *this)) {
CandidateArgs.push_back(Arg);
for (auto UseItr = Arg->use_begin(); UseItr != Arg->use_end(); ++UseItr) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

range loop, probably needs to be early_inc_range

SmallVector<Argument *, 4> CandidateArgs;
for (unsigned argIdx = 0; argIdx < F.arg_size(); ++argIdx) {
Argument *Arg = F.getArg(argIdx);
if (isEligibleArgument(*Arg, A, *this)) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Early continue and reduce indent

Comment on lines 13032 to 13034
// If there is no valid candidates then return false.
if (PtrToType.empty())
return ChangeStatus::UNCHANGED;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This shouldn't reach the manifest stage if it's not possible to do anything

Comment on lines +13038 to +13064
if (auto *OriginalFuncTy = F.getReturnType(); !OriginalFuncTy->isVoidTy())
OutStructElementsTypes.push_back(OriginalFuncTy);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the function already has a struct return, you could flatten and insert directly into the struct type. It may be a little more effort to emit though.

You also should check whether TargetLowering::CanLowerReturn succeeds, you should avoid over-committing return values if they're going to be forced to the stack. As written out arguments may be in any address space, not just allocas

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say we get the functionality in and then we add the heuristic on top. It should be easier to test the "maximal" approach first.


// Redirect all uses of the old call to the new call.
for (auto &Use : CB->uses())
Use.set(NewCall);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should assert that this is the callee use, and not a data operand or other type of non-call user

; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
; RUN: opt -S -passes=attributor < %s | FileCheck %s


Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need all of these tests, using AMDGPU

@elhewaty elhewaty force-pushed the attributor-remove-arg-ptrs branch from 4faf50a to 7e34621 Compare January 1, 2025 16:39
@elhewaty
Copy link
Member Author

elhewaty commented Jan 1, 2025

@arsenm, @jdoerfert @shiltian @vidsinghal

I run into this crash, any help?

opt: /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:168: llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::reference llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::operator*() const [with OptionsT = llvm::ilist_detail::node_options<llvm::BasicBlock, true, false, void, false, void>; bool IsReverse = false; bool IsConst = false; reference = llvm::BasicBlock&]: Assertion `!NodePtr->isKnownSentinel()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/mohamed/Desktop/open-source/llvm/build/bin/opt -S -passes=attributor
1.	Running pass "attributor" on module "<stdin>"
 #0 0x0000628857e7ad22 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x1a12d22)
 #1 0x0000628857e7799f llvm::sys::RunSignalHandlers() (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x1a0f99f)
 #2 0x0000628857e77ae5 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007383fd445320 (/lib/x86_64-linux-gnu/libc.so.6+0x45320)
 #4 0x00007383fd49eb1c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x00007383fd49eb1c __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #6 0x00007383fd49eb1c pthread_kill ./nptl/pthread_kill.c:89:10
 #7 0x00007383fd44526e raise ./signal/../sysdeps/posix/raise.c:27:6
 #8 0x00007383fd4288ff abort ./stdlib/abort.c:81:7
 #9 0x00007383fd42881b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#10 0x00007383fd43b507 (/lib/x86_64-linux-gnu/libc.so.6+0x3b507)
#11 0x0000628859ecb52a (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x3a6352a)
#12 0x0000628859f203a7 (anonymous namespace)::AAConvertOutArgumentFunction::manifest(llvm::Attributor&) AttributorAttributes.cpp:0:0
#13 0x0000628859ea74b1 llvm::Attributor::manifestAttributes() (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x3a3f4b1)
#14 0x0000628859ec3f39 llvm::Attributor::run() (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x3a5bf39)
#15 0x0000628859ec6237 runAttributorOnFunctions(llvm::InformationCache&, llvm::SetVector<llvm::Function*, llvm::SmallVector<llvm::Function*, 0u>, llvm::DenseSet<llvm::Function*, llvm::DenseMapInfo<llvm::Function*, void>>, 0u>&, llvm::AnalysisGetter&, llvm::CallGraphUpdater&, bool, bool) (.part.0) Attributor.cpp:0:0
#16 0x0000628859ec6dc0 llvm::AttributorPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x3a5edc0)
#17 0x0000628859ad5f55 llvm::detail::PassModel<llvm::Module, llvm::AttributorPass, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) PassBuilder.cpp:0:0
#18 0x000062885813601d llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x1cce01d)
#19 0x000062885995ff23 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x34f7f23)
#20 0x0000628857e53e0e optMain (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x19ebe0e)
#21 0x00007383fd42a1ca __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#22 0x00007383fd42a28b call_init ./csu/../csu/libc-start.c:128:20
#23 0x00007383fd42a28b __libc_start_main ./csu/../csu/libc-start.c:347:5
#24 0x0000628857e46365 _start (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x19de365)
Aborted (core dumped)
Traceback (most recent call last):
  File "/home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/utils/update_test_checks.py", line 370, in <module>
    main()
  File "/home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/utils/update_test_checks.py", line 174, in main
    raw_tool_output = common.invoke_tool(
                      ^^^^^^^^^^^^^^^^^^^
  File "/home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/utils/UpdateTestChecks/common.py", line 531, in invoke_tool
    stdout = subprocess.check_output(
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '~/Desktop/open-source/llvm/build/bin/opt -S -passes=attributor' returned non-zero exit status 134.

@vidsinghal
Copy link
Contributor

@arsenm, @jdoerfert @shiltian @vidsinghal

I run into this crash, any help?

opt: /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/include/llvm/ADT/ilist_iterator.h:168: llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::reference llvm::ilist_iterator<OptionsT, IsReverse, IsConst>::operator*() const [with OptionsT = llvm::ilist_detail::node_options<llvm::BasicBlock, true, false, void, false, void>; bool IsReverse = false; bool IsConst = false; reference = llvm::BasicBlock&]: Assertion `!NodePtr->isKnownSentinel()' failed.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: /home/mohamed/Desktop/open-source/llvm/build/bin/opt -S -passes=attributor
1.	Running pass "attributor" on module "<stdin>"
 #0 0x0000628857e7ad22 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x1a12d22)
 #1 0x0000628857e7799f llvm::sys::RunSignalHandlers() (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x1a0f99f)
 #2 0x0000628857e77ae5 SignalHandler(int) Signals.cpp:0:0
 #3 0x00007383fd445320 (/lib/x86_64-linux-gnu/libc.so.6+0x45320)
 #4 0x00007383fd49eb1c __pthread_kill_implementation ./nptl/pthread_kill.c:44:76
 #5 0x00007383fd49eb1c __pthread_kill_internal ./nptl/pthread_kill.c:78:10
 #6 0x00007383fd49eb1c pthread_kill ./nptl/pthread_kill.c:89:10
 #7 0x00007383fd44526e raise ./signal/../sysdeps/posix/raise.c:27:6
 #8 0x00007383fd4288ff abort ./stdlib/abort.c:81:7
 #9 0x00007383fd42881b _nl_load_domain ./intl/loadmsgcat.c:1177:9
#10 0x00007383fd43b507 (/lib/x86_64-linux-gnu/libc.so.6+0x3b507)
#11 0x0000628859ecb52a (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x3a6352a)
#12 0x0000628859f203a7 (anonymous namespace)::AAConvertOutArgumentFunction::manifest(llvm::Attributor&) AttributorAttributes.cpp:0:0
#13 0x0000628859ea74b1 llvm::Attributor::manifestAttributes() (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x3a3f4b1)
#14 0x0000628859ec3f39 llvm::Attributor::run() (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x3a5bf39)
#15 0x0000628859ec6237 runAttributorOnFunctions(llvm::InformationCache&, llvm::SetVector<llvm::Function*, llvm::SmallVector<llvm::Function*, 0u>, llvm::DenseSet<llvm::Function*, llvm::DenseMapInfo<llvm::Function*, void>>, 0u>&, llvm::AnalysisGetter&, llvm::CallGraphUpdater&, bool, bool) (.part.0) Attributor.cpp:0:0
#16 0x0000628859ec6dc0 llvm::AttributorPass::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x3a5edc0)
#17 0x0000628859ad5f55 llvm::detail::PassModel<llvm::Module, llvm::AttributorPass, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) PassBuilder.cpp:0:0
#18 0x000062885813601d llvm::PassManager<llvm::Module, llvm::AnalysisManager<llvm::Module>>::run(llvm::Module&, llvm::AnalysisManager<llvm::Module>&) (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x1cce01d)
#19 0x000062885995ff23 llvm::runPassPipeline(llvm::StringRef, llvm::Module&, llvm::TargetMachine*, llvm::TargetLibraryInfoImpl*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::ToolOutputFile*, llvm::StringRef, llvm::ArrayRef<llvm::PassPlugin>, llvm::ArrayRef<std::function<void (llvm::PassBuilder&)>>, llvm::opt_tool::OutputKind, llvm::opt_tool::VerifierKind, bool, bool, bool, bool, bool, bool, bool) (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x34f7f23)
#20 0x0000628857e53e0e optMain (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x19ebe0e)
#21 0x00007383fd42a1ca __libc_start_call_main ./csu/../sysdeps/nptl/libc_start_call_main.h:74:3
#22 0x00007383fd42a28b call_init ./csu/../csu/libc-start.c:128:20
#23 0x00007383fd42a28b __libc_start_main ./csu/../csu/libc-start.c:347:5
#24 0x0000628857e46365 _start (/home/mohamed/Desktop/open-source/llvm/build/bin/opt+0x19de365)
Aborted (core dumped)
Traceback (most recent call last):
  File "/home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/utils/update_test_checks.py", line 370, in <module>
    main()
  File "/home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/utils/update_test_checks.py", line 174, in main
    raw_tool_output = common.invoke_tool(
                      ^^^^^^^^^^^^^^^^^^^
  File "/home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/utils/UpdateTestChecks/common.py", line 531, in invoke_tool
    stdout = subprocess.check_output(
             ^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 466, in check_output
    return run(*popenargs, stdout=PIPE, timeout=timeout, check=True,
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/lib/python3.12/subprocess.py", line 571, in run
    raise CalledProcessError(retcode, process.args,
subprocess.CalledProcessError: Command '~/Desktop/open-source/llvm/build/bin/opt -S -passes=attributor' returned non-zero exit status 134.

It seems like the assertion Assertion `!NodePtr->isKnownSentinel()' failed.
It seems like there are no line numbers in your trace... Are you compiling with relWithDebug?
I usually get line numbers with just the Debug build, maybe you should try that.

@elhewaty
Copy link
Member Author

elhewaty commented Jan 4, 2025

@vidsinghal @arsenm,

What trying to do here, is creating a new variables instead of old argument args using allocas.

BasicBlock *EntryBlock = BasicBlock::Create(NewFunction->getContext(), "entry", NewFunction);
    IRBuilder<> EntryBuilder(EntryBlock);
    for (auto &OldArg : F.args()) {
      if (PtrToType.count(&OldArg)) {
        // The crash happens here, any hint?
        AllocaInst *Alloca = EntryBuilder.CreateAlloca(  // this is AttributorAttributes.cpp:13069:21
            PtrToType[&OldArg], nullptr, OldArg.getName());
        EntryBuilder.CreateStore(&OldArg, Alloca);
        VMap[&OldArg] = Alloca;
      } else
        VMap[&OldArg] = &(*NewArgIt++);
    }

here's the stacktrace

 #0 0x000057e6110124b1 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:723:11
 #1 0x000057e6110129ab PrintStackTraceSignalHandler(void*) /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:798:1
 #2 0x000057e6110109a6 llvm::sys::RunSignalHandlers() /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/Support/Signals.cpp:105:5
 #3 0x000057e611013145 SignalHandler(int) /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/Support/Unix/Signals.inc:413:1
 #4 0x0000741e19245320 (/lib/x86_64-linux-gnu/libc.so.6+0x45320)
 #5 0x000057e6111d8e6c llvm::SmallVectorTemplateCommon<llvm::DataLayout::PrimitiveSpec, void>::begin() const /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/include/llvm/ADT/SmallVector.h:268:63
 #6 0x000057e6111d8e4d decltype(begin(std::forward<llvm::SmallVector<llvm::DataLayout::PrimitiveSpec, 6u> const&>(fp))) llvm::adl_detail::begin_impl<llvm::SmallVector<llvm::DataLayout::PrimitiveSpec, 6u> const&>(llvm::SmallVector<llvm::DataLayout::PrimitiveSpec, 6u> const&) /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/include/llvm/ADT/ADL.h:29:3
 #7 0x000057e6111d8dd5 decltype(adl_detail::begin_impl(std::forward<llvm::SmallVector<llvm::DataLayout::PrimitiveSpec, 6u> const&>(fp))) llvm::adl_begin<llvm::SmallVector<llvm::DataLayout::PrimitiveSpec, 6u> const&>(llvm::SmallVector<llvm::DataLayout::PrimitiveSpec, 6u> const&) /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/include/llvm/ADT/ADL.h:80:3
 #8 0x000057e6111d4179 auto llvm::lower_bound<llvm::SmallVector<llvm::DataLayout::PrimitiveSpec, 6u> const&, unsigned int&, (anonymous namespace)::LessPrimitiveBitWidth>(llvm::SmallVector<llvm::DataLayout::PrimitiveSpec, 6u> const&, unsigned int&, (anonymous namespace)::LessPrimitiveBitWidth) /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/include/llvm/ADT/STLExtras.h:1985:27
 #9 0x000057e6111d40eb llvm::DataLayout::getIntegerAlignment(unsigned int, bool) const /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/IR/DataLayout.cpp:698:12
#10 0x000057e6111d47ee llvm::DataLayout::getAlignment(llvm::Type*, bool) const /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/IR/DataLayout.cpp:794:12
#11 0x000057e6111d4adf llvm::DataLayout::getPrefTypeAlign(llvm::Type*) const /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/IR/DataLayout.cpp:848:10
#12 0x000057e611580d1d llvm::IRBuilderBase::CreateAlloca(llvm::Type*, llvm::Value*, llvm::Twine const&) /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/include/llvm/IR/IRBuilder.h:1806:28
#13 0x000057e61329b3f4 (anonymous namespace)::AAConvertOutArgumentFunction::manifest(llvm::Attributor&) /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/Transforms/IPO/AttributorAttributes.cpp:13069:21
#14 0x000057e613233df8 llvm::Attributor::manifestAttributes() /home/mohamed/Desktop/open-source/llvm/llvm-project/llvm/lib/Transforms/IPO/Attributor.cpp:2292:18

@arsenm
Copy link
Contributor

arsenm commented Jan 6, 2025

    AllocaInst *Alloca = EntryBuilder.CreateAlloca(  // this is AttributorAttributes.cpp:13069:21
        PtrToType[&OldArg], nullptr, OldArg.getName());

You're not using the correct address space for the alloca. Regardless of that, I think the crash is probably because PtrToType[&OldArg] is nullptr

@elhewaty
Copy link
Member Author

elhewaty commented Jan 7, 2025

@arsenm I tried to loop over PtrToType and assert to check if it has nullptr values but it doesn't. I think maybe the address space that causes the problem, any hint?

The wrong address space shouldn't cause this crash

@elhewaty elhewaty force-pushed the attributor-remove-arg-ptrs branch from 7e34621 to e4be7f0 Compare January 26, 2025 18:05
@elhewaty elhewaty force-pushed the attributor-remove-arg-ptrs branch from e4be7f0 to 4791659 Compare February 19, 2025 22:22
@elhewaty
Copy link
Member Author

@jdoerfert
Here is the generated function, I think the code generates it perfectly for this case.
how to improve the callsite class?

define internal %foo_out @foo.converted(i32 %0, i32 %1) {
entry:
  %dst_ = alloca i32, align 4
  br label %entry1

entry1:                                           ; preds = %entry
  %x = xor i32 %0, 13
  %y = add i32 %1, 5
  %z = icmp sle i32 %x, %y
  br i1 %z, label %if, label %else

if:                                               ; preds = %entry1
  store i32 %x, ptr %dst_, align 4
  br label %end

else:                                             ; preds = %entry1
  store i32 %y, ptr %dst_, align 4
  br label %end

end:                                              ; preds = %else, %if
  %t = mul i32 %x, %y
  %tt = xor i32 %x, %y
  %result = icmp eq i32 %t, %tt
  %2 = load i32, ptr %dst_, align 4
  %3 = insertvalue %foo_out undef, i1 %result, 0
  %4 = insertvalue %foo_out %3, i32 %2, 1
  ret %foo_out %4
}

thanks.

Copy link
Member

@jdoerfert jdoerfert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments and pointers to APIs that you should look into/use. It might help to list the steps explicitly here:

  1. Update: Check for each argument if we can rewrite it:
  • NoAlias + AAPointerInfo only has stores and they are all at known offsets.
  • isValidFunctionSignature/ReturnRewrite succeeds, thus all call sites are known and can be rewritten (more on that below).
  1. Manifest:
  • Remove the arguments with the rewrite API.
  • Add a stack allocation into the caller with a size to allow all AAPointerInfo tracked accesses.
  • Register rewrites of all the store pointers in AAPointerInfo to the stack allocation.
  • Introduce loads of all the values, or an approximation thereof (e.g., one i64 instead of two adjacent i32's) at each return of the function.
  • Register rewrites of the return value to be a struct of the loaded values and the original return value, potentially flattened into one struct.
  • Register the rewrite of the return type based on the new return type. The rewrite callback has to introduce stores of the returned values back into the originally passed pointer and the offsets in AAPointerInfo. e.g.,
void foo(int *out_2_ints);

becomes

{int, int} R = foo();
out[0] = get<0>(R);
out[1] = get<1>(R);

if (!markedAsAAConvertArgument) {
getOrCreateAAFor<AAConvertOutArgument>(FPos);
markedAsAAConvertArgument = true;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should just be with the other FPos attributes before the argument loop.
You can put an if (F.arg_size()) around it or sth.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So every function with at least one argument is a candidate to have out argument?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. I don't know how aggressive the filters here should try to be, but it could more specifically be there is at least one pointer argument that is writable

/// ----------- AAConvertOutArgument ----------
namespace {
static bool isEligibleArgument(const Argument &Arg, Attributor &A,
const AbstractAttribute &AA) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make this a (static) member function, no need for the global namespace. If it's a regular member you don't need to pass AA.

auto *NoAlias = A.getAAFor<AANoAlias>(AA, ArgPos, DepClassTy::OPTIONAL);

return AAMem && NoAlias && AAMem->isAssumedWriteOnly() &&
NoAlias->isAssumedNoAlias() && !Arg.hasPointeeInMemoryValueAttr();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the hasPointeeInMemoryValueAttr stuff first, with the type above.

ChangeStatus updateImpl(Attributor &A) override {
const Function *F = getAssociatedFunction();
if (!F || F->isDeclaration())
return indicatePessimisticFixpoint();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this check, you checked in initialize.

AAConvertOutArgumentFunction(const IRPosition &IRP, Attributor &A)
: AAConvertOutArgument(IRP, A) {}

SmallVector<bool> ArgumentsStates;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Make it a bitvector/bitset. I think ADT has one of those as well.

for (unsigned ArgIdx = 0; ArgIdx < F.arg_size(); ++ArgIdx) {
Argument *Arg = F.getArg(ArgIdx);
if (!isEligibleArgument(*Arg, A, *this))
continue;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this check, just look into the AssumedState.

// AAPointerInfo on args
for (auto &Use : Arg->uses())
if (auto *Store = dyn_cast<StoreInst>(Use.getUser()))
PtrToType[Arg] = Store->getValueOperand()->getType();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to handle more than just stores users here. What if the user is a gep, or some other instruction. You can use AAPointerInfo instead of AAMemoryBehavior and then iterate over all the instructions that actually modify the pointer. It will also tell you the offsets, which you'll likely need.

Comment on lines +13038 to +13064
if (auto *OriginalFuncTy = F.getReturnType(); !OriginalFuncTy->isVoidTy())
OutStructElementsTypes.push_back(OriginalFuncTy);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd say we get the functionality in and then we add the heuristic on top. It should be easier to test the "maximal" approach first.

const Function *F = getAssociatedFunction();
if (!F || F->isDeclaration())
return indicatePessimisticFixpoint();

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since you have to rewrite all the call sites, you need to check if they are amenable. Use Attributor::checkForAllCallSites to do so, and verify (1) you see them all, and (2) you can properly rewrite them. For (2) I don't have all of the conditions at hand but setting up the check will allow us to easily fill them. In manifest you can use the same API to iterate over all call sites again, create new ones and new code that sets up the return values into the originally passed pointers.

FunctionType::get(ReturnStructType, NewParamTypes, F.isVarArg());
auto *NewFunction =
Function::Create(NewFunctionType, F.getLinkage(), F.getAddressSpace(),
F.getName() + ".converted");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out

Attributor::isValidFunctionSignatureRewrite

I believe it will even check the call sites for you so you don't have to do it yourself (basically ignore what I wrote above).

Attributor::registerFunctionSignatureRewrite 

And this will do the function cloning and other things for you.

You might need to add a new pair of APIs though:

Attributor::isValidFunctionReturnRewrite

and

Attributor::registerFunctionReturnRewrite

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants