[llvm][CodeExtractor] fix bug in parameter naming #114237

tblah · 2024-10-30T14:19:14Z

The code extractor tries to apply the names of source input and output values to function arguments. Not all input and output values get added as arguments: some are instead placed inside of a struct passed to the function. The existing renaming code skipped trying to set these struct-packed arguments names (as there is no corresponding function argument to rename), but it still incremented the iterator over the function arguments. This could result in dereferencing an end iterator if struct-packed inputs/outputs preceded non-struct-packed inputs/outputs.

This patch rewrites this loop to avoid the end iterator dereference.

The code extractor tries to apply the names of source input and output values to function arguments. Not all input and output values get added as arguments: some are instead placed inside of a struct passed to the function. The existing renaming code skipped trying to set these struct-packed arguments names (as there is no corresponding function argument to rename), but it still incremented the iterator over the function arguments. This could result in dereferencing an end iterator if struct-packed inputs/outputs preceded non-struct-packed inputs/outputs. This patch rewrites this loop to avoid the end iterator dereference.

llvmbot · 2024-10-30T14:19:48Z

@llvm/pr-subscribers-llvm-transforms

Author: Tom Eccles (tblah)

Changes

The code extractor tries to apply the names of source input and output values to function arguments. Not all input and output values get added as arguments: some are instead placed inside of a struct passed to the function. The existing renaming code skipped trying to set these struct-packed arguments names (as there is no corresponding function argument to rename), but it still incremented the iterator over the function arguments. This could result in dereferencing an end iterator if struct-packed inputs/outputs preceded non-struct-packed inputs/outputs.

This patch rewrites this loop to avoid the end iterator dereference.

I haven't contributed to llvm-project/llvm before and wasn't sure how best to test this. I came across the bug while working on the OpenMP MLIR dialect's lowering to LLVM IR

Full diff: https://github.com/llvm/llvm-project/pull/114237.diff

1 Files Affected:

(modified) llvm/lib/Transforms/Utils/CodeExtractor.cpp (+13-11)

diff --git a/llvm/lib/Transforms/Utils/CodeExtractor.cpp b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
index ed4ad15e5ab695..fa467cc72bd020 100644
--- a/llvm/lib/Transforms/Utils/CodeExtractor.cpp
+++ b/llvm/lib/Transforms/Utils/CodeExtractor.cpp
@@ -823,17 +823,22 @@ Function *CodeExtractor::constructFunction(const ValueSet &inputs,
 
   std::vector<Type *> ParamTy;
   std::vector<Type *> AggParamTy;
+  std::vector<std::tuple<unsigned, Value *>> NumberedInputs;
+  std::vector<std::tuple<unsigned, Value *>> NumberedOutputs;
   ValueSet StructValues;
   const DataLayout &DL = M->getDataLayout();
 
   // Add the types of the input values to the function's argument list
+  unsigned ArgNum = 0;
   for (Value *value : inputs) {
     LLVM_DEBUG(dbgs() << "value used in func: " << *value << "\n");
     if (AggregateArgs && !ExcludeArgsFromAggregate.contains(value)) {
       AggParamTy.push_back(value->getType());
       StructValues.insert(value);
-    } else
+    } else {
       ParamTy.push_back(value->getType());
+      NumberedInputs.emplace_back(ArgNum++, value);
+    }
   }
 
   // Add the types of the output values to the function's argument list.
@@ -842,9 +847,11 @@ Function *CodeExtractor::constructFunction(const ValueSet &inputs,
     if (AggregateArgs && !ExcludeArgsFromAggregate.contains(output)) {
       AggParamTy.push_back(output->getType());
       StructValues.insert(output);
-    } else
+    } else {
       ParamTy.push_back(
           PointerType::get(output->getType(), DL.getAllocaAddrSpace()));
+      NumberedOutputs.emplace_back(ArgNum++, output);
+    }
   }
 
   assert(
@@ -1053,15 +1060,10 @@ Function *CodeExtractor::constructFunction(const ValueSet &inputs,
   }
 
   // Set names for input and output arguments.
-  if (NumScalarParams) {
-    ScalarAI = newFunction->arg_begin();
-    for (unsigned i = 0, e = inputs.size(); i != e; ++i, ++ScalarAI)
-      if (!StructValues.contains(inputs[i]))
-        ScalarAI->setName(inputs[i]->getName());
-    for (unsigned i = 0, e = outputs.size(); i != e; ++i, ++ScalarAI)
-      if (!StructValues.contains(outputs[i]))
-        ScalarAI->setName(outputs[i]->getName() + ".out");
-  }
+  for (auto [i, argVal] : NumberedInputs)
+    newFunction->getArg(i)->setName(argVal->getName());
+  for (auto [i, argVal] : NumberedOutputs)
+    newFunction->getArg(i)->setName(argVal->getName() + ".out");
 
   // Rewrite branches to basic blocks outside of the loop to new dummy blocks
   // within the new function. This must be done before we lose track of which

arsenm · 2024-10-30T17:20:10Z

The other tests in test/Transforms/CodeExtractor/ look like they all use partial-inliner, check some of those for an example?

Meinersbur · 2024-10-31T16:04:25Z

I think I have a more comprehensive fix here that does not need to introduce new maps to workaround bugs: #114419 Would you review it? I'd rebase it in that case.

A patch that fixes a bug is not NFC. The test I created is here: https://github.com/llvm/llvm-project/pull/114419/files#diff-84eabd1acf3e19fc7209c39a646ae0165b78aedbe0f13617db319184e00e11f2

tblah · 2024-10-31T21:33:15Z

I think I have a more comprehensive fix here that does not need to introduce new maps to workaround bugs: #114419 Would you review it? I'd rebase it in that case.

Thanks for posting your solution. Your patch does pass my regression test. Can we continue with my patch to get a temporary fix in until you can finish your patch, or would you prefer I close this PR?

Meinersbur

LGTM

Can I get an answer to my question whether you are going to review #114419? I am asking out of experience, several people abandoning a final review before, including the author(s) of the bug(s). It's frustrating for all the work put into it.

tblah · 2024-11-01T10:16:40Z

Can I get an answer to my question whether you are going to review #114419? I am asking out of experience, several people abandoning a final review before, including the author(s) of the bug(s). It's frustrating for all the work put into it.

I will do my best but I think we also need at least one reviewer with some experience in this area.

pawosm-arm · 2024-11-01T11:04:30Z

Can I get an answer to my question whether you are going to review #114419? I am asking out of experience, several people abandoning a final review before, including the author(s) of the bug(s). It's frustrating for all the work put into it.

I will do my best but I think we also need at least one reviewer with some experience in this area.

Finding reviewers for that one would go easier if the PR message could be updated slightly. I had to visit several Phabricator tickets and I'm still not sure what is the proposal there and what is the extension of other proposal (and what was the fate of the original proposal). Many potential candidates would scratch their heads several times before deciding to participate.

The code extractor tries to apply the names of source input and output values to function arguments. Not all input and output values get added as arguments: some are instead placed inside of a struct passed to the function. The existing renaming code skipped trying to set these struct-packed arguments names (as there is no corresponding function argument to rename), but it still incremented the iterator over the function arguments. This could result in dereferencing an end iterator if struct-packed inputs/outputs preceded non-struct-packed inputs/outputs. This patch rewrites this loop to avoid the end iterator dereference.

tblah requested review from arsenm, ggeorgakoudis, kiranchandramohan and pawosm-arm October 30, 2024 14:19

llvmbot added the llvm:transforms label Oct 30, 2024

tblah changed the title ~~[llvm][CodeExtractor] fix bug in parameter naming~~ [llvm][CodeExtractor][NFC] fix bug in parameter naming Oct 30, 2024

kiranchandramohan requested review from Meinersbur and jdoerfert October 31, 2024 13:22

Add test

9073212

tblah changed the title ~~[llvm][CodeExtractor][NFC] fix bug in parameter naming~~ [llvm][CodeExtractor] fix bug in parameter naming Oct 31, 2024

Meinersbur approved these changes Nov 1, 2024

View reviewed changes

tblah merged commit 4aaa925 into llvm:main Nov 4, 2024
8 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[llvm][CodeExtractor] fix bug in parameter naming #114237

[llvm][CodeExtractor] fix bug in parameter naming #114237

Uh oh!

tblah commented Oct 30, 2024 •

edited

Loading

Uh oh!

llvmbot commented Oct 30, 2024

Uh oh!

arsenm commented Oct 30, 2024

Uh oh!

Meinersbur commented Oct 31, 2024 •

edited

Loading

Uh oh!

tblah commented Oct 31, 2024

Uh oh!

Meinersbur left a comment

Uh oh!

tblah commented Nov 1, 2024

Uh oh!

pawosm-arm commented Nov 1, 2024 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

[llvm][CodeExtractor] fix bug in parameter naming #114237

[llvm][CodeExtractor] fix bug in parameter naming #114237

Uh oh!

Conversation

tblah commented Oct 30, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

llvmbot commented Oct 30, 2024

Uh oh!

arsenm commented Oct 30, 2024

Uh oh!

Meinersbur commented Oct 31, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

tblah commented Oct 31, 2024

Uh oh!

Meinersbur left a comment

Choose a reason for hiding this comment

Uh oh!

tblah commented Nov 1, 2024

Uh oh!

pawosm-arm commented Nov 1, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

5 participants

tblah commented Oct 30, 2024 •

edited

Loading

Meinersbur commented Oct 31, 2024 •

edited

Loading

pawosm-arm commented Nov 1, 2024 •

edited

Loading