Skip to content

Conversation

artagnon
Copy link
Contributor

@artagnon artagnon commented Oct 1, 2025

No description provided.

@llvmbot
Copy link
Member

llvmbot commented Oct 1, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Ramkumar Ramachandra (artagnon)

Changes

Full diff: https://github.com/llvm/llvm-project/pull/161509.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp (+1-1)
  • (added) llvm/test/Transforms/LoopIdiom/cyclic-redundancy-check-dl.ll (+50)
diff --git a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
index 0874b29ab7d22..b73d2efb9d318 100644
--- a/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
+++ b/llvm/lib/Transforms/Scalar/LoopIdiomRecognize.cpp
@@ -1654,7 +1654,7 @@ bool LoopIdiomRecognize::optimizeCRCLoop(const PolynomialInfo &Info) {
                                     : LoByte(Builder, Indexer, "indexer.lo");
 
     // Always index into a GEP using the index type.
-    Indexer = Builder.CreateZExt(
+    Indexer = Builder.CreateZExtOrTrunc(
         Indexer, SE->getDataLayout().getIndexType(GV->getType()),
         "indexer.ext");
 
diff --git a/llvm/test/Transforms/LoopIdiom/cyclic-redundancy-check-dl.ll b/llvm/test/Transforms/LoopIdiom/cyclic-redundancy-check-dl.ll
new file mode 100644
index 0000000000000..691523c5107e5
--- /dev/null
+++ b/llvm/test/Transforms/LoopIdiom/cyclic-redundancy-check-dl.ll
@@ -0,0 +1,50 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --check-globals all --version 6
+; RUN: opt -passes=loop-idiom -S %s | FileCheck %s
+
+target datalayout = "p:16:16"
+
+;.
+; CHECK: @.crctable = private constant [256 x i32] zeroinitializer
+;.
+define void @test_with_dl() {
+; CHECK-LABEL: define void @test_with_dl() {
+; CHECK-NEXT:  [[ENTRY:.*]]:
+; CHECK-NEXT:    br label %[[PH:.*]]
+; CHECK:       [[PH_LOOPEXIT:.*]]:
+; CHECK-NEXT:    [[CRC_NEXT_LCSSA:%.*]] = phi i32 [ [[CRC_NEXT3:%.*]], %[[LOOP:.*]] ]
+; CHECK-NEXT:    br label %[[PH]]
+; CHECK:       [[PH]]:
+; CHECK-NEXT:    [[CRC_USE:%.*]] = phi i32 [ 0, %[[ENTRY]] ], [ [[CRC_NEXT_LCSSA]], %[[PH_LOOPEXIT]] ]
+; CHECK-NEXT:    br label %[[LOOP]]
+; CHECK:       [[LOOP]]:
+; CHECK-NEXT:    [[IV:%.*]] = phi i16 [ 0, %[[PH]] ], [ [[IV_NEXT:%.*]], %[[LOOP]] ]
+; CHECK-NEXT:    [[CRC2:%.*]] = phi i32 [ 0, %[[PH]] ], [ [[CRC_NEXT3]], %[[LOOP]] ]
+; CHECK-NEXT:    [[INDEXER_LO:%.*]] = and i32 [[CRC2]], 255
+; CHECK-NEXT:    [[INDEXER_EXT:%.*]] = trunc i32 [[INDEXER_LO]] to i16
+; CHECK-NEXT:    [[TBL_PTRADD:%.*]] = getelementptr inbounds i32, ptr @.crctable, i16 [[INDEXER_EXT]]
+; CHECK-NEXT:    [[TBL_LD:%.*]] = load i32, ptr [[TBL_PTRADD]], align 4
+; CHECK-NEXT:    [[CRC_LE_SHIFT:%.*]] = lshr i32 [[CRC2]], 8
+; CHECK-NEXT:    [[CRC_NEXT3]] = xor i32 [[CRC_LE_SHIFT]], [[TBL_LD]]
+; CHECK-NEXT:    [[IV_NEXT]] = add i16 [[IV]], 1
+; CHECK-NEXT:    [[EXIT_COND1:%.*]] = icmp ne i16 [[IV]], 0
+; CHECK-NEXT:    br i1 [[EXIT_COND1]], label %[[LOOP]], label %[[PH_LOOPEXIT]]
+;
+entry:
+  br label %ph
+
+ph:
+  %crc.use = phi i32 [ 0, %entry ], [ %crc.next, %loop ]
+  br label %loop
+
+loop:
+  %iv = phi i16 [ 0, %ph ], [ %iv.next, %loop ]
+  %crc = phi i32 [ 0, %ph ], [ %crc.next, %loop ]
+  %lshr.crc.1 = lshr i32 %crc, 1
+  %crc.and.1 = and i32 %crc, 1
+  %sb.check = icmp eq i32 %crc.and.1, 0
+  %xor = xor i32 %lshr.crc.1, 0
+  %crc.next = select i1 %sb.check, i32 %lshr.crc.1, i32 %xor
+  %iv.next = add i16 %iv, 1
+  %exit.cond = icmp ult i16 %iv, 7
+  br i1 %exit.cond, label %loop, label %ph
+}

@mikaelholmen
Copy link
Collaborator

Thanks for the quick fix, this does solve the problem I saw.

Copy link
Contributor

@pfusik pfusik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@artagnon artagnon merged commit bb16c56 into llvm:main Oct 1, 2025
9 checks passed
@artagnon artagnon deleted the lir-crc-dl-crash branch October 1, 2025 16:14
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Oct 3, 2025
@Guohua-W
Copy link

Guohua-W commented Oct 9, 2025

Hello, I have a question: Can HashRecognize recognize both byte-level and bit-level CRC computations? If I have designed dedicated hardware instructions for the RISC-V architecture, how can I leverage the results from HashRecognize to optimize the code by replacing software CRC computations with these hardware instructions?

@artagnon
Copy link
Contributor Author

artagnon commented Oct 9, 2025

Hello, I have a question: Can HashRecognize recognize both byte-level and bit-level CRC computations? If I have designed dedicated hardware instructions for the RISC-V architecture, how can I leverage the results from HashRecognize to optimize the code by replacing software CRC computations with these hardware instructions?

Currently, HashRecognize recognizes bit-level CRC computations and LoopIdiom optimizes it using a byte-level table-lookup. I think the pending work to enable the use of hardware instructions is to introduce a generic llvm.clmul, but nobody is working on it at the moment. Kindly feel free to pick up the task, if you're so inclined.

@Guohua-W
Copy link

Guohua-W commented Oct 9, 2025

Hello, I have a question: Can HashRecognize recognize both byte-level and bit-level CRC computations? If I have designed dedicated hardware instructions for the RISC-V architecture, how can I leverage the results from HashRecognize to optimize the code by replacing software CRC computations with these hardware instructions?

Currently, HashRecognize recognizes bit-level CRC computations and LoopIdiom optimizes it using a byte-level table-lookup. I think the pending work to enable the use of hardware instructions is to introduce a generic llvm.clmul, but nobody is working on it at the moment. Kindly feel free to pick up the task, if you're so inclined.

Thank you for your response. You've done a fantastic job, and your answers have truly inspired me. I'm relatively new to LLVM, and my current goal is to recognize CRC loops and optimize them using my custom instructions. For example, I've already implemented a hardware instruction like CRC32C.B rd, rs1, rs2 in hardware. I can invoke this instruction through an intrinsic function, such as @llvm.riscv.crc32.b. I'd like to leverage the results from HashRecognize to achieve this optimization. Could you please give me some practical advice on how to proceed?

@artagnon
Copy link
Contributor Author

artagnon commented Oct 9, 2025

Hello, I have a question: Can HashRecognize recognize both byte-level and bit-level CRC computations? If I have designed dedicated hardware instructions for the RISC-V architecture, how can I leverage the results from HashRecognize to optimize the code by replacing software CRC computations with these hardware instructions?

Currently, HashRecognize recognizes bit-level CRC computations and LoopIdiom optimizes it using a byte-level table-lookup. I think the pending work to enable the use of hardware instructions is to introduce a generic llvm.clmul, but nobody is working on it at the moment. Kindly feel free to pick up the task, if you're so inclined.

Thank you for your response. You've done a fantastic job, and your answers have truly inspired me. I'm relatively new to LLVM, and my current goal is to recognize CRC loops and optimize them using my custom instructions. For example, I've already implemented a hardware instruction like CRC32C.B rd, rs1, rs2 in hardware. I can invoke this instruction through an intrinsic function, such as @llvm.riscv.crc32.b. I'd like to leverage the results from HashRecognize to achieve this optimization. Could you please give me some practical advice on how to proceed?

Oh, it's quite simple then. Create a target-hook to query if there's a valid lowering for llvm.riscv.crc32.b (for your custom hardware, yes), and modify optimizeCRCLoop with a query for this hook: if so, simply take Info.LHS, Info.LHSAux and Info.RHS, feed them into llvm.riscv.crc32.b, delete the existing loop, and skip the existing code? You don't need to call genSarwateTable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants