[TableGen][CodeEmitter] Refactor addCodeToMergeInOperand (NFC) #158674

s-barannikov · 2025-09-15T16:26:12Z

Use streams to avoid dealing with std::string
Print operand masks in hex
Make the output more succinct

llvmbot · 2025-09-15T16:26:54Z

@llvm/pr-subscribers-tablegen

Author: Sergei Barannikov (s-barannikov)

Changes

Use streams to avoid dealing with std::string
Print operand masks in hex
Make the output more succinct

Full diff: https://github.com/llvm/llvm-project/pull/158674.diff

1 Files Affected:

(modified) llvm/utils/TableGen/CodeEmitterGen.cpp (+31-87)

diff --git a/llvm/utils/TableGen/CodeEmitterGen.cpp b/llvm/utils/TableGen/CodeEmitterGen.cpp
index 588d354d1d293..a61ba54d3ffd2 100644
--- a/llvm/utils/TableGen/CodeEmitterGen.cpp
+++ b/llvm/utils/TableGen/CodeEmitterGen.cpp
@@ -31,6 +31,7 @@
 #include "llvm/ADT/ArrayRef.h"
 #include "llvm/ADT/StringExtras.h"
 #include "llvm/Support/Casting.h"
+#include "llvm/Support/Format.h"
 #include "llvm/Support/FormatVariadic.h"
 #include "llvm/Support/raw_ostream.h"
 #include "llvm/TableGen/Error.h"
@@ -139,58 +140,28 @@ bool CodeEmitterGen::addCodeToMergeInOperand(const Record *R,
   StringRef EncoderMethodName =
       CGI.Operands[SO.first].EncoderMethodNames[SO.second];
 
-  if (UseAPInt)
-    Case += "      op.clearAllBits();\n";
+  raw_string_ostream OS(Case);
+  indent Indent(6);
+
+  OS << Indent << "// op: " << VarName << '\n';
 
-  Case += "      // op: " + VarName + "\n";
+  if (UseAPInt)
+    OS << Indent << "op.clearAllBits();\n";
 
-  // If the source operand has a custom encoder, use it.
   if (!EncoderMethodName.empty()) {
-    raw_string_ostream CaseOS(Case);
-    CaseOS << indent(6);
     if (UseAPInt)
-      CaseOS << EncoderMethodName << "(MI, " << OpIdx << ", op";
+      OS << Indent << EncoderMethodName << "(MI, " << OpIdx
+         << ", op, Fixups, STI);\n";
     else
-      CaseOS << "op = " << EncoderMethodName << "(MI, " << OpIdx;
-    CaseOS << ", Fixups, STI);\n";
+      OS << Indent << "op = " << EncoderMethodName << "(MI, " << OpIdx
+         << ", Fixups, STI);\n";
   } else {
-    if (UseAPInt) {
-      Case +=
-          "      getMachineOpValue(MI, MI.getOperand(" + utostr(OpIdx) + ")";
-      Case += ", op, Fixups, STI";
-    } else {
-      Case += "      op = getMachineOpValue(MI, MI.getOperand(" +
-              utostr(OpIdx) + ")";
-      Case += ", Fixups, STI";
-    }
-    Case += ");\n";
-  }
-
-  // Precalculate the number of lits this variable contributes to in the
-  // operand. If there is a single lit (consecutive range of bits) we can use a
-  // destructive sequence on APInt that reduces memory allocations.
-  int NumOperandLits = 0;
-  for (int TmpBit = Bit; TmpBit >= 0;) {
-    int VarBit = getVariableBit(VarName, BI, TmpBit);
-
-    // If this bit isn't from a variable, skip it.
-    if (VarBit == -1) {
-      --TmpBit;
-      continue;
-    }
-
-    // Figure out the consecutive range of bits covered by this operand, in
-    // order to generate better encoding code.
-    int BeginVarBit = VarBit;
-    int N = 1;
-    for (--TmpBit; TmpBit >= 0;) {
-      VarBit = getVariableBit(VarName, BI, TmpBit);
-      if (VarBit == -1 || VarBit != (BeginVarBit - N))
-        break;
-      ++N;
-      --TmpBit;
-    }
-    ++NumOperandLits;
+    if (UseAPInt)
+      OS << Indent << "getMachineOpValue(MI, MI.getOperand(" << OpIdx
+         << "), op, Fixups, STI);\n";
+    else
+      OS << Indent << "op = getMachineOpValue(MI, MI.getOperand(" << OpIdx
+         << "), Fixups, STI);\n";
   }
 
   unsigned BitOffset = -1;
@@ -216,52 +187,25 @@ bool CodeEmitterGen::addCodeToMergeInOperand(const Record *R,
       --Bit;
     }
 
-    std::string MaskStr;
-    int OpShift;
-
     unsigned LoBit = BeginVarBit - N + 1;
-    unsigned HiBit = LoBit + N;
     unsigned LoInstBit = BeginInstBit - N + 1;
     BitOffset = LoInstBit;
     if (UseAPInt) {
-      std::string ExtractStr;
-      if (N >= 64) {
-        ExtractStr = "op.extractBits(" + itostr(HiBit - LoBit) + ", " +
-                     itostr(LoBit) + ")";
-        Case += "      Value.insertBits(" + ExtractStr + ", " +
-                itostr(LoInstBit) + ");\n";
-      } else {
-        ExtractStr = "op.extractBitsAsZExtValue(" + itostr(HiBit - LoBit) +
-                     ", " + itostr(LoBit) + ")";
-        Case += "      Value.insertBits(" + ExtractStr + ", " +
-                itostr(LoInstBit) + ", " + itostr(HiBit - LoBit) + ");\n";
-      }
+      if (N > 64)
+        OS << Indent << "Value.insertBits(op.extractBits(" << N << ", " << LoBit
+           << "), " << LoInstBit << ");\n";
+      else
+        OS << Indent << "Value.insertBits(op.extractBitsAsZExtValue(" << N
+           << ", " << LoBit << "), " << LoInstBit << ", " << N << ");\n";
     } else {
-      uint64_t OpMask = ~(uint64_t)0 >> (64 - N);
-      OpShift = BeginVarBit - N + 1;
-      OpMask <<= OpShift;
-      MaskStr = "UINT64_C(" + utostr(OpMask) + ")";
-      OpShift = BeginInstBit - BeginVarBit;
-
-      if (NumOperandLits == 1) {
-        Case += "      op &= " + MaskStr + ";\n";
-        if (OpShift > 0) {
-          Case += "      op <<= " + itostr(OpShift) + ";\n";
-        } else if (OpShift < 0) {
-          Case += "      op >>= " + itostr(-OpShift) + ";\n";
-        }
-        Case += "      Value |= op;\n";
-      } else {
-        if (OpShift > 0) {
-          Case += "      Value |= (op & " + MaskStr + ") << " +
-                  itostr(OpShift) + ";\n";
-        } else if (OpShift < 0) {
-          Case += "      Value |= (op & " + MaskStr + ") >> " +
-                  itostr(-OpShift) + ";\n";
-        } else {
-          Case += "      Value |= (op & " + MaskStr + ");\n";
-        }
-      }
+      uint64_t OpMask = maskTrailingOnes<uint64_t>(N) << LoBit;
+      OS << Indent << "Value |= (op & " << format_hex(OpMask, 0) << ')';
+      int OpShift = BeginInstBit - BeginVarBit;
+      if (OpShift > 0)
+        OS << " << " << OpShift;
+      else if (OpShift < 0)
+        OS << " >> " << -OpShift;
+      OS << ";\n";
     }
   }
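
To make the non-APInt branch above concrete, here is a minimal standalone sketch of the statement it emits for one run of consecutive bits. This is not the TableGen code: `trailingOnes` and `emitMerge` are hypothetical stand-ins for `maskTrailingOnes<uint64_t>` and the stream logic, and `std::ostringstream` replaces `raw_string_ostream` and `format_hex`, assuming `format_hex(OpMask, 0)` prints a `0x`-prefixed lowercase value.

```cpp
#include <cassert>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical stand-in for llvm::maskTrailingOnes<uint64_t>(N):
// returns a value with the N lowest bits set, for N in [0, 64].
static uint64_t trailingOnes(unsigned N) {
  return N == 0 ? 0 : ~uint64_t(0) >> (64 - N);
}

// Hypothetical helper mirroring the non-APInt branch of the patch.
// BeginVarBit / BeginInstBit are the highest operand / instruction bits of a
// run of N consecutive bits; the result is the generated C++ statement.
static std::string emitMerge(int BeginVarBit, int BeginInstBit, unsigned N) {
  unsigned LoBit = BeginVarBit - N + 1;        // lowest operand bit of the run
  uint64_t OpMask = trailingOnes(N) << LoBit;  // selects operand bits of the run
  int OpShift = BeginInstBit - BeginVarBit;    // operand -> instruction offset

  std::ostringstream OS;
  OS << "      Value |= (op & 0x" << std::hex << OpMask << std::dec << ')';
  if (OpShift > 0)
    OS << " << " << OpShift;
  else if (OpShift < 0)
    OS << " >> " << -OpShift;
  OS << ";\n";
  return OS.str();
}
```

Under these assumptions, operand bits 3..0 placed at instruction bits 15..12 give `Value |= (op & 0xf) << 12;`, and a zero shift emits only the masked OR.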

s-barannikov · 2025-09-15T16:27:46Z

Sample output diff:

jurahul · 2025-09-16T11:43:21Z

llvm/utils/TableGen/CodeEmitterGen.cpp

-        }
-      }
+      uint64_t OpMask = maskTrailingOnes<uint64_t>(N) << LoBit;
+      OS << Indent << "Value |= (op & " << format_hex(OpMask, 0) << ')';


Q: If OpMask is wider than 32 bits, will the lack of UINT64_C cause issues (i.e., a warning and/or the upper bits being lost)?

I guess not, since op is uint64_t here, so it will get interpreted as a uint64_t.

s-barannikov · 2025-09-16

Right, the usual promotion rules apply; there should be no warnings or lost bits.
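
The point in this exchange can be checked with a small standalone sketch. It assumes, as in the generated code, that `op` is a `uint64_t`; the functions `mergeHigh` and `mergeShifted` and the mask values are hypothetical examples, not output of the emitter. An unsuffixed hexadecimal literal takes the first standard integer type that can represent its value, so a mask wider than 32 bits is not truncated, and the `&` with a `uint64_t` operand is evaluated in 64 bits.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// The literal keeps its full value even without UINT64_C.
static_assert(0xF00000000 > 0xFFFFFFFF, "wide hex literal is not truncated");

// Mixing uint64_t with a narrow literal converts the literal, not the operand.
static_assert(std::is_same<decltype(uint64_t{1} & 0xFF), uint64_t>::value,
              "the result of the mask has type uint64_t");

// Hypothetical generated-style expression with a mask above bit 31.
constexpr uint64_t mergeHigh(uint64_t op) {
  return (op & 0xF00000000) >> 4;
}

// Hypothetical generated-style expression with a small mask and a large left
// shift; the shift happens on the uint64_t result of the mask, not on an int.
constexpr uint64_t mergeShifted(uint64_t op) {
  return (op & 0xF) << 40;
}
```

Both functions keep the upper bits: the mask result is already 64 bits wide before any shift is applied.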

[TableGen][CodeEmitter] Refactor addCodeToMergeInOperand (NFC)

1007b93

s-barannikov requested a review from jurahul September 15, 2025 16:26

llvmbot added the tablegen label Sep 15, 2025

Update tests

de36c91

jurahul reviewed Sep 16, 2025

jurahul approved these changes Sep 16, 2025

s-barannikov merged commit 3c7c892 into llvm:main Sep 16, 2025
9 checks passed

s-barannikov deleted the tablegen/code-emitter/stream branch September 16, 2025 13:04
