Skip to content

Conversation

@jurahul
Copy link
Contributor

@jurahul jurahul commented Oct 24, 2024

Change Intrinsic::ID enum values to encode an 8-bit target index in upper 16-bits and a 16-bit intrinsic index (within the target) in lower 16-bits.

This change is in preparation for being able to disable intrinsics for targets that are not enabled.

@jurahul jurahul force-pushed the intrinsic_id_values_tgt_idx_intr_idx branch from e5067a4 to 565c8ca Compare October 24, 2024 17:32
@jurahul jurahul marked this pull request as ready for review October 24, 2024 19:57
@jurahul jurahul requested a review from nikic as a code owner October 24, 2024 19:57
@jurahul jurahul requested review from RKSimon, arsenm and rnk October 24, 2024 19:58
@llvmbot
Copy link
Member

llvmbot commented Oct 24, 2024

@llvm/pr-subscribers-llvm-selectiondag
@llvm/pr-subscribers-tablegen
@llvm/pr-subscribers-llvm-support
@llvm/pr-subscribers-llvm-ir

@llvm/pr-subscribers-backend-x86

Author: Rahul Joshi (jurahul)

Changes

Change Intrinsic::ID enum values to encode an 8-bit target index in upper 16-bits and a 16-bit intrinsic index (within the target) in lower 16-bits.

This change is in preparation for being able to disable intrinsics for targets that are not enabled.


Patch is 64.78 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/113576.diff

30 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h (+2-2)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h (+2-2)
  • (modified) llvm/include/llvm/IR/Intrinsics.h (+9)
  • (added) llvm/include/llvm/Support/IntrinsicID.h (+59)
  • (modified) llvm/lib/CodeGen/MachineOperand.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/MachineVerifier.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+1-1)
  • (modified) llvm/lib/IR/Core.cpp (+2-1)
  • (modified) llvm/lib/IR/Intrinsics.cpp (+95-22)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/X86/X86IntrinsicsInfo.h (+1-2)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td (+7-7)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-SDNodeXForm-timm.td (+2-2)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-immarg-literal-pattern.td (+2-2)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-input-discard.td (+1-1)
  • (modified) llvm/test/TableGen/GlobalISelEmitter.td (+4-4)
  • (modified) llvm/test/TableGen/GlobalISelEmitterOverloadedPtr.td (+1-1)
  • (modified) llvm/test/TableGen/immarg-predicated.td (+1-1)
  • (modified) llvm/test/TableGen/immarg.td (+1-1)
  • (modified) llvm/test/TableGen/intrinsic-overload-conflict.td (+1-1)
  • (modified) llvm/test/TableGen/intrinsic-struct.td (+1-1)
  • (modified) llvm/test/TableGen/predicate-patfags.td (+2-2)
  • (modified) llvm/unittests/IR/VPIntrinsicTest.cpp (+8-7)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.cpp (+113-46)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.h (+19-14)
  • (modified) llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp (+1-1)
  • (modified) llvm/utils/TableGen/Common/CodeGenDAGPatterns.h (+3-11)
  • (modified) llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp (+2-2)
  • (modified) llvm/utils/TableGen/IntrinsicEmitter.cpp (+103-81)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
index 7b42722ca8d4f1..383ce96813d84d 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
@@ -275,7 +275,7 @@ enum {
   /// Check the operand is a specific intrinsic ID
   /// - InsnID(ULEB128) - Instruction ID
   /// - OpIdx(ULEB128) - Operand index
-  /// - IID(2) - Expected Intrinsic ID
+  /// - IID(4) - Expected Intrinsic ID
   GIM_CheckIntrinsicID,
 
   /// Check the operand is a specific predicate
@@ -411,7 +411,7 @@ enum {
 
   /// Adds an intrinsic ID to the specified instruction.
   /// - InsnID(ULEB128) - Instruction ID to modify
-  /// - IID(2) - Intrinsic ID
+  /// - IID(4) - Intrinsic ID
   GIR_AddIntrinsicID,
 
   /// Marks the implicit def of a register as dead.
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
index 9f8eb030a96c64..775998fe53a7be 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
@@ -889,7 +889,7 @@ bool GIMatchTableExecutor::executeMatchTable(
     case GIM_CheckIntrinsicID: {
       uint64_t InsnID = readULEB();
       uint64_t OpIdx = readULEB();
-      uint16_t Value = readU16();
+      uint32_t Value = readU32();
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
                       dbgs() << CurrentIdx << ": GIM_CheckIntrinsicID(MIs["
                              << InsnID << "]->getOperand(" << OpIdx
@@ -1185,7 +1185,7 @@ bool GIMatchTableExecutor::executeMatchTable(
     }
     case GIR_AddIntrinsicID: {
       uint64_t InsnID = readULEB();
-      uint16_t Value = readU16();
+      uint32_t Value = readU32();
       assert(OutMIs[InsnID] && "Attempted to add to undefined instruction");
       OutMIs[InsnID].addIntrinsicID((Intrinsic::ID)Value);
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
diff --git a/llvm/include/llvm/IR/Intrinsics.h b/llvm/include/llvm/IR/Intrinsics.h
index e893295e3272b9..e04711c45de803 100644
--- a/llvm/include/llvm/IR/Intrinsics.h
+++ b/llvm/include/llvm/IR/Intrinsics.h
@@ -47,8 +47,17 @@ namespace Intrinsic {
 #define GET_INTRINSIC_ENUM_VALUES
 #include "llvm/IR/IntrinsicEnums.inc"
 #undef GET_INTRINSIC_ENUM_VALUES
+    end_id = ~0U,
   };
 
+  // Returns if `id` is a valid intrinsic ID. Validity means that it value is
+  // one of the values defined for the Intrinsic::ID enum.
+  bool IsIntrinsicIDValid(ID id);
+
+  // Get the next valid ID. This is used in test cases that iterate over valid
+  // intrinsic ID enums.
+  ID GetNextValidIntrinsicID(ID id);
+
   /// Return the LLVM name for an intrinsic, such as "llvm.ppc.altivec.lvx".
   /// Note, this version is for intrinsics with no overloads.  Use the other
   /// version of getName if overloads are required.
diff --git a/llvm/include/llvm/Support/IntrinsicID.h b/llvm/include/llvm/Support/IntrinsicID.h
new file mode 100644
index 00000000000000..60b13d85ec9070
--- /dev/null
+++ b/llvm/include/llvm/Support/IntrinsicID.h
@@ -0,0 +1,59 @@
+//===- llvm/Support/IntrinsicID.h - Intrinsic ID encoding -------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains functions to support intrinsic ID encoding. The
+// Intrinsic::ID enum value is constructed using a target prefix index in bits
+// 23-16 (8-bit) and an intrinsic index (index within the list of intrinsics for
+// tha target) in lower 16 bits. To support Intrinsic::ID 0 being not used, the
+// intrinsic index is encoded as Index + 1 for all targets.
+//
+// This file defines functions that encapsulate this encoding.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_INTRINSIC_ID_H
+#define LLVM_SUPPORT_INTRINSIC_ID_H
+
+#include "llvm/Support/FormatVariadic.h"
+#include <limits>
+#include <optional>
+#include <utility>
+
+namespace llvm::Intrinsic {
+typedef unsigned ID;
+
+inline ID EncodeIntrinsicID(unsigned TargetIndex, unsigned IntrinsicIndex) {
+  assert(IntrinsicIndex < std::numeric_limits<uint16_t>::max());
+  assert(TargetIndex <= std::numeric_limits<uint8_t>::max());
+  return (TargetIndex << 16) | (IntrinsicIndex + 1);
+}
+
+inline std::pair<unsigned, unsigned> DecodeIntrinsicID(ID id) {
+  unsigned IntrinsicIndex = id & 0xFFFF;
+  unsigned TargetIndex = id >> 16;
+  assert(IntrinsicIndex != 0);
+  return {TargetIndex, IntrinsicIndex - 1};
+}
+
+inline std::optional<std::pair<unsigned, unsigned>>
+DecodeIntrinsicIDNoFail(ID id) {
+  unsigned IntrinsicIndex = id & 0xFFFF;
+  unsigned TargetIndex = id >> 16;
+  if (IntrinsicIndex == 0)
+    return std::nullopt;
+  return std::make_pair(TargetIndex, IntrinsicIndex - 1);
+}
+
+inline void PrintIntrinsicIDEncoding(raw_ostream &OS, unsigned TargetIndex,
+                                     unsigned IntrinsicIndex) {
+  OS << formatv(" = ({} << 16) + {} + 1", TargetIndex, IntrinsicIndex);
+}
+
+} // end namespace llvm::Intrinsic
+
+#endif // LLVM_SUPPORT_INTRINSIC_ID_H
diff --git a/llvm/lib/CodeGen/MachineOperand.cpp b/llvm/lib/CodeGen/MachineOperand.cpp
index c0e004555de959..8261ac9401d6ae 100644
--- a/llvm/lib/CodeGen/MachineOperand.cpp
+++ b/llvm/lib/CodeGen/MachineOperand.cpp
@@ -992,7 +992,7 @@ void MachineOperand::print(raw_ostream &OS, ModuleSlotTracker &MST,
   }
   case MachineOperand::MO_IntrinsicID: {
     Intrinsic::ID ID = getIntrinsicID();
-    if (ID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(ID))
       OS << "intrinsic(@" << Intrinsic::getBaseName(ID) << ')';
     else if (IntrinsicInfo)
       OS << "intrinsic(@" << IntrinsicInfo->getName(ID) << ')';
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index e2c09fe25d55cd..77440e5e7275f7 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -1055,7 +1055,7 @@ bool MachineVerifier::verifyGIntrinsicSideEffects(const MachineInstr *MI) {
   bool NoSideEffects = Opcode == TargetOpcode::G_INTRINSIC ||
                        Opcode == TargetOpcode::G_INTRINSIC_CONVERGENT;
   unsigned IntrID = cast<GIntrinsic>(MI)->getIntrinsicID();
-  if (IntrID != 0 && IntrID < Intrinsic::num_intrinsics) {
+  if (IntrID != 0 && Intrinsic::IsIntrinsicIDValid(IntrID)) {
     AttributeList Attrs = Intrinsic::getAttributes(
         MF->getFunction().getContext(), static_cast<Intrinsic::ID>(IntrID));
     bool DeclHasSideEffects = !Attrs.getMemoryEffects().doesNotAccessMemory();
@@ -1079,7 +1079,7 @@ bool MachineVerifier::verifyGIntrinsicConvergence(const MachineInstr *MI) {
   bool NotConvergent = Opcode == TargetOpcode::G_INTRINSIC ||
                        Opcode == TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS;
   unsigned IntrID = cast<GIntrinsic>(MI)->getIntrinsicID();
-  if (IntrID != 0 && IntrID < Intrinsic::num_intrinsics) {
+  if (IntrID != 0 && Intrinsic::IsIntrinsicIDValid(IntrID)) {
     AttributeList Attrs = Intrinsic::getAttributes(
         MF->getFunction().getContext(), static_cast<Intrinsic::ID>(IntrID));
     bool DeclIsConvergent = Attrs.hasFnAttr(Attribute::Convergent);
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index 703efb70089742..d9e3324c6a4981 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -160,7 +160,7 @@ std::string SDNode::getOperationName(const SelectionDAG *G) const {
   case ISD::INTRINSIC_W_CHAIN: {
     unsigned OpNo = getOpcode() == ISD::INTRINSIC_WO_CHAIN ? 0 : 1;
     unsigned IID = getOperand(OpNo)->getAsZExtVal();
-    if (IID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(IID))
       return Intrinsic::getBaseName((Intrinsic::ID)IID).str();
     if (!G)
       return "Unknown intrinsic";
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 981ab18b59c1c1..391f6facdc901e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -4431,7 +4431,7 @@ void SelectionDAGISel::CannotYetSelect(SDNode *N) {
   } else {
     bool HasInputChain = N->getOperand(0).getValueType() == MVT::Other;
     unsigned iid = N->getConstantOperandVal(HasInputChain);
-    if (iid < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(iid))
       Msg << "intrinsic %" << Intrinsic::getBaseName((Intrinsic::ID)iid);
     else if (const TargetIntrinsicInfo *TII = TM.getIntrinsicInfo())
       Msg << "target intrinsic %" << TII->getName(iid);
diff --git a/llvm/lib/IR/Core.cpp b/llvm/lib/IR/Core.cpp
index 1cf998c6850068..32f31e9900880d 100644
--- a/llvm/lib/IR/Core.cpp
+++ b/llvm/lib/IR/Core.cpp
@@ -2458,7 +2458,8 @@ unsigned LLVMGetIntrinsicID(LLVMValueRef Fn) {
 }
 
 static Intrinsic::ID llvm_map_to_intrinsic_id(unsigned ID) {
-  assert(ID < llvm::Intrinsic::num_intrinsics && "Intrinsic ID out of range");
+  assert(llvm::Intrinsic::IsIntrinsicIDValid(ID) &&
+         "Intrinsic ID out of range");
   return llvm::Intrinsic::ID(ID);
 }
 
diff --git a/llvm/lib/IR/Intrinsics.cpp b/llvm/lib/IR/Intrinsics.cpp
index 1b92daf15b463e..729d553c15a579 100644
--- a/llvm/lib/IR/Intrinsics.cpp
+++ b/llvm/lib/IR/Intrinsics.cpp
@@ -33,8 +33,54 @@
 #include "llvm/IR/IntrinsicsXCore.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
+#include "llvm/Support/IntrinsicID.h"
 
 using namespace llvm;
+using namespace Intrinsic;
+
+/// Table of per-target intrinsic name tables.
+#define GET_INTRINSIC_TARGET_DATA
+#include "llvm/IR/IntrinsicImpl.inc"
+#undef GET_INTRINSIC_TARGET_DATA
+size_t constexpr NumTargets = sizeof(TargetInfos) / sizeof(TargetInfos[0]);
+
+// Returns true if the given intrinsic ID is valid, that is, its value is one
+// of the enum values defined for this intrinsic (including not_intrinsic).
+bool Intrinsic::IsIntrinsicIDValid(ID ID) {
+  if (ID == Intrinsic::not_intrinsic)
+    return true;
+  auto Decoded = DecodeIntrinsicIDNoFail(ID);
+  if (!Decoded)
+    return false;
+  unsigned TargetIdx = Decoded->first;
+  unsigned IntrinsicIdx = Decoded->second;
+  return TargetIdx < NumTargets && IntrinsicIdx < TargetInfos[TargetIdx].Count;
+}
+
+// Returns linear index of ID if its valid, else returns 0.
+inline unsigned getLinearIndex(Intrinsic::ID ID) {
+  auto Decoded = DecodeIntrinsicIDNoFail(ID);
+  if (!Decoded)
+    return 0;
+  unsigned TargetIdx = Decoded->first;
+  unsigned IntrinsicIdx = Decoded->second;
+  return TargetInfos[TargetIdx].FirstLinearIndex + IntrinsicIdx;
+}
+
+ID Intrinsic::GetNextValidIntrinsicID(Intrinsic::ID ID) {
+  if (ID == Intrinsic::not_intrinsic)
+    return EncodeIntrinsicID(0, 0);
+  if (ID == Intrinsic::last_valid_intrinsic_id)
+    return Intrinsic::end_id;
+  if (ID == Intrinsic::end_id)
+    llvm_unreachable("Cannot find the next valid intrisnic");
+  auto [TargetIndex, IntrinsicIndex] = DecodeIntrinsicID(ID);
+  if (IntrinsicIndex + 1 < TargetInfos[TargetIndex].Count)
+    return EncodeIntrinsicID(TargetIndex, IntrinsicIndex + 1);
+  if (TargetIndex + 1 < NumTargets)
+    return EncodeIntrinsicID(TargetIndex + 1, 0);
+  llvm_unreachable("Cannot find the next valid intrisnic");
+}
 
 /// Table of string intrinsic names indexed by enum value.
 static constexpr const char *const IntrinsicNameTable[] = {
@@ -45,12 +91,12 @@ static constexpr const char *const IntrinsicNameTable[] = {
 };
 
 StringRef Intrinsic::getBaseName(ID id) {
-  assert(id < num_intrinsics && "Invalid intrinsic ID!");
-  return IntrinsicNameTable[id];
+  assert(IsIntrinsicIDValid(id) && "Invalid intrinsic ID!");
+  return ArrayRef(IntrinsicNameTable)[getLinearIndex(id)];
 }
 
 StringRef Intrinsic::getName(ID id) {
-  assert(id < num_intrinsics && "Invalid intrinsic ID!");
+  assert(IsIntrinsicIDValid(id) && "Invalid intrinsic ID!");
   assert(!Intrinsic::isOverloaded(id) &&
          "This version of getName does not support overloading");
   return getBaseName(id);
@@ -157,8 +203,7 @@ static std::string getMangledTypeStr(Type *Ty, bool &HasUnnamedType) {
 static std::string getIntrinsicNameImpl(Intrinsic::ID Id, ArrayRef<Type *> Tys,
                                         Module *M, FunctionType *FT,
                                         bool EarlyModuleCheck) {
-
-  assert(Id < Intrinsic::num_intrinsics && "Invalid intrinsic ID!");
+  assert(IsIntrinsicIDValid(Id) && "Invalid intrinsic ID!");
   assert((Tys.empty() || Intrinsic::isOverloaded(Id)) &&
          "This version of getName is for overloaded intrinsics only");
   (void)EarlyModuleCheck;
@@ -450,11 +495,15 @@ DecodeIITType(unsigned &NextElt, ArrayRef<unsigned char> Infos,
 #undef GET_INTRINSIC_GENERATOR_GLOBAL
 
 void Intrinsic::getIntrinsicInfoTableEntries(
-    ID id, SmallVectorImpl<IITDescriptor> &T) {
+    ID IntrinsicID, SmallVectorImpl<IITDescriptor> &T) {
   static_assert(sizeof(IIT_Table[0]) == 2,
                 "Expect 16-bit entries in IIT_Table");
+  assert(IsIntrinsicIDValid(IntrinsicID));
+  unsigned Idx = getLinearIndex(IntrinsicID);
+  if (Idx == 0)
+    return;
   // Check to see if the intrinsic's type was expressible by the table.
-  uint16_t TableVal = IIT_Table[id - 1];
+  uint16_t TableVal = IIT_Table[Idx - 1];
 
   // Decode the TableVal into an array of IITValues.
   SmallVector<unsigned char> IITValues;
@@ -609,19 +658,20 @@ FunctionType *Intrinsic::getType(LLVMContext &Context, ID id,
   return FunctionType::get(ResultTy, ArgTys, false);
 }
 
-bool Intrinsic::isOverloaded(ID id) {
+// Check if an intrinsic is overloaded or not using its linear index.
+static bool isOverloadedUsingLinearIndex(unsigned Idx) {
 #define GET_INTRINSIC_OVERLOAD_TABLE
 #include "llvm/IR/IntrinsicImpl.inc"
 #undef GET_INTRINSIC_OVERLOAD_TABLE
 }
 
-/// Table of per-target intrinsic name tables.
-#define GET_INTRINSIC_TARGET_DATA
-#include "llvm/IR/IntrinsicImpl.inc"
-#undef GET_INTRINSIC_TARGET_DATA
+bool Intrinsic::isOverloaded(ID id) {
+  assert(IsIntrinsicIDValid(id));
+  return isOverloadedUsingLinearIndex(getLinearIndex(id));
+}
 
 bool Intrinsic::isTargetIntrinsic(Intrinsic::ID IID) {
-  return IID > TargetInfos[0].Count;
+  return IID != Intrinsic::not_intrinsic && DecodeIntrinsicID(IID).first != 0;
 }
 
 int llvm::Intrinsic::lookupLLVMIntrinsicByName(ArrayRef<const char *> NameTable,
@@ -683,7 +733,29 @@ findTargetSubtable(StringRef Name) {
   // We've either found the target or just fall back to the generic set, which
   // is always first.
   const auto &TI = It != Targets.end() && It->Name == Target ? *It : Targets[0];
-  return {ArrayRef(&IntrinsicNameTable[1] + TI.Offset, TI.Count), TI.Name};
+  unsigned LinearIndex = TI.FirstLinearIndex;
+  return {ArrayRef(IntrinsicNameTable + LinearIndex, TI.Count), TI.Name};
+}
+
+static Intrinsic::ID getIntrinsicIDFromIndex(unsigned Idx) {
+  if (Idx == 0)
+    return Intrinsic::not_intrinsic;
+
+  auto It =
+      partition_point(TargetInfos, [Idx](const IntrinsicTargetInfo &Info) {
+        return Info.FirstLinearIndex + Info.Count < Idx;
+      });
+  // Idx, if present, will be in the entry at It or It + 1.
+  if (It == std::end(TargetInfos))
+    return Intrinsic::not_intrinsic;
+  unsigned TargetIndex = std::distance(std::begin(TargetInfos), It);
+  if (It->FirstLinearIndex <= Idx && Idx < It->FirstLinearIndex + It->Count)
+    return EncodeIntrinsicID(TargetIndex, Idx - It->FirstLinearIndex);
+  ++It;
+  ++TargetIndex;
+  if (It->FirstLinearIndex <= Idx && Idx < It->FirstLinearIndex + It->Count)
+    return EncodeIntrinsicID(TargetIndex, Idx - It->FirstLinearIndex);
+  return Intrinsic::not_intrinsic;
 }
 
 /// This does the actual lookup of an intrinsic ID which matches the given
@@ -693,19 +765,19 @@ Intrinsic::ID Intrinsic::lookupIntrinsicID(StringRef Name) {
   int Idx = Intrinsic::lookupLLVMIntrinsicByName(NameTable, Name, Target);
   if (Idx == -1)
     return Intrinsic::not_intrinsic;
-
-  // Intrinsic IDs correspond to the location in IntrinsicNameTable, but we have
-  // an index into a sub-table.
+  const auto MatchSize = strlen(NameTable[Idx]);
+  // Adjust the index from sub-table index to index into the global table.
   int Adjust = NameTable.data() - IntrinsicNameTable;
-  Intrinsic::ID ID = static_cast<Intrinsic::ID>(Idx + Adjust);
+  Idx += Adjust;
 
   // If the intrinsic is not overloaded, require an exact match. If it is
   // overloaded, require either exact or prefix match.
-  const auto MatchSize = strlen(NameTable[Idx]);
   assert(Name.size() >= MatchSize && "Expected either exact or prefix match");
   bool IsExactMatch = Name.size() == MatchSize;
-  return IsExactMatch || Intrinsic::isOverloaded(ID) ? ID
-                                                     : Intrinsic::not_intrinsic;
+  Intrinsic::ID r = IsExactMatch || isOverloadedUsingLinearIndex(Idx)
+                        ? getIntrinsicIDFromIndex(Idx)
+                        : Intrinsic::not_intrinsic;
+  return r;
 }
 
 /// This defines the "Intrinsic::getAttributes(ID id)" method.
@@ -1043,8 +1115,9 @@ bool Intrinsic::matchIntrinsicVarArg(
 
 bool Intrinsic::getIntrinsicSignature(Intrinsic::ID ID, FunctionType *FT,
                                       SmallVectorImpl<Type *> &ArgTys) {
-  if (!ID)
+  if (ID == Intrinsic::not_intrinsic)
     return false;
+  assert(IsIntrinsicIDValid(ID));
 
   SmallVector<Intrinsic::IITDescriptor, 8> Table;
   getIntrinsicInfoTableEntries(ID, Table);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 5a848ada9dd8ee..30aadcce028c23 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7542,7 +7542,7 @@ static unsigned getIntrinsicID(const SDNode *N) {
     return Intrinsic::not_intrinsic;
   case ISD::INTRINSIC_WO_CHAIN: {
     unsigned IID = N->getConstantOperandVal(0);
-    if (IID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(IID))
       return IID;
     return Intrinsic::not_intrinsic;
   }
diff --git a/llvm/lib/Target/X86/X86IntrinsicsInfo.h b/llvm/lib/Target/X86/X86IntrinsicsInfo.h
index 86fd04046d16a0..f5477ec2f81315 100644
--- a/llvm/lib/Target/X86/X86IntrinsicsInfo.h
+++ b/llvm/lib/Target/X86/X86IntrinsicsInfo.h
@@ -79,8 +79,7 @@ enum IntrinsicType : uint16_t {
 };
 
 struct IntrinsicData {
-
-  uint16_t Id;
+  unsigned Id;
   IntrinsicType Type;
   uint16_t Opc0;
   uint16_t Opc1;
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
index 365d0c9fbff494..8d96f703f97892 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
@@ -36,7 +36,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
 // CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 3*/ GIMT_Encode4([[L72:[0-9]+]]), // Rule ID 0 //
 // CHECK-NEXT:       GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule0Enabled),
 // CHECK-NEXT:       GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
-// CHECK-NEXT:       GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode2(Intrinsic::1in_1out),
+// CHECK-NEXT:       GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode4(Intrinsic::1in_1out),
 // CHECK-NEXT:       // MIs[0] a
 // CHECK-NEXT:       // No operand predicates
 // CHECK-NEXT:       // MIs[0] Operand 2
@@ -45,10 +45,10 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
 // CHECK-NEXT:       // Combiner Rule #0: IntrinTest0
 // CHECK-NEXT:       GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_INTRINSIC),
 // CHECK-NEXT:       GIR_AddTempRegister, /*InsnID*/0, /*TempRegID*/0, /*TempRegFlags*/GIMT_Encode2(RegState::Define),
-// CHECK-NEXT:       GIR_AddIntrinsicID, /*MI*/0, GIMT_Encode2(Intrinsic::0in_1out),
+// CHECK-NEXT:       GIR_AddIntrinsicID, /*MI*/0, GIMT_Encode4(Intrinsic::0in_1out),
 // CHECK-NEXT:       GIR_BuildMI, /*InsnID*/1, /*Opcode*/GIMT_Encode2(TargetOpcode::G...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Oct 24, 2024

@llvm/pr-subscribers-llvm-globalisel

Author: Rahul Joshi (jurahul)

Changes

Change Intrinsic::ID enum values to encode an 8-bit target index in upper 16-bits and a 16-bit intrinsic index (within the target) in lower 16-bits.

This change is in preparation for being able to disable intrinsics for targets that are not enabled.


Patch is 64.78 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/113576.diff

30 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h (+2-2)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h (+2-2)
  • (modified) llvm/include/llvm/IR/Intrinsics.h (+9)
  • (added) llvm/include/llvm/Support/IntrinsicID.h (+59)
  • (modified) llvm/lib/CodeGen/MachineOperand.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/MachineVerifier.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+1-1)
  • (modified) llvm/lib/IR/Core.cpp (+2-1)
  • (modified) llvm/lib/IR/Intrinsics.cpp (+95-22)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/X86/X86IntrinsicsInfo.h (+1-2)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td (+7-7)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-SDNodeXForm-timm.td (+2-2)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-immarg-literal-pattern.td (+2-2)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-input-discard.td (+1-1)
  • (modified) llvm/test/TableGen/GlobalISelEmitter.td (+4-4)
  • (modified) llvm/test/TableGen/GlobalISelEmitterOverloadedPtr.td (+1-1)
  • (modified) llvm/test/TableGen/immarg-predicated.td (+1-1)
  • (modified) llvm/test/TableGen/immarg.td (+1-1)
  • (modified) llvm/test/TableGen/intrinsic-overload-conflict.td (+1-1)
  • (modified) llvm/test/TableGen/intrinsic-struct.td (+1-1)
  • (modified) llvm/test/TableGen/predicate-patfags.td (+2-2)
  • (modified) llvm/unittests/IR/VPIntrinsicTest.cpp (+8-7)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.cpp (+113-46)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.h (+19-14)
  • (modified) llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp (+1-1)
  • (modified) llvm/utils/TableGen/Common/CodeGenDAGPatterns.h (+3-11)
  • (modified) llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp (+2-2)
  • (modified) llvm/utils/TableGen/IntrinsicEmitter.cpp (+103-81)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
index 7b42722ca8d4f1..383ce96813d84d 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
@@ -275,7 +275,7 @@ enum {
   /// Check the operand is a specific intrinsic ID
   /// - InsnID(ULEB128) - Instruction ID
   /// - OpIdx(ULEB128) - Operand index
-  /// - IID(2) - Expected Intrinsic ID
+  /// - IID(4) - Expected Intrinsic ID
   GIM_CheckIntrinsicID,
 
   /// Check the operand is a specific predicate
@@ -411,7 +411,7 @@ enum {
 
   /// Adds an intrinsic ID to the specified instruction.
   /// - InsnID(ULEB128) - Instruction ID to modify
-  /// - IID(2) - Intrinsic ID
+  /// - IID(4) - Intrinsic ID
   GIR_AddIntrinsicID,
 
   /// Marks the implicit def of a register as dead.
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
index 9f8eb030a96c64..775998fe53a7be 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
@@ -889,7 +889,7 @@ bool GIMatchTableExecutor::executeMatchTable(
     case GIM_CheckIntrinsicID: {
       uint64_t InsnID = readULEB();
       uint64_t OpIdx = readULEB();
-      uint16_t Value = readU16();
+      uint32_t Value = readU32();
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
                       dbgs() << CurrentIdx << ": GIM_CheckIntrinsicID(MIs["
                              << InsnID << "]->getOperand(" << OpIdx
@@ -1185,7 +1185,7 @@ bool GIMatchTableExecutor::executeMatchTable(
     }
     case GIR_AddIntrinsicID: {
       uint64_t InsnID = readULEB();
-      uint16_t Value = readU16();
+      uint32_t Value = readU32();
       assert(OutMIs[InsnID] && "Attempted to add to undefined instruction");
       OutMIs[InsnID].addIntrinsicID((Intrinsic::ID)Value);
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
diff --git a/llvm/include/llvm/IR/Intrinsics.h b/llvm/include/llvm/IR/Intrinsics.h
index e893295e3272b9..e04711c45de803 100644
--- a/llvm/include/llvm/IR/Intrinsics.h
+++ b/llvm/include/llvm/IR/Intrinsics.h
@@ -47,8 +47,17 @@ namespace Intrinsic {
 #define GET_INTRINSIC_ENUM_VALUES
 #include "llvm/IR/IntrinsicEnums.inc"
 #undef GET_INTRINSIC_ENUM_VALUES
+    end_id = ~0U,
   };
 
+  // Returns if `id` is a valid intrinsic ID. Validity means that it value is
+  // one of the values defined for the Intrinsic::ID enum.
+  bool IsIntrinsicIDValid(ID id);
+
+  // Get the next valid ID. This is used in test cases that iterate over valid
+  // intrinsic ID enums.
+  ID GetNextValidIntrinsicID(ID id);
+
   /// Return the LLVM name for an intrinsic, such as "llvm.ppc.altivec.lvx".
   /// Note, this version is for intrinsics with no overloads.  Use the other
   /// version of getName if overloads are required.
diff --git a/llvm/include/llvm/Support/IntrinsicID.h b/llvm/include/llvm/Support/IntrinsicID.h
new file mode 100644
index 00000000000000..60b13d85ec9070
--- /dev/null
+++ b/llvm/include/llvm/Support/IntrinsicID.h
@@ -0,0 +1,59 @@
+//===- llvm/Support/IntrinsicID.h - Intrinsic ID encoding -------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains functions to support intrinsic ID encoding. The
+// Intrinsic::ID enum value is constructed using a target prefix index in bits
+// 23-16 (8-bit) and an intrinsic index (index within the list of intrinsics for
+// tha target) in lower 16 bits. To support Intrinsic::ID 0 being not used, the
+// intrinsic index is encoded as Index + 1 for all targets.
+//
+// This file defines functions that encapsulate this encoding.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_INTRINSIC_ID_H
+#define LLVM_SUPPORT_INTRINSIC_ID_H
+
+#include "llvm/Support/FormatVariadic.h"
+#include <limits>
+#include <optional>
+#include <utility>
+
+namespace llvm::Intrinsic {
+typedef unsigned ID;
+
+inline ID EncodeIntrinsicID(unsigned TargetIndex, unsigned IntrinsicIndex) {
+  assert(IntrinsicIndex < std::numeric_limits<uint16_t>::max());
+  assert(TargetIndex <= std::numeric_limits<uint8_t>::max());
+  return (TargetIndex << 16) | (IntrinsicIndex + 1);
+}
+
+inline std::pair<unsigned, unsigned> DecodeIntrinsicID(ID id) {
+  unsigned IntrinsicIndex = id & 0xFFFF;
+  unsigned TargetIndex = id >> 16;
+  assert(IntrinsicIndex != 0);
+  return {TargetIndex, IntrinsicIndex - 1};
+}
+
+inline std::optional<std::pair<unsigned, unsigned>>
+DecodeIntrinsicIDNoFail(ID id) {
+  unsigned IntrinsicIndex = id & 0xFFFF;
+  unsigned TargetIndex = id >> 16;
+  if (IntrinsicIndex == 0)
+    return std::nullopt;
+  return std::make_pair(TargetIndex, IntrinsicIndex - 1);
+}
+
+inline void PrintIntrinsicIDEncoding(raw_ostream &OS, unsigned TargetIndex,
+                                     unsigned IntrinsicIndex) {
+  OS << formatv(" = ({} << 16) + {} + 1", TargetIndex, IntrinsicIndex);
+}
+
+} // end namespace llvm::Intrinsic
+
+#endif // LLVM_SUPPORT_INTRINSIC_ID_H
diff --git a/llvm/lib/CodeGen/MachineOperand.cpp b/llvm/lib/CodeGen/MachineOperand.cpp
index c0e004555de959..8261ac9401d6ae 100644
--- a/llvm/lib/CodeGen/MachineOperand.cpp
+++ b/llvm/lib/CodeGen/MachineOperand.cpp
@@ -992,7 +992,7 @@ void MachineOperand::print(raw_ostream &OS, ModuleSlotTracker &MST,
   }
   case MachineOperand::MO_IntrinsicID: {
     Intrinsic::ID ID = getIntrinsicID();
-    if (ID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(ID))
       OS << "intrinsic(@" << Intrinsic::getBaseName(ID) << ')';
     else if (IntrinsicInfo)
       OS << "intrinsic(@" << IntrinsicInfo->getName(ID) << ')';
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index e2c09fe25d55cd..77440e5e7275f7 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -1055,7 +1055,7 @@ bool MachineVerifier::verifyGIntrinsicSideEffects(const MachineInstr *MI) {
   bool NoSideEffects = Opcode == TargetOpcode::G_INTRINSIC ||
                        Opcode == TargetOpcode::G_INTRINSIC_CONVERGENT;
   unsigned IntrID = cast<GIntrinsic>(MI)->getIntrinsicID();
-  if (IntrID != 0 && IntrID < Intrinsic::num_intrinsics) {
+  if (IntrID != 0 && Intrinsic::IsIntrinsicIDValid(IntrID)) {
     AttributeList Attrs = Intrinsic::getAttributes(
         MF->getFunction().getContext(), static_cast<Intrinsic::ID>(IntrID));
     bool DeclHasSideEffects = !Attrs.getMemoryEffects().doesNotAccessMemory();
@@ -1079,7 +1079,7 @@ bool MachineVerifier::verifyGIntrinsicConvergence(const MachineInstr *MI) {
   bool NotConvergent = Opcode == TargetOpcode::G_INTRINSIC ||
                        Opcode == TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS;
   unsigned IntrID = cast<GIntrinsic>(MI)->getIntrinsicID();
-  if (IntrID != 0 && IntrID < Intrinsic::num_intrinsics) {
+  if (IntrID != 0 && Intrinsic::IsIntrinsicIDValid(IntrID)) {
     AttributeList Attrs = Intrinsic::getAttributes(
         MF->getFunction().getContext(), static_cast<Intrinsic::ID>(IntrID));
     bool DeclIsConvergent = Attrs.hasFnAttr(Attribute::Convergent);
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index 703efb70089742..d9e3324c6a4981 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -160,7 +160,7 @@ std::string SDNode::getOperationName(const SelectionDAG *G) const {
   case ISD::INTRINSIC_W_CHAIN: {
     unsigned OpNo = getOpcode() == ISD::INTRINSIC_WO_CHAIN ? 0 : 1;
     unsigned IID = getOperand(OpNo)->getAsZExtVal();
-    if (IID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(IID))
       return Intrinsic::getBaseName((Intrinsic::ID)IID).str();
     if (!G)
       return "Unknown intrinsic";
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 981ab18b59c1c1..391f6facdc901e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -4431,7 +4431,7 @@ void SelectionDAGISel::CannotYetSelect(SDNode *N) {
   } else {
     bool HasInputChain = N->getOperand(0).getValueType() == MVT::Other;
     unsigned iid = N->getConstantOperandVal(HasInputChain);
-    if (iid < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(iid))
       Msg << "intrinsic %" << Intrinsic::getBaseName((Intrinsic::ID)iid);
     else if (const TargetIntrinsicInfo *TII = TM.getIntrinsicInfo())
       Msg << "target intrinsic %" << TII->getName(iid);
diff --git a/llvm/lib/IR/Core.cpp b/llvm/lib/IR/Core.cpp
index 1cf998c6850068..32f31e9900880d 100644
--- a/llvm/lib/IR/Core.cpp
+++ b/llvm/lib/IR/Core.cpp
@@ -2458,7 +2458,8 @@ unsigned LLVMGetIntrinsicID(LLVMValueRef Fn) {
 }
 
 static Intrinsic::ID llvm_map_to_intrinsic_id(unsigned ID) {
-  assert(ID < llvm::Intrinsic::num_intrinsics && "Intrinsic ID out of range");
+  assert(llvm::Intrinsic::IsIntrinsicIDValid(ID) &&
+         "Intrinsic ID out of range");
   return llvm::Intrinsic::ID(ID);
 }
 
diff --git a/llvm/lib/IR/Intrinsics.cpp b/llvm/lib/IR/Intrinsics.cpp
index 1b92daf15b463e..729d553c15a579 100644
--- a/llvm/lib/IR/Intrinsics.cpp
+++ b/llvm/lib/IR/Intrinsics.cpp
@@ -33,8 +33,54 @@
 #include "llvm/IR/IntrinsicsXCore.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
+#include "llvm/Support/IntrinsicID.h"
 
 using namespace llvm;
+using namespace Intrinsic;
+
+/// Table of per-target intrinsic name tables.
+#define GET_INTRINSIC_TARGET_DATA
+#include "llvm/IR/IntrinsicImpl.inc"
+#undef GET_INTRINSIC_TARGET_DATA
+size_t constexpr NumTargets = sizeof(TargetInfos) / sizeof(TargetInfos[0]);
+
+// Returns true if the given intrinsic ID is valid, that is, its value is one
+// of the enum values defined for this intrinsic (including not_intrinsic).
+bool Intrinsic::IsIntrinsicIDValid(ID ID) {
+  if (ID == Intrinsic::not_intrinsic)
+    return true;
+  auto Decoded = DecodeIntrinsicIDNoFail(ID);
+  if (!Decoded)
+    return false;
+  unsigned TargetIdx = Decoded->first;
+  unsigned IntrinsicIdx = Decoded->second;
+  return TargetIdx < NumTargets && IntrinsicIdx < TargetInfos[TargetIdx].Count;
+}
+
+// Returns linear index of ID if its valid, else returns 0.
+inline unsigned getLinearIndex(Intrinsic::ID ID) {
+  auto Decoded = DecodeIntrinsicIDNoFail(ID);
+  if (!Decoded)
+    return 0;
+  unsigned TargetIdx = Decoded->first;
+  unsigned IntrinsicIdx = Decoded->second;
+  return TargetInfos[TargetIdx].FirstLinearIndex + IntrinsicIdx;
+}
+
+ID Intrinsic::GetNextValidIntrinsicID(Intrinsic::ID ID) {
+  if (ID == Intrinsic::not_intrinsic)
+    return EncodeIntrinsicID(0, 0);
+  if (ID == Intrinsic::last_valid_intrinsic_id)
+    return Intrinsic::end_id;
+  if (ID == Intrinsic::end_id)
+    llvm_unreachable("Cannot find the next valid intrisnic");
+  auto [TargetIndex, IntrinsicIndex] = DecodeIntrinsicID(ID);
+  if (IntrinsicIndex + 1 < TargetInfos[TargetIndex].Count)
+    return EncodeIntrinsicID(TargetIndex, IntrinsicIndex + 1);
+  if (TargetIndex + 1 < NumTargets)
+    return EncodeIntrinsicID(TargetIndex + 1, 0);
+  llvm_unreachable("Cannot find the next valid intrisnic");
+}
 
 /// Table of string intrinsic names indexed by enum value.
 static constexpr const char *const IntrinsicNameTable[] = {
@@ -45,12 +91,12 @@ static constexpr const char *const IntrinsicNameTable[] = {
 };
 
 StringRef Intrinsic::getBaseName(ID id) {
-  assert(id < num_intrinsics && "Invalid intrinsic ID!");
-  return IntrinsicNameTable[id];
+  assert(IsIntrinsicIDValid(id) && "Invalid intrinsic ID!");
+  return ArrayRef(IntrinsicNameTable)[getLinearIndex(id)];
 }
 
 StringRef Intrinsic::getName(ID id) {
-  assert(id < num_intrinsics && "Invalid intrinsic ID!");
+  assert(IsIntrinsicIDValid(id) && "Invalid intrinsic ID!");
   assert(!Intrinsic::isOverloaded(id) &&
          "This version of getName does not support overloading");
   return getBaseName(id);
@@ -157,8 +203,7 @@ static std::string getMangledTypeStr(Type *Ty, bool &HasUnnamedType) {
 static std::string getIntrinsicNameImpl(Intrinsic::ID Id, ArrayRef<Type *> Tys,
                                         Module *M, FunctionType *FT,
                                         bool EarlyModuleCheck) {
-
-  assert(Id < Intrinsic::num_intrinsics && "Invalid intrinsic ID!");
+  assert(IsIntrinsicIDValid(Id) && "Invalid intrinsic ID!");
   assert((Tys.empty() || Intrinsic::isOverloaded(Id)) &&
          "This version of getName is for overloaded intrinsics only");
   (void)EarlyModuleCheck;
@@ -450,11 +495,15 @@ DecodeIITType(unsigned &NextElt, ArrayRef<unsigned char> Infos,
 #undef GET_INTRINSIC_GENERATOR_GLOBAL
 
 void Intrinsic::getIntrinsicInfoTableEntries(
-    ID id, SmallVectorImpl<IITDescriptor> &T) {
+    ID IntrinsicID, SmallVectorImpl<IITDescriptor> &T) {
   static_assert(sizeof(IIT_Table[0]) == 2,
                 "Expect 16-bit entries in IIT_Table");
+  assert(IsIntrinsicIDValid(IntrinsicID));
+  unsigned Idx = getLinearIndex(IntrinsicID);
+  if (Idx == 0)
+    return;
   // Check to see if the intrinsic's type was expressible by the table.
-  uint16_t TableVal = IIT_Table[id - 1];
+  uint16_t TableVal = IIT_Table[Idx - 1];
 
   // Decode the TableVal into an array of IITValues.
   SmallVector<unsigned char> IITValues;
@@ -609,19 +658,20 @@ FunctionType *Intrinsic::getType(LLVMContext &Context, ID id,
   return FunctionType::get(ResultTy, ArgTys, false);
 }
 
-bool Intrinsic::isOverloaded(ID id) {
+// Check if an intrinsic is overloaded or not using its linear index.
+static bool isOverloadedUsingLinearIndex(unsigned Idx) {
 #define GET_INTRINSIC_OVERLOAD_TABLE
 #include "llvm/IR/IntrinsicImpl.inc"
 #undef GET_INTRINSIC_OVERLOAD_TABLE
 }
 
-/// Table of per-target intrinsic name tables.
-#define GET_INTRINSIC_TARGET_DATA
-#include "llvm/IR/IntrinsicImpl.inc"
-#undef GET_INTRINSIC_TARGET_DATA
+bool Intrinsic::isOverloaded(ID id) {
+  assert(IsIntrinsicIDValid(id));
+  return isOverloadedUsingLinearIndex(getLinearIndex(id));
+}
 
 bool Intrinsic::isTargetIntrinsic(Intrinsic::ID IID) {
-  return IID > TargetInfos[0].Count;
+  return IID != Intrinsic::not_intrinsic && DecodeIntrinsicID(IID).first != 0;
 }
 
 int llvm::Intrinsic::lookupLLVMIntrinsicByName(ArrayRef<const char *> NameTable,
@@ -683,7 +733,29 @@ findTargetSubtable(StringRef Name) {
   // We've either found the target or just fall back to the generic set, which
   // is always first.
   const auto &TI = It != Targets.end() && It->Name == Target ? *It : Targets[0];
-  return {ArrayRef(&IntrinsicNameTable[1] + TI.Offset, TI.Count), TI.Name};
+  unsigned LinearIndex = TI.FirstLinearIndex;
+  return {ArrayRef(IntrinsicNameTable + LinearIndex, TI.Count), TI.Name};
+}
+
+static Intrinsic::ID getIntrinsicIDFromIndex(unsigned Idx) {
+  if (Idx == 0)
+    return Intrinsic::not_intrinsic;
+
+  auto It =
+      partition_point(TargetInfos, [Idx](const IntrinsicTargetInfo &Info) {
+        return Info.FirstLinearIndex + Info.Count < Idx;
+      });
+  // Idx, if present, will be in the entry at It or It + 1.
+  if (It == std::end(TargetInfos))
+    return Intrinsic::not_intrinsic;
+  unsigned TargetIndex = std::distance(std::begin(TargetInfos), It);
+  if (It->FirstLinearIndex <= Idx && Idx < It->FirstLinearIndex + It->Count)
+    return EncodeIntrinsicID(TargetIndex, Idx - It->FirstLinearIndex);
+  ++It;
+  ++TargetIndex;
+  if (It->FirstLinearIndex <= Idx && Idx < It->FirstLinearIndex + It->Count)
+    return EncodeIntrinsicID(TargetIndex, Idx - It->FirstLinearIndex);
+  return Intrinsic::not_intrinsic;
 }
 
 /// This does the actual lookup of an intrinsic ID which matches the given
@@ -693,19 +765,19 @@ Intrinsic::ID Intrinsic::lookupIntrinsicID(StringRef Name) {
   int Idx = Intrinsic::lookupLLVMIntrinsicByName(NameTable, Name, Target);
   if (Idx == -1)
     return Intrinsic::not_intrinsic;
-
-  // Intrinsic IDs correspond to the location in IntrinsicNameTable, but we have
-  // an index into a sub-table.
+  const auto MatchSize = strlen(NameTable[Idx]);
+  // Adjust the index from sub-table index to index into the global table.
   int Adjust = NameTable.data() - IntrinsicNameTable;
-  Intrinsic::ID ID = static_cast<Intrinsic::ID>(Idx + Adjust);
+  Idx += Adjust;
 
   // If the intrinsic is not overloaded, require an exact match. If it is
   // overloaded, require either exact or prefix match.
-  const auto MatchSize = strlen(NameTable[Idx]);
   assert(Name.size() >= MatchSize && "Expected either exact or prefix match");
   bool IsExactMatch = Name.size() == MatchSize;
-  return IsExactMatch || Intrinsic::isOverloaded(ID) ? ID
-                                                     : Intrinsic::not_intrinsic;
+  Intrinsic::ID r = IsExactMatch || isOverloadedUsingLinearIndex(Idx)
+                        ? getIntrinsicIDFromIndex(Idx)
+                        : Intrinsic::not_intrinsic;
+  return r;
 }
 
 /// This defines the "Intrinsic::getAttributes(ID id)" method.
@@ -1043,8 +1115,9 @@ bool Intrinsic::matchIntrinsicVarArg(
 
 bool Intrinsic::getIntrinsicSignature(Intrinsic::ID ID, FunctionType *FT,
                                       SmallVectorImpl<Type *> &ArgTys) {
-  if (!ID)
+  if (ID == Intrinsic::not_intrinsic)
     return false;
+  assert(IsIntrinsicIDValid(ID));
 
   SmallVector<Intrinsic::IITDescriptor, 8> Table;
   getIntrinsicInfoTableEntries(ID, Table);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 5a848ada9dd8ee..30aadcce028c23 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7542,7 +7542,7 @@ static unsigned getIntrinsicID(const SDNode *N) {
     return Intrinsic::not_intrinsic;
   case ISD::INTRINSIC_WO_CHAIN: {
     unsigned IID = N->getConstantOperandVal(0);
-    if (IID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(IID))
       return IID;
     return Intrinsic::not_intrinsic;
   }
diff --git a/llvm/lib/Target/X86/X86IntrinsicsInfo.h b/llvm/lib/Target/X86/X86IntrinsicsInfo.h
index 86fd04046d16a0..f5477ec2f81315 100644
--- a/llvm/lib/Target/X86/X86IntrinsicsInfo.h
+++ b/llvm/lib/Target/X86/X86IntrinsicsInfo.h
@@ -79,8 +79,7 @@ enum IntrinsicType : uint16_t {
 };
 
 struct IntrinsicData {
-
-  uint16_t Id;
+  unsigned Id;
   IntrinsicType Type;
   uint16_t Opc0;
   uint16_t Opc1;
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
index 365d0c9fbff494..8d96f703f97892 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
@@ -36,7 +36,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
 // CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 3*/ GIMT_Encode4([[L72:[0-9]+]]), // Rule ID 0 //
 // CHECK-NEXT:       GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule0Enabled),
 // CHECK-NEXT:       GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
-// CHECK-NEXT:       GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode2(Intrinsic::1in_1out),
+// CHECK-NEXT:       GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode4(Intrinsic::1in_1out),
 // CHECK-NEXT:       // MIs[0] a
 // CHECK-NEXT:       // No operand predicates
 // CHECK-NEXT:       // MIs[0] Operand 2
@@ -45,10 +45,10 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
 // CHECK-NEXT:       // Combiner Rule #0: IntrinTest0
 // CHECK-NEXT:       GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_INTRINSIC),
 // CHECK-NEXT:       GIR_AddTempRegister, /*InsnID*/0, /*TempRegID*/0, /*TempRegFlags*/GIMT_Encode2(RegState::Define),
-// CHECK-NEXT:       GIR_AddIntrinsicID, /*MI*/0, GIMT_Encode2(Intrinsic::0in_1out),
+// CHECK-NEXT:       GIR_AddIntrinsicID, /*MI*/0, GIMT_Encode4(Intrinsic::0in_1out),
 // CHECK-NEXT:       GIR_BuildMI, /*InsnID*/1, /*Opcode*/GIMT_Encode2(TargetOpcode::G...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Oct 24, 2024

@llvm/pr-subscribers-backend-aarch64

Author: Rahul Joshi (jurahul)

Changes

Change Intrinsic::ID enum values to encode an 8-bit target index in upper 16-bits and a 16-bit intrinsic index (within the target) in lower 16-bits.

This change is in preparation for being able to disable intrinsics for targets that are not enabled.


Patch is 64.78 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/113576.diff

30 Files Affected:

  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h (+2-2)
  • (modified) llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h (+2-2)
  • (modified) llvm/include/llvm/IR/Intrinsics.h (+9)
  • (added) llvm/include/llvm/Support/IntrinsicID.h (+59)
  • (modified) llvm/lib/CodeGen/MachineOperand.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/MachineVerifier.cpp (+2-2)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp (+1-1)
  • (modified) llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp (+1-1)
  • (modified) llvm/lib/IR/Core.cpp (+2-1)
  • (modified) llvm/lib/IR/Intrinsics.cpp (+95-22)
  • (modified) llvm/lib/Target/AArch64/AArch64ISelLowering.cpp (+1-1)
  • (modified) llvm/lib/Target/X86/X86IntrinsicsInfo.h (+1-2)
  • (modified) llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td (+7-7)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-SDNodeXForm-timm.td (+2-2)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-immarg-literal-pattern.td (+2-2)
  • (modified) llvm/test/TableGen/GlobalISelEmitter-input-discard.td (+1-1)
  • (modified) llvm/test/TableGen/GlobalISelEmitter.td (+4-4)
  • (modified) llvm/test/TableGen/GlobalISelEmitterOverloadedPtr.td (+1-1)
  • (modified) llvm/test/TableGen/immarg-predicated.td (+1-1)
  • (modified) llvm/test/TableGen/immarg.td (+1-1)
  • (modified) llvm/test/TableGen/intrinsic-overload-conflict.td (+1-1)
  • (modified) llvm/test/TableGen/intrinsic-struct.td (+1-1)
  • (modified) llvm/test/TableGen/predicate-patfags.td (+2-2)
  • (modified) llvm/unittests/IR/VPIntrinsicTest.cpp (+8-7)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.cpp (+113-46)
  • (modified) llvm/utils/TableGen/Basic/CodeGenIntrinsics.h (+19-14)
  • (modified) llvm/utils/TableGen/Common/CodeGenDAGPatterns.cpp (+1-1)
  • (modified) llvm/utils/TableGen/Common/CodeGenDAGPatterns.h (+3-11)
  • (modified) llvm/utils/TableGen/Common/GlobalISel/GlobalISelMatchTable.cpp (+2-2)
  • (modified) llvm/utils/TableGen/IntrinsicEmitter.cpp (+103-81)
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
index 7b42722ca8d4f1..383ce96813d84d 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutor.h
@@ -275,7 +275,7 @@ enum {
   /// Check the operand is a specific intrinsic ID
   /// - InsnID(ULEB128) - Instruction ID
   /// - OpIdx(ULEB128) - Operand index
-  /// - IID(2) - Expected Intrinsic ID
+  /// - IID(4) - Expected Intrinsic ID
   GIM_CheckIntrinsicID,
 
   /// Check the operand is a specific predicate
@@ -411,7 +411,7 @@ enum {
 
   /// Adds an intrinsic ID to the specified instruction.
   /// - InsnID(ULEB128) - Instruction ID to modify
-  /// - IID(2) - Intrinsic ID
+  /// - IID(4) - Intrinsic ID
   GIR_AddIntrinsicID,
 
   /// Marks the implicit def of a register as dead.
diff --git a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
index 9f8eb030a96c64..775998fe53a7be 100644
--- a/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
+++ b/llvm/include/llvm/CodeGen/GlobalISel/GIMatchTableExecutorImpl.h
@@ -889,7 +889,7 @@ bool GIMatchTableExecutor::executeMatchTable(
     case GIM_CheckIntrinsicID: {
       uint64_t InsnID = readULEB();
       uint64_t OpIdx = readULEB();
-      uint16_t Value = readU16();
+      uint32_t Value = readU32();
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
                       dbgs() << CurrentIdx << ": GIM_CheckIntrinsicID(MIs["
                              << InsnID << "]->getOperand(" << OpIdx
@@ -1185,7 +1185,7 @@ bool GIMatchTableExecutor::executeMatchTable(
     }
     case GIR_AddIntrinsicID: {
       uint64_t InsnID = readULEB();
-      uint16_t Value = readU16();
+      uint32_t Value = readU32();
       assert(OutMIs[InsnID] && "Attempted to add to undefined instruction");
       OutMIs[InsnID].addIntrinsicID((Intrinsic::ID)Value);
       DEBUG_WITH_TYPE(TgtExecutor::getName(),
diff --git a/llvm/include/llvm/IR/Intrinsics.h b/llvm/include/llvm/IR/Intrinsics.h
index e893295e3272b9..e04711c45de803 100644
--- a/llvm/include/llvm/IR/Intrinsics.h
+++ b/llvm/include/llvm/IR/Intrinsics.h
@@ -47,8 +47,17 @@ namespace Intrinsic {
 #define GET_INTRINSIC_ENUM_VALUES
 #include "llvm/IR/IntrinsicEnums.inc"
 #undef GET_INTRINSIC_ENUM_VALUES
+    end_id = ~0U,
   };
 
+  // Returns if `id` is a valid intrinsic ID. Validity means that it value is
+  // one of the values defined for the Intrinsic::ID enum.
+  bool IsIntrinsicIDValid(ID id);
+
+  // Get the next valid ID. This is used in test cases that iterate over valid
+  // intrinsic ID enums.
+  ID GetNextValidIntrinsicID(ID id);
+
   /// Return the LLVM name for an intrinsic, such as "llvm.ppc.altivec.lvx".
   /// Note, this version is for intrinsics with no overloads.  Use the other
   /// version of getName if overloads are required.
diff --git a/llvm/include/llvm/Support/IntrinsicID.h b/llvm/include/llvm/Support/IntrinsicID.h
new file mode 100644
index 00000000000000..60b13d85ec9070
--- /dev/null
+++ b/llvm/include/llvm/Support/IntrinsicID.h
@@ -0,0 +1,59 @@
+//===- llvm/Support/IntrinsicID.h - Intrinsic ID encoding -------*- C++ -*-===//
+//
+// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
+// See https://llvm.org/LICENSE.txt for license information.
+// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
+//
+//===----------------------------------------------------------------------===//
+//
+// This file contains functions to support intrinsic ID encoding. The
+// Intrinsic::ID enum value is constructed using a target prefix index in bits
+// 23-16 (8-bit) and an intrinsic index (index within the list of intrinsics for
+// tha target) in lower 16 bits. To support Intrinsic::ID 0 being not used, the
+// intrinsic index is encoded as Index + 1 for all targets.
+//
+// This file defines functions that encapsulate this encoding.
+//
+//===----------------------------------------------------------------------===//
+
+#ifndef LLVM_SUPPORT_INTRINSIC_ID_H
+#define LLVM_SUPPORT_INTRINSIC_ID_H
+
+#include "llvm/Support/FormatVariadic.h"
+#include <limits>
+#include <optional>
+#include <utility>
+
+namespace llvm::Intrinsic {
+typedef unsigned ID;
+
+inline ID EncodeIntrinsicID(unsigned TargetIndex, unsigned IntrinsicIndex) {
+  assert(IntrinsicIndex < std::numeric_limits<uint16_t>::max());
+  assert(TargetIndex <= std::numeric_limits<uint8_t>::max());
+  return (TargetIndex << 16) | (IntrinsicIndex + 1);
+}
+
+inline std::pair<unsigned, unsigned> DecodeIntrinsicID(ID id) {
+  unsigned IntrinsicIndex = id & 0xFFFF;
+  unsigned TargetIndex = id >> 16;
+  assert(IntrinsicIndex != 0);
+  return {TargetIndex, IntrinsicIndex - 1};
+}
+
+inline std::optional<std::pair<unsigned, unsigned>>
+DecodeIntrinsicIDNoFail(ID id) {
+  unsigned IntrinsicIndex = id & 0xFFFF;
+  unsigned TargetIndex = id >> 16;
+  if (IntrinsicIndex == 0)
+    return std::nullopt;
+  return std::make_pair(TargetIndex, IntrinsicIndex - 1);
+}
+
+inline void PrintIntrinsicIDEncoding(raw_ostream &OS, unsigned TargetIndex,
+                                     unsigned IntrinsicIndex) {
+  OS << formatv(" = ({} << 16) + {} + 1", TargetIndex, IntrinsicIndex);
+}
+
+} // end namespace llvm::Intrinsic
+
+#endif // LLVM_SUPPORT_INTRINSIC_ID_H
diff --git a/llvm/lib/CodeGen/MachineOperand.cpp b/llvm/lib/CodeGen/MachineOperand.cpp
index c0e004555de959..8261ac9401d6ae 100644
--- a/llvm/lib/CodeGen/MachineOperand.cpp
+++ b/llvm/lib/CodeGen/MachineOperand.cpp
@@ -992,7 +992,7 @@ void MachineOperand::print(raw_ostream &OS, ModuleSlotTracker &MST,
   }
   case MachineOperand::MO_IntrinsicID: {
     Intrinsic::ID ID = getIntrinsicID();
-    if (ID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(ID))
       OS << "intrinsic(@" << Intrinsic::getBaseName(ID) << ')';
     else if (IntrinsicInfo)
       OS << "intrinsic(@" << IntrinsicInfo->getName(ID) << ')';
diff --git a/llvm/lib/CodeGen/MachineVerifier.cpp b/llvm/lib/CodeGen/MachineVerifier.cpp
index e2c09fe25d55cd..77440e5e7275f7 100644
--- a/llvm/lib/CodeGen/MachineVerifier.cpp
+++ b/llvm/lib/CodeGen/MachineVerifier.cpp
@@ -1055,7 +1055,7 @@ bool MachineVerifier::verifyGIntrinsicSideEffects(const MachineInstr *MI) {
   bool NoSideEffects = Opcode == TargetOpcode::G_INTRINSIC ||
                        Opcode == TargetOpcode::G_INTRINSIC_CONVERGENT;
   unsigned IntrID = cast<GIntrinsic>(MI)->getIntrinsicID();
-  if (IntrID != 0 && IntrID < Intrinsic::num_intrinsics) {
+  if (IntrID != 0 && Intrinsic::IsIntrinsicIDValid(IntrID)) {
     AttributeList Attrs = Intrinsic::getAttributes(
         MF->getFunction().getContext(), static_cast<Intrinsic::ID>(IntrID));
     bool DeclHasSideEffects = !Attrs.getMemoryEffects().doesNotAccessMemory();
@@ -1079,7 +1079,7 @@ bool MachineVerifier::verifyGIntrinsicConvergence(const MachineInstr *MI) {
   bool NotConvergent = Opcode == TargetOpcode::G_INTRINSIC ||
                        Opcode == TargetOpcode::G_INTRINSIC_W_SIDE_EFFECTS;
   unsigned IntrID = cast<GIntrinsic>(MI)->getIntrinsicID();
-  if (IntrID != 0 && IntrID < Intrinsic::num_intrinsics) {
+  if (IntrID != 0 && Intrinsic::IsIntrinsicIDValid(IntrID)) {
     AttributeList Attrs = Intrinsic::getAttributes(
         MF->getFunction().getContext(), static_cast<Intrinsic::ID>(IntrID));
     bool DeclIsConvergent = Attrs.hasFnAttr(Attribute::Convergent);
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
index 703efb70089742..d9e3324c6a4981 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp
@@ -160,7 +160,7 @@ std::string SDNode::getOperationName(const SelectionDAG *G) const {
   case ISD::INTRINSIC_W_CHAIN: {
     unsigned OpNo = getOpcode() == ISD::INTRINSIC_WO_CHAIN ? 0 : 1;
     unsigned IID = getOperand(OpNo)->getAsZExtVal();
-    if (IID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(IID))
       return Intrinsic::getBaseName((Intrinsic::ID)IID).str();
     if (!G)
       return "Unknown intrinsic";
diff --git a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
index 981ab18b59c1c1..391f6facdc901e 100644
--- a/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
+++ b/llvm/lib/CodeGen/SelectionDAG/SelectionDAGISel.cpp
@@ -4431,7 +4431,7 @@ void SelectionDAGISel::CannotYetSelect(SDNode *N) {
   } else {
     bool HasInputChain = N->getOperand(0).getValueType() == MVT::Other;
     unsigned iid = N->getConstantOperandVal(HasInputChain);
-    if (iid < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(iid))
       Msg << "intrinsic %" << Intrinsic::getBaseName((Intrinsic::ID)iid);
     else if (const TargetIntrinsicInfo *TII = TM.getIntrinsicInfo())
       Msg << "target intrinsic %" << TII->getName(iid);
diff --git a/llvm/lib/IR/Core.cpp b/llvm/lib/IR/Core.cpp
index 1cf998c6850068..32f31e9900880d 100644
--- a/llvm/lib/IR/Core.cpp
+++ b/llvm/lib/IR/Core.cpp
@@ -2458,7 +2458,8 @@ unsigned LLVMGetIntrinsicID(LLVMValueRef Fn) {
 }
 
 static Intrinsic::ID llvm_map_to_intrinsic_id(unsigned ID) {
-  assert(ID < llvm::Intrinsic::num_intrinsics && "Intrinsic ID out of range");
+  assert(llvm::Intrinsic::IsIntrinsicIDValid(ID) &&
+         "Intrinsic ID out of range");
   return llvm::Intrinsic::ID(ID);
 }
 
diff --git a/llvm/lib/IR/Intrinsics.cpp b/llvm/lib/IR/Intrinsics.cpp
index 1b92daf15b463e..729d553c15a579 100644
--- a/llvm/lib/IR/Intrinsics.cpp
+++ b/llvm/lib/IR/Intrinsics.cpp
@@ -33,8 +33,54 @@
 #include "llvm/IR/IntrinsicsXCore.h"
 #include "llvm/IR/Module.h"
 #include "llvm/IR/Type.h"
+#include "llvm/Support/IntrinsicID.h"
 
 using namespace llvm;
+using namespace Intrinsic;
+
+/// Table of per-target intrinsic name tables.
+#define GET_INTRINSIC_TARGET_DATA
+#include "llvm/IR/IntrinsicImpl.inc"
+#undef GET_INTRINSIC_TARGET_DATA
+size_t constexpr NumTargets = sizeof(TargetInfos) / sizeof(TargetInfos[0]);
+
+// Returns true if the given intrinsic ID is valid, that is, its value is one
+// of the enum values defined for this intrinsic (including not_intrinsic).
+bool Intrinsic::IsIntrinsicIDValid(ID ID) {
+  if (ID == Intrinsic::not_intrinsic)
+    return true;
+  auto Decoded = DecodeIntrinsicIDNoFail(ID);
+  if (!Decoded)
+    return false;
+  unsigned TargetIdx = Decoded->first;
+  unsigned IntrinsicIdx = Decoded->second;
+  return TargetIdx < NumTargets && IntrinsicIdx < TargetInfos[TargetIdx].Count;
+}
+
+// Returns linear index of ID if its valid, else returns 0.
+inline unsigned getLinearIndex(Intrinsic::ID ID) {
+  auto Decoded = DecodeIntrinsicIDNoFail(ID);
+  if (!Decoded)
+    return 0;
+  unsigned TargetIdx = Decoded->first;
+  unsigned IntrinsicIdx = Decoded->second;
+  return TargetInfos[TargetIdx].FirstLinearIndex + IntrinsicIdx;
+}
+
+ID Intrinsic::GetNextValidIntrinsicID(Intrinsic::ID ID) {
+  if (ID == Intrinsic::not_intrinsic)
+    return EncodeIntrinsicID(0, 0);
+  if (ID == Intrinsic::last_valid_intrinsic_id)
+    return Intrinsic::end_id;
+  if (ID == Intrinsic::end_id)
+    llvm_unreachable("Cannot find the next valid intrisnic");
+  auto [TargetIndex, IntrinsicIndex] = DecodeIntrinsicID(ID);
+  if (IntrinsicIndex + 1 < TargetInfos[TargetIndex].Count)
+    return EncodeIntrinsicID(TargetIndex, IntrinsicIndex + 1);
+  if (TargetIndex + 1 < NumTargets)
+    return EncodeIntrinsicID(TargetIndex + 1, 0);
+  llvm_unreachable("Cannot find the next valid intrisnic");
+}
 
 /// Table of string intrinsic names indexed by enum value.
 static constexpr const char *const IntrinsicNameTable[] = {
@@ -45,12 +91,12 @@ static constexpr const char *const IntrinsicNameTable[] = {
 };
 
 StringRef Intrinsic::getBaseName(ID id) {
-  assert(id < num_intrinsics && "Invalid intrinsic ID!");
-  return IntrinsicNameTable[id];
+  assert(IsIntrinsicIDValid(id) && "Invalid intrinsic ID!");
+  return ArrayRef(IntrinsicNameTable)[getLinearIndex(id)];
 }
 
 StringRef Intrinsic::getName(ID id) {
-  assert(id < num_intrinsics && "Invalid intrinsic ID!");
+  assert(IsIntrinsicIDValid(id) && "Invalid intrinsic ID!");
   assert(!Intrinsic::isOverloaded(id) &&
          "This version of getName does not support overloading");
   return getBaseName(id);
@@ -157,8 +203,7 @@ static std::string getMangledTypeStr(Type *Ty, bool &HasUnnamedType) {
 static std::string getIntrinsicNameImpl(Intrinsic::ID Id, ArrayRef<Type *> Tys,
                                         Module *M, FunctionType *FT,
                                         bool EarlyModuleCheck) {
-
-  assert(Id < Intrinsic::num_intrinsics && "Invalid intrinsic ID!");
+  assert(IsIntrinsicIDValid(Id) && "Invalid intrinsic ID!");
   assert((Tys.empty() || Intrinsic::isOverloaded(Id)) &&
          "This version of getName is for overloaded intrinsics only");
   (void)EarlyModuleCheck;
@@ -450,11 +495,15 @@ DecodeIITType(unsigned &NextElt, ArrayRef<unsigned char> Infos,
 #undef GET_INTRINSIC_GENERATOR_GLOBAL
 
 void Intrinsic::getIntrinsicInfoTableEntries(
-    ID id, SmallVectorImpl<IITDescriptor> &T) {
+    ID IntrinsicID, SmallVectorImpl<IITDescriptor> &T) {
   static_assert(sizeof(IIT_Table[0]) == 2,
                 "Expect 16-bit entries in IIT_Table");
+  assert(IsIntrinsicIDValid(IntrinsicID));
+  unsigned Idx = getLinearIndex(IntrinsicID);
+  if (Idx == 0)
+    return;
   // Check to see if the intrinsic's type was expressible by the table.
-  uint16_t TableVal = IIT_Table[id - 1];
+  uint16_t TableVal = IIT_Table[Idx - 1];
 
   // Decode the TableVal into an array of IITValues.
   SmallVector<unsigned char> IITValues;
@@ -609,19 +658,20 @@ FunctionType *Intrinsic::getType(LLVMContext &Context, ID id,
   return FunctionType::get(ResultTy, ArgTys, false);
 }
 
-bool Intrinsic::isOverloaded(ID id) {
+// Check if an intrinsic is overloaded or not using its linear index.
+static bool isOverloadedUsingLinearIndex(unsigned Idx) {
 #define GET_INTRINSIC_OVERLOAD_TABLE
 #include "llvm/IR/IntrinsicImpl.inc"
 #undef GET_INTRINSIC_OVERLOAD_TABLE
 }
 
-/// Table of per-target intrinsic name tables.
-#define GET_INTRINSIC_TARGET_DATA
-#include "llvm/IR/IntrinsicImpl.inc"
-#undef GET_INTRINSIC_TARGET_DATA
+bool Intrinsic::isOverloaded(ID id) {
+  assert(IsIntrinsicIDValid(id));
+  return isOverloadedUsingLinearIndex(getLinearIndex(id));
+}
 
 bool Intrinsic::isTargetIntrinsic(Intrinsic::ID IID) {
-  return IID > TargetInfos[0].Count;
+  return IID != Intrinsic::not_intrinsic && DecodeIntrinsicID(IID).first != 0;
 }
 
 int llvm::Intrinsic::lookupLLVMIntrinsicByName(ArrayRef<const char *> NameTable,
@@ -683,7 +733,29 @@ findTargetSubtable(StringRef Name) {
   // We've either found the target or just fall back to the generic set, which
   // is always first.
   const auto &TI = It != Targets.end() && It->Name == Target ? *It : Targets[0];
-  return {ArrayRef(&IntrinsicNameTable[1] + TI.Offset, TI.Count), TI.Name};
+  unsigned LinearIndex = TI.FirstLinearIndex;
+  return {ArrayRef(IntrinsicNameTable + LinearIndex, TI.Count), TI.Name};
+}
+
+static Intrinsic::ID getIntrinsicIDFromIndex(unsigned Idx) {
+  if (Idx == 0)
+    return Intrinsic::not_intrinsic;
+
+  auto It =
+      partition_point(TargetInfos, [Idx](const IntrinsicTargetInfo &Info) {
+        return Info.FirstLinearIndex + Info.Count < Idx;
+      });
+  // Idx, if present, will be in the entry at It or It + 1.
+  if (It == std::end(TargetInfos))
+    return Intrinsic::not_intrinsic;
+  unsigned TargetIndex = std::distance(std::begin(TargetInfos), It);
+  if (It->FirstLinearIndex <= Idx && Idx < It->FirstLinearIndex + It->Count)
+    return EncodeIntrinsicID(TargetIndex, Idx - It->FirstLinearIndex);
+  ++It;
+  ++TargetIndex;
+  if (It->FirstLinearIndex <= Idx && Idx < It->FirstLinearIndex + It->Count)
+    return EncodeIntrinsicID(TargetIndex, Idx - It->FirstLinearIndex);
+  return Intrinsic::not_intrinsic;
 }
 
 /// This does the actual lookup of an intrinsic ID which matches the given
@@ -693,19 +765,19 @@ Intrinsic::ID Intrinsic::lookupIntrinsicID(StringRef Name) {
   int Idx = Intrinsic::lookupLLVMIntrinsicByName(NameTable, Name, Target);
   if (Idx == -1)
     return Intrinsic::not_intrinsic;
-
-  // Intrinsic IDs correspond to the location in IntrinsicNameTable, but we have
-  // an index into a sub-table.
+  const auto MatchSize = strlen(NameTable[Idx]);
+  // Adjust the index from sub-table index to index into the global table.
   int Adjust = NameTable.data() - IntrinsicNameTable;
-  Intrinsic::ID ID = static_cast<Intrinsic::ID>(Idx + Adjust);
+  Idx += Adjust;
 
   // If the intrinsic is not overloaded, require an exact match. If it is
   // overloaded, require either exact or prefix match.
-  const auto MatchSize = strlen(NameTable[Idx]);
   assert(Name.size() >= MatchSize && "Expected either exact or prefix match");
   bool IsExactMatch = Name.size() == MatchSize;
-  return IsExactMatch || Intrinsic::isOverloaded(ID) ? ID
-                                                     : Intrinsic::not_intrinsic;
+  Intrinsic::ID r = IsExactMatch || isOverloadedUsingLinearIndex(Idx)
+                        ? getIntrinsicIDFromIndex(Idx)
+                        : Intrinsic::not_intrinsic;
+  return r;
 }
 
 /// This defines the "Intrinsic::getAttributes(ID id)" method.
@@ -1043,8 +1115,9 @@ bool Intrinsic::matchIntrinsicVarArg(
 
 bool Intrinsic::getIntrinsicSignature(Intrinsic::ID ID, FunctionType *FT,
                                       SmallVectorImpl<Type *> &ArgTys) {
-  if (!ID)
+  if (ID == Intrinsic::not_intrinsic)
     return false;
+  assert(IsIntrinsicIDValid(ID));
 
   SmallVector<Intrinsic::IITDescriptor, 8> Table;
   getIntrinsicInfoTableEntries(ID, Table);
diff --git a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
index 5a848ada9dd8ee..30aadcce028c23 100644
--- a/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
+++ b/llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
@@ -7542,7 +7542,7 @@ static unsigned getIntrinsicID(const SDNode *N) {
     return Intrinsic::not_intrinsic;
   case ISD::INTRINSIC_WO_CHAIN: {
     unsigned IID = N->getConstantOperandVal(0);
-    if (IID < Intrinsic::num_intrinsics)
+    if (Intrinsic::IsIntrinsicIDValid(IID))
       return IID;
     return Intrinsic::not_intrinsic;
   }
diff --git a/llvm/lib/Target/X86/X86IntrinsicsInfo.h b/llvm/lib/Target/X86/X86IntrinsicsInfo.h
index 86fd04046d16a0..f5477ec2f81315 100644
--- a/llvm/lib/Target/X86/X86IntrinsicsInfo.h
+++ b/llvm/lib/Target/X86/X86IntrinsicsInfo.h
@@ -79,8 +79,7 @@ enum IntrinsicType : uint16_t {
 };
 
 struct IntrinsicData {
-
-  uint16_t Id;
+  unsigned Id;
   IntrinsicType Type;
   uint16_t Opc0;
   uint16_t Opc1;
diff --git a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
index 365d0c9fbff494..8d96f703f97892 100644
--- a/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
+++ b/llvm/test/TableGen/GlobalISelCombinerEmitter/match-table-intrinsics.td
@@ -36,7 +36,7 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
 // CHECK-NEXT:     GIM_Try, /*On fail goto*//*Label 3*/ GIMT_Encode4([[L72:[0-9]+]]), // Rule ID 0 //
 // CHECK-NEXT:       GIM_CheckSimplePredicate, GIMT_Encode2(GICXXPred_Simple_IsRule0Enabled),
 // CHECK-NEXT:       GIM_CheckNumOperands, /*MI*/0, /*Expected*/3,
-// CHECK-NEXT:       GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode2(Intrinsic::1in_1out),
+// CHECK-NEXT:       GIM_CheckIntrinsicID, /*MI*/0, /*Op*/1, GIMT_Encode4(Intrinsic::1in_1out),
 // CHECK-NEXT:       // MIs[0] a
 // CHECK-NEXT:       // No operand predicates
 // CHECK-NEXT:       // MIs[0] Operand 2
@@ -45,10 +45,10 @@ def MyCombiner: GICombiner<"GenMyCombiner", [
 // CHECK-NEXT:       // Combiner Rule #0: IntrinTest0
 // CHECK-NEXT:       GIR_BuildRootMI, /*Opcode*/GIMT_Encode2(TargetOpcode::G_INTRINSIC),
 // CHECK-NEXT:       GIR_AddTempRegister, /*InsnID*/0, /*TempRegID*/0, /*TempRegFlags*/GIMT_Encode2(RegState::Define),
-// CHECK-NEXT:       GIR_AddIntrinsicID, /*MI*/0, GIMT_Encode2(Intrinsic::0in_1out),
+// CHECK-NEXT:       GIR_AddIntrinsicID, /*MI*/0, GIMT_Encode4(Intrinsic::0in_1out),
 // CHECK-NEXT:       GIR_BuildMI, /*InsnID*/1, /*Opcode*/GIMT_Encode2(TargetOpcode::G...
[truncated]

@jayfoad
Copy link
Contributor

jayfoad commented Oct 25, 2024

This change is in preparation for being able to disable intrinsics for targets that are not enabled.

Why is it required for that? Is there some requirement that LLVM builds with different targets enabled use the same IDs?

@jurahul
Copy link
Contributor Author

jurahul commented Oct 25, 2024

This change is in preparation for being able to disable intrinsics for targets that are not enabled.

Why is it required for that? Is there some requirement that LLVM builds with different targets enabled use the same IDs?

please see discourse thread here: https://discourse.llvm.org/t/rfc-compress-intrinsic-name-table/82412/20

It seems there is a desire to not "rock the boat" too much and keep the enum values stable atleast with different targets enabled but built from the same source. The choice was to use an indirection table, or this. This scheme helps derive a linear index from the ID enum without an additional LUT and additionally, it keeps target's IDs stable within a target prefix as well if no new intrinsics are added with that prefix. Currently, any new prefix in another target can change all ID enum values.

I do not completely understand the use case of building LLVM with different targets and have the 2 builds interoperate, but given that it was not too involved, this is the result.

@jurahul jurahul force-pushed the intrinsic_id_values_tgt_idx_intr_idx branch from 565c8ca to ae715b0 Compare October 25, 2024 20:51
@jurahul jurahul requested review from jayfoad and rnk October 25, 2024 21:04
/// Check the operand is a specific intrinsic ID
/// - InsnID(ULEB128) - Instruction ID
/// - OpIdx(ULEB128) - Operand index
/// - IID(2) - Expected Intrinsic ID
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you omit the extra 2 bytes by assuming the current target's index?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't sure. What if a GISel patterns refers to a generic intrinsic or some other targets intrinsics? For instance, AMDGPU likely refers to mix of r600 and amdgcn intrinsics?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMDGPU shouldn't mix r600 and amdgcn intrinsics, but there might be a handful remaining that do that mixing. @AlexVlx also wanted to use amdgcn intrinsics in spirv

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AMDGPU shouldn't mix r600 and amdgcn intrinsics, but there might be a handful remaining that do that mixing. @AlexVlx also wanted to use amdgcn intrinsics in spirv

We do use them for AMDGCN flavoured SPIR-V, but I admit I was not aware and have not really internalised what this change would imply in that context. I’ll try to parse it now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. I don't think it will directly affect anything at the moment. A follow on change would need to be aware of which targets use which intrinsic prefixes to make sure we do disable the right ones for that target,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My concern with making this 2 byte is: (1) won't we have files with patterns that use mix of target independent and target dependent builtins. In such cases, just encoding the 16-bit target index is not sufficient, and (2) premature optimizations? If its really important, I'd prefer doing this as an independent change.

Comment on lines +53 to +51
if (!Decoded)
return false;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no fail but then it needs a null check?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, so in some cases we really just want to return 0 or false if decoding failed, and in other cases we want to assert of fail in different ways, so the nofail variant just reports failure without an assert. Maybe we just have one version that return std::optional and if it has failed, dereferencing the std::optional will fail in cases where decodeIntrinsicID is used? So we don't have 2 variants

@jurahul jurahul force-pushed the intrinsic_id_values_tgt_idx_intr_idx branch from ae715b0 to f32b1e4 Compare October 25, 2024 23:08
@jurahul jurahul requested a review from arsenm October 25, 2024 23:23
Copy link
Collaborator

@rnk rnk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I'm happy with this, but please make sure other reviewer comments are addressed.

@jurahul
Copy link
Contributor Author

jurahul commented Oct 29, 2024

Thanks, I'm happy with this, but please make sure other reviewer comments are addressed.

Sure. However, I think there is an open discussion on discourse that needs resolution to decide whether this change is necessary in the first place. It would be great if you can comment on making all of us understand the use case that motivated this change.

https://discourse.llvm.org/t/rfc-compress-intrinsic-name-table/82412/30

@nikic
Copy link
Contributor

nikic commented Nov 5, 2024

Have you checked whether this does help incremental builds in practice?

@jayfoad
Copy link
Contributor

jayfoad commented Nov 5, 2024

Have you checked whether this does help incremental builds in practice?

I checked and it does seem to help. Adding a new AMDGPU intrinsic did not cause any other target's .cpp files to be rebuilt. (But TBH I'm still not a huge fan of this patch, just because it seems like extra complexity that we don't really need.)

Change `Intrinsic::ID` enum values to encode an 8-bit target index
in upper 16-bits and a 16-bit intrinsic index (within the target) in
lower 16-bits.

This change is in preparation for being able to disable intrinsics for
targets that are not enabled.
@jurahul jurahul force-pushed the intrinsic_id_values_tgt_idx_intr_idx branch from f32b1e4 to d58ffe7 Compare November 5, 2024 15:58
@jurahul
Copy link
Contributor Author

jurahul commented Nov 5, 2024

Have you checked whether this does help incremental builds in practice?

I checked and it does seem to help. Adding a new AMDGPU intrinsic did not cause any other target's .cpp files to be rebuilt. (But TBH I'm still not a huge fan of this patch, just because it seems like extra complexity that we don't really need.)

Thanks for checking. I'd say if we were actually going forward with compiling out intrinsics for disabled targets, then this mildly extra complexity might have been worth it. But given that that is not going to happen, this reduces to a pure refactor/NFC for better separation between targets. I'll let the relevant folks decide if it's worth committing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants