Skip to content

Conversation

@tltao
Copy link
Contributor

@tltao tltao commented Mar 19, 2024

Recreate the hard register PR created on phabricator with some minor additions: https://reviews.llvm.org/D105142

The main idea is to allow Clang to support the ability to specify specific hardware registers in inline assembly constraints via the use of curly braces {}. As such, this is mainly a Clang change.

Relevant RFCs posted here:

https://lists.llvm.org/pipermail/llvm-dev/2021-June/151370.html
https://gcc.gnu.org/pipermail/gcc/2021-June/236269.html (implemented - see https://gcc.gnu.org/onlinedocs/gcc/Hard-Register-Constraints.html)

Copying the Summary from the phabricator patch:

  • This is put up as an RFC patch to get feedback about the introduction of a new inline asm constraint which supports hard register operands
  • This is mostly a clang change, since the LLVM IR for the inline assembly already supports the {...} syntax which the backend recognizes. This change merely completes the loop in terms of introducing a user facing constraint which maps to it.

The following design decisions were taken for this patch:

For the Sema side:

  • We validate the "{" constraint using a two-phase validation approach. Firstly, we check if there is a target dependent implementation to handle the {....} constraint. If it fails, then we move on to a target agnostic check where we parse and validate the {....} constraint.
  • Why do we do this? Well, there are some targets which already process the {...} as a user facing constraint. For example the AMDGPU target. It supports syntax of the form {register-name} as well as {register-name[...]}. Moving this implementation to the target agnostic side seems to set a precedent, that we can keep extending the target agnostic implementation based on new cases for certain targets, which would be better served of moving it to the respective target.
  • In terms of the target agnostic validation, we simply check for the following syntax {.*}. The parsed out content within the two curly braces is checked to see whether its a "valid GCC register".

For the Clang CodeGen side:

  • Most of the work is done in the AddVariableConstraints function in CGStmt.cpp, which is responsible for emitting the LLVM IR corresponding to the usage of an actual register that the backend can use. Coincidentally, the LLVM IR is also {...}. As mentioned above, this is essentially mapping the LLVM Inline Assembly IR back to a user facing inline asm constraint.
  • Within this function we add in logic to check if the constraint is of the form [&]{...} in addition to the "register asm" construct.
  • A scenario where it was applicable to apply both the "register asm" construct and the "hard register inline asm constraint" was diagnosed as an unsupported error because there's no way the compiler will know which register the user meant. The safest option here is to error out explicitly, and put onus back on the user.
  • To achieve this, and refactor it in a nice way, most of the logic pertaining to the "register asm" construct has been moved into a separate function called ShouldApplyRegisterVariableConstraint which deduces whether to apply the "register asm" construct or not.
  • Furthermore, the GCCReg field is set with the "Register" only if the register is validated to be a "valid GCC Register" type. To me it doesn't make a lot of sense to set GCC Reg to something that might not necessarily be a "valid GCC register" as defined by respective targets. Would we have a case where we could have a {.*} constraint where the contents inside the curly brace is not a valid register? Yes. For example, The x86 target supports the "@cca" constraint, which is validated on the Sema side as a valid constraint, and before processing is converted to "{@cca}" (via the convertAsmConstraint function. "@cca" is not a valid "GCC register name". So in these case, we'll simply emit the constraint without setting GCCReg (with the assumption that the respective backends deal with it appropriately)

Tests:

  • Various tests were updated to account for the new behaviour.
  • I added a few SystemZ tests because we work on the Z backend, but I have no issues adding testing for multiple targets.
  • There were a few mentions of "waiting" for the GCC implementation of the same to land before the Clang side lands. As mentioned above, the intent to implement it on the GCC side has already been put forward via the RFC. I have no issues "parking" this implementation until its ready to be merged in. However, it might be good to hash out any open questions/concerns in the interim.

Differences with GCC:

For the most part, the Clang implementation follows the GCC implementation. However, one difference between the Clang implementation and the GCC implementation is in the following example:

int x;
asm ("" : "=r" (x), "={r1}" (x));  (1)
asm ("" : "=r2" (x), "={r1}" (x));  (2)

In the GCC implementation, both statements above emits the error matching constraint not valid in output operand. In LLVM, statement (1) emits the error couldn't allocate output register for constraint 'x' from InlineAsmLowering/selectionDAG whereas statement (2) emits no errors.

@llvmbot llvmbot added clang Clang issues not falling into any other category backend:ARM backend:AArch64 backend:RISC-V backend:SystemZ clang:frontend Language frontend issues, e.g. anything involving "Sema" clang:codegen IR generation bugs: mangling, exceptions, etc. labels Mar 19, 2024
@llvmbot
Copy link
Member

llvmbot commented Mar 19, 2024

@llvm/pr-subscribers-backend-systemz
@llvm/pr-subscribers-backend-risc-v
@llvm/pr-subscribers-clang
@llvm/pr-subscribers-clang-codegen

@llvm/pr-subscribers-backend-aarch64

Author: None (tltao)

Changes

Recreate the hard register PR created on phabricator with some minor additions: https://reviews.llvm.org/D105142

The main idea is to allow Clang to support the ability to specify specific hardware registers in inline assembly constraints via the use of curly braces {}. As such, this is mainly a Clang change.

Relevant RFCs posted here:

https://lists.llvm.org/pipermail/llvm-dev/2021-June/151370.html
https://gcc.gnu.org/pipermail/gcc/2021-June/236269.html

Copying the Summary from the phabricator patch:

  • This is put up as an RFC patch to get feedback about the introduction of a new inline asm constraint which supports hard register operands
  • This is mostly a clang change, since the LLVM IR for the inline assembly already supports the {...} syntax which the backend recognizes. This change merely completes the loop in terms of introducing a user facing constraint which maps to it.

The following design decisions were taken for this patch:

For the Sema side:

  • We validate the "{" constraint using a two-phase validation approach. Firstly, we check if there is a target dependent implementation to handle the {....} constraint. If it fails, then we move on to a target agnostic check where we parse and validate the {....} constraint.
  • Why do we do this? Well, there are some targets which already process the {...} as a user facing constraint. For example the AMDGPU target. It supports syntax of the form {register-name} as well as {register-name[...]}. Moving this implementation to the target agnostic side seems to set a precedent, that we can keep extending the target agnostic implementation based on new cases for certain targets, which would be better served of moving it to the respective target.
  • In terms of the target agnostic validation, we simply check for the following syntax {.*}. The parsed out content within the two curly braces is checked to see whether its a "valid GCC register".

For the Clang CodeGen side:

  • Most of the work is done in the AddVariableConstraints function in CGStmt.cpp, which is responsible for emitting the LLVM IR corresponding to the usage of an actual register that the backend can use. Coincidentally, the LLVM IR is also {...}. As mentioned above, this is essentially mapping the LLVM Inline Assembly IR back to a user facing inline asm constraint.
  • Within this function we add in logic to check if the constraint is of the form [&]{...} in addition to the "register asm" construct.
  • A scenario where it was applicable to apply both the "register asm" construct and the "hard register inline asm constraint" was diagnosed as an unsupported error because there's no way the compiler will know which register the user meant. The safest option here is to error out explicitly, and put onus back on the user.
  • To achieve this, and refactor it in a nice way, most of the logic pertaining to the "register asm" construct has been moved into a separate function called ShouldApplyRegisterVariableConstraint which deduces whether to apply the "register asm" construct or not.
  • Furthermore, the GCCReg field is set with the "Register" only if the register is validated to be a "valid GCC Register" type. To me it doesn't make a lot of sense to set GCC Reg to something that might not necessarily be a "valid GCC register" as defined by respective targets. Would we have a case where we could have a {.*} constraint where the contents inside the curly brace is not a valid register? Yes. For example, The x86 target supports the "@cca" constraint, which is validated on the Sema side as a valid constraint, and before processing is converted to "{@cca}" (via the convertAsmConstraint function. "@cca" is not a valid "GCC register name". So in these case, we'll simply emit the constraint without setting GCCReg (with the assumption that the respective backends deal with it appropriately)

Tests:

  • Various tests were updated to account for the new behaviour.
  • I added a few SystemZ tests because we work on the Z backend, but I have no issues adding testing for multiple targets.
  • There were a few mentions of "waiting" for the GCC implementation of the same to land before the Clang side lands. As mentioned above, the intent to implement it on the GCC side has already been put forward via the RFC. I have no issues "parking" this implementation until its ready to be merged in. However, it might be good to hash out any open questions/concerns in the interim.

Patch is 68.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85846.diff

19 Files Affected:

  • (modified) clang/docs/LanguageExtensions.rst (+43)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+2)
  • (modified) clang/include/clang/Basic/TargetInfo.h (+17-1)
  • (modified) clang/lib/Basic/TargetInfo.cpp (+54)
  • (modified) clang/lib/Basic/Targets/AArch64.h (-5)
  • (modified) clang/lib/Basic/Targets/ARM.h (-5)
  • (modified) clang/lib/Basic/Targets/RISCV.h (-5)
  • (modified) clang/lib/Basic/Targets/X86.h (+1-1)
  • (modified) clang/lib/CodeGen/CGStmt.cpp (+83-18)
  • (modified) clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c (+8-2)
  • (modified) clang/test/CodeGen/SystemZ/systemz-inline-asm.c (+16-3)
  • (modified) clang/test/CodeGen/aarch64-inline-asm.c (+10-2)
  • (modified) clang/test/CodeGen/asm-goto.c (+3-3)
  • (modified) clang/test/CodeGen/ms-intrinsics.c (+16-16)
  • (added) clang/test/CodeGen/ms-intrinsics.ll (+533)
  • (added) clang/test/CodeGen/x86-asm-register-constraint-mix.c (+62)
  • (added) clang/test/CodeGen/z-hard-register-inline-asm.c (+52)
  • (added) clang/test/Sema/z-hard-register-inline-asm.c (+50)
  • (added) llvm/test/CodeGen/SystemZ/zos-inline-asm.ll (+250)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 13d7261d83d7f1..39d1981a068992 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -1760,6 +1760,49 @@ references can be used instead of numeric references.
       return -1;
   }
 
+Hard Register Operands for ASM Constraints
+==========================================
+
+Clang supports the ability to specify specific hardware registers in inline
+assembly constraints via the use of curly braces ``{}``.
+
+Prior to clang-19, the only way to associate an inline assembly constraint
+with a specific register is via the local register variable feature (`GCC
+Specifying Registers for Local Variables <https://gcc.gnu.org/onlinedocs/gcc-6.5.0/gcc/Local-Register-Variables.html>`_). However, the local register variable association lasts for the entire
+scope of the variable.
+
+Hard register operands will instead only apply to the specific inline ASM
+statement which improves readability and solves a few other issues experienced
+by local register variables, such as:
+
+* function calls might clobber register variables
+* the constraints for the register operands are superfluous
+* one register variable cannot be used for 2 different inline
+  assemblies if the value is expected in different hard regs
+
+The code below is an example of an inline assembly statement using local
+register variables.
+
+.. code-block:: c++
+
+  void foo() {
+    register int *p1 asm ("r0") = bar();
+    register int *p2 asm ("r1") = bar();
+    register int *result asm ("r0");
+    asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
+  }
+Below is the same code but using hard register operands.
+
+.. code-block:: c++
+
+  void foo() {
+    int *p1 = bar();
+    int *p2 = bar();
+    int *result;
+    asm ("sysint" : "={r0}" (result) : "0" (p1), "{r1}" (p2));
+  }
+
+
 Objective-C Features
 ====================
 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 8e97902564af08..09672eb3865742 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -9235,6 +9235,8 @@ let CategoryName = "Inline Assembly Issue" in {
     "more than one input constraint matches the same output '%0'">;
   def err_store_value_to_reg : Error<
     "impossible constraint in asm: can't store value into a register">;
+  def err_asm_hard_reg_variable_duplicate : Error<
+    "hard register operand already defined as register variable">;
 
   def warn_asm_label_on_auto_decl : Warning<
     "ignored asm label '%0' on automatic variable">;
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index 374595edd2ce4a..01d43b838414b7 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -1043,9 +1043,17 @@ class TargetInfo : public TransferrableTargetInfo,
   ///
   /// This function is used by Sema in order to diagnose conflicts between
   /// the clobber list and the input/output lists.
+  /// The constraint should already by validated in validateHardRegisterAsmConstraint
+  /// so just do some basic checking
   virtual StringRef getConstraintRegister(StringRef Constraint,
                                           StringRef Expression) const {
-    return "";
+    StringRef Reg = Expression;
+    size_t Start = Constraint.find('{');
+    size_t End = Constraint.find('}');
+    if (Start != StringRef::npos && End != StringRef::npos && End > Start)
+      Reg = Constraint.substr(Start + 1, End - Start - 1);
+
+    return Reg;
   }
 
   struct ConstraintInfo {
@@ -1187,6 +1195,14 @@ class TargetInfo : public TransferrableTargetInfo,
   validateAsmConstraint(const char *&Name,
                         TargetInfo::ConstraintInfo &info) const = 0;
 
+  // Validate the "hard register" inline asm constraint. This constraint is
+  // of the form {<reg-name>}. This constraint is meant to be used
+  // as an alternative for the "register asm" construct to put inline
+  // asm operands into specific registers.
+  bool
+  validateHardRegisterAsmConstraint(const char *&Name,
+                                    TargetInfo::ConstraintInfo &info) const;
+
   bool resolveSymbolicName(const char *&Name,
                            ArrayRef<ConstraintInfo> OutputConstraints,
                            unsigned &Index) const;
diff --git a/clang/lib/Basic/TargetInfo.cpp b/clang/lib/Basic/TargetInfo.cpp
index 5d9055174c089a..2190a8b8bb246d 100644
--- a/clang/lib/Basic/TargetInfo.cpp
+++ b/clang/lib/Basic/TargetInfo.cpp
@@ -770,6 +770,18 @@ bool TargetInfo::validateOutputConstraint(ConstraintInfo &Info) const {
     case 'E':
     case 'F':
       break;  // Pass them.
+    case '{': {
+      // First, check the target parser in case it validates
+      // the {...} constraint differently.
+      if (validateAsmConstraint(Name, Info))
+        return true;
+
+      // If not, that's okay, we will try to validate it
+      // using a target agnostic implementation.
+      if (!validateHardRegisterAsmConstraint(Name, Info))
+        return false;
+      break;
+    }
     }
 
     Name++;
@@ -785,6 +797,36 @@ bool TargetInfo::validateOutputConstraint(ConstraintInfo &Info) const {
   return Info.allowsMemory() || Info.allowsRegister();
 }
 
+bool TargetInfo::validateHardRegisterAsmConstraint(
+    const char *&Name, TargetInfo::ConstraintInfo &Info) const {
+  // First, swallow the '{'.
+  Name++;
+
+  // Mark the start of the possible register name.
+  const char *Start = Name;
+
+  // Loop through rest of "Name".
+  // In this loop, we check whether we have a closing curly brace which
+  // validates the constraint. Also, this allows us to get the correct bounds to
+  // set our register name.
+  while (*Name && *Name != '}')
+    Name++;
+
+  // Missing '}' or if there is anything after '}', return false.
+  if (!*Name || *(Name + 1))
+    return false;
+
+  // Now we set the register name.
+  std::string Register(Start, Name - Start);
+
+  // We validate whether its a valid register to be used.
+  if (!isValidGCCRegisterName(Register))
+    return false;
+
+  Info.setAllowsRegister();
+  return true;
+}
+
 bool TargetInfo::resolveSymbolicName(const char *&Name,
                                      ArrayRef<ConstraintInfo> OutputConstraints,
                                      unsigned &Index) const {
@@ -917,6 +959,18 @@ bool TargetInfo::validateInputConstraint(
     case '!': // Disparage severely.
     case '*': // Ignore for choosing register preferences.
       break;  // Pass them.
+    case '{': {
+      // First, check the target parser in case it validates
+      // the {...} constraint differently.
+      if (validateAsmConstraint(Name, Info))
+        return true;
+
+      // If not, that's okay, we will try to validate it
+      // using a target agnostic implementation.
+      if (!validateHardRegisterAsmConstraint(Name, Info))
+        return false;
+      break;
+    }
     }
 
     Name++;
diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h
index 2dd6b2181e87df..c40ef2a3c13e94 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -188,11 +188,6 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo {
                              std::string &SuggestedModifier) const override;
   std::string_view getClobbers() const override;
 
-  StringRef getConstraintRegister(StringRef Constraint,
-                                  StringRef Expression) const override {
-    return Expression;
-  }
-
   int getEHDataRegisterNumber(unsigned RegNo) const override;
 
   bool validatePointerAuthKey(const llvm::APSInt &value) const override;
diff --git a/clang/lib/Basic/Targets/ARM.h b/clang/lib/Basic/Targets/ARM.h
index 71322a094f5edb..9ed452163c6048 100644
--- a/clang/lib/Basic/Targets/ARM.h
+++ b/clang/lib/Basic/Targets/ARM.h
@@ -213,11 +213,6 @@ class LLVM_LIBRARY_VISIBILITY ARMTargetInfo : public TargetInfo {
                              std::string &SuggestedModifier) const override;
   std::string_view getClobbers() const override;
 
-  StringRef getConstraintRegister(StringRef Constraint,
-                                  StringRef Expression) const override {
-    return Expression;
-  }
-
   CallingConvCheckResult checkCallingConvention(CallingConv CC) const override;
 
   int getEHDataRegisterNumber(unsigned RegNo) const override;
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index bfbdafb682c851..89071da7a42776 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -70,11 +70,6 @@ class RISCVTargetInfo : public TargetInfo {
 
   std::string_view getClobbers() const override { return ""; }
 
-  StringRef getConstraintRegister(StringRef Constraint,
-                                  StringRef Expression) const override {
-    return Expression;
-  }
-
   ArrayRef<const char *> getGCCRegNames() const override;
 
   int getEHDataRegisterNumber(unsigned RegNo) const override {
diff --git a/clang/lib/Basic/Targets/X86.h b/clang/lib/Basic/Targets/X86.h
index d2232c7d5275ab..e7fb2706d02590 100644
--- a/clang/lib/Basic/Targets/X86.h
+++ b/clang/lib/Basic/Targets/X86.h
@@ -307,7 +307,7 @@ class LLVM_LIBRARY_VISIBILITY X86TargetInfo : public TargetInfo {
       return "di";
     // In case the constraint is 'r' we need to return Expression
     case 'r':
-      return Expression;
+      return TargetInfo::getConstraintRegister(Constraint, Expression);
     // Double letters Y<x> constraints
     case 'Y':
       if ((++I != E) && ((*I == '0') || (*I == 'z')))
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 8898e3f22a7df6..97f61eea620b2a 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -2183,9 +2183,17 @@ void CodeGenFunction::EmitSwitchStmt(const SwitchStmt &S) {
   CaseRangeBlock = SavedCRBlock;
 }
 
-static std::string
-SimplifyConstraint(const char *Constraint, const TargetInfo &Target,
-                 SmallVectorImpl<TargetInfo::ConstraintInfo> *OutCons=nullptr) {
+static std::string SimplifyConstraint(
+    const char *Constraint, const TargetInfo &Target,
+    SmallVectorImpl<TargetInfo::ConstraintInfo> *OutCons = nullptr) {
+  // If we have only the {...} constraint, do not do any simplifications. This
+  // already maps to the lower level LLVM inline assembly IR that tells the
+  // backend to allocate a specific register. Any validations would have already
+  // been done in the Sema stage or will be done in the AddVariableConstraints
+  // function.
+  if (Constraint[0] == '{' || (Constraint[0] == '&' && Constraint[1] == '{'))
+    return std::string(Constraint);
+
   std::string Result;
 
   while (*Constraint) {
@@ -2232,37 +2240,94 @@ SimplifyConstraint(const char *Constraint, const TargetInfo &Target,
 
   return Result;
 }
+/// Is it valid to apply a register constraint for a variable marked with
+/// the "register asm" construct?
+/// Optionally, if it is determined that we can, we set "Register" to the
+/// regiser name.
+static bool
+ShouldApplyRegisterVariableConstraint(const Expr &AsmExpr,
+                                      std::string *Register = nullptr) {
 
-/// AddVariableConstraints - Look at AsmExpr and if it is a variable declared
-/// as using a particular register add that as a constraint that will be used
-/// in this asm stmt.
-static std::string
-AddVariableConstraints(const std::string &Constraint, const Expr &AsmExpr,
-                       const TargetInfo &Target, CodeGenModule &CGM,
-                       const AsmStmt &Stmt, const bool EarlyClobber,
-                       std::string *GCCReg = nullptr) {
   const DeclRefExpr *AsmDeclRef = dyn_cast<DeclRefExpr>(&AsmExpr);
   if (!AsmDeclRef)
-    return Constraint;
+    return false;
   const ValueDecl &Value = *AsmDeclRef->getDecl();
   const VarDecl *Variable = dyn_cast<VarDecl>(&Value);
   if (!Variable)
-    return Constraint;
+    return false;
   if (Variable->getStorageClass() != SC_Register)
-    return Constraint;
+    return false;
   AsmLabelAttr *Attr = Variable->getAttr<AsmLabelAttr>();
   if (!Attr)
+    return false;
+
+  if (Register != nullptr)
+    // Set the register to return from Attr.
+    *Register = Attr->getLabel().str();
+  return true;
+}
+
+/// AddVariableConstraints:
+/// Look at AsmExpr and if it is a variable declared as using a particular
+/// register add that as a constraint that will be used in this asm stmt.
+/// Whether it can be used or not is dependent on querying
+/// ShouldApplyRegisterVariableConstraint() Also check whether the "hard
+/// register" inline asm constraint (i.e. "{reg-name}") is specified. If so, add
+/// that as a constraint that will be used in this asm stmt.
+static std::string
+AddVariableConstraints(const std::string &Constraint, const Expr &AsmExpr,
+                       const TargetInfo &Target, CodeGenModule &CGM,
+                       const AsmStmt &Stmt, const bool EarlyClobber,
+                       std::string *GCCReg = nullptr) {
+
+  // Do we have the "hard register" inline asm constraint.
+  bool ApplyHardRegisterConstraint =
+      Constraint[0] == '{' || (EarlyClobber && Constraint[1] == '{');
+
+  // Do we have "register asm" on a variable.
+  std::string Reg = "";
+  bool ApplyRegisterVariableConstraint =
+      ShouldApplyRegisterVariableConstraint(AsmExpr, &Reg);
+
+  // Diagnose the scenario where we apply both the register variable constraint
+  // and a hard register variable constraint as an unsupported error.
+  // Why? Because we could have a situation where the register passed in through
+  // {...} and the register passed in through the "register asm" construct could
+  // be different, and in this case, there's no way for the compiler to know
+  // which one to emit.
+  if (ApplyHardRegisterConstraint && ApplyRegisterVariableConstraint) {
+    CGM.getDiags().Report(AsmExpr.getExprLoc(),
+                          diag::err_asm_hard_reg_variable_duplicate);
     return Constraint;
-  StringRef Register = Attr->getLabel();
-  assert(Target.isValidGCCRegisterName(Register));
+  }
+
+  if (!ApplyHardRegisterConstraint && !ApplyRegisterVariableConstraint)
+    return Constraint;
+
   // We're using validateOutputConstraint here because we only care if
   // this is a register constraint.
   TargetInfo::ConstraintInfo Info(Constraint, "");
-  if (Target.validateOutputConstraint(Info) &&
-      !Info.allowsRegister()) {
+  if (Target.validateOutputConstraint(Info) && !Info.allowsRegister()) {
     CGM.ErrorUnsupported(&Stmt, "__asm__");
     return Constraint;
   }
+
+  if (ApplyHardRegisterConstraint) {
+    int Start = EarlyClobber ? 2 : 1;
+    int End = Constraint.find('}');
+    Reg = Constraint.substr(Start, End - Start);
+    // If we don't have a valid register name, simply return the constraint.
+    // For example: There are some targets like X86 that use a constraint such
+    // as "@cca", which is validated and then converted into {@cca}. Now this
+    // isn't necessarily a "GCC Register", but in terms of emission, it is
+    // valid since it lowered appropriately in the X86 backend. For the {..}
+    // constraint, we shouldn't be too strict and error out if the register
+    // itself isn't a valid "GCC register".
+    if (!Target.isValidGCCRegisterName(Reg))
+      return Constraint;
+  }
+
+  StringRef Register(Reg);
   // Canonicalize the register here before returning it.
   Register = Target.getNormalizedGCCRegisterName(Register);
   if (GCCReg != nullptr)
diff --git a/clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c b/clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c
index 754d7e66f04b24..60237c81fd7298 100644
--- a/clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c
+++ b/clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c
@@ -5,9 +5,15 @@
 // Test that an error is given if a physreg is defined by multiple operands.
 int test_physreg_defs(void) {
   register int l __asm__("r7") = 0;
+  int m;
 
   // CHECK: error: multiple outputs to hard register: r7
-  __asm__("" : "+r"(l), "=r"(l));
+  __asm__(""
+          : "+r"(l), "=r"(l));
 
-  return l;
+  // CHECK: error: multiple outputs to hard register: r6
+  __asm__(""
+          : "+{r6}"(m), "={r6}"(m));
+
+  return l + m;
 }
diff --git a/clang/test/CodeGen/SystemZ/systemz-inline-asm.c b/clang/test/CodeGen/SystemZ/systemz-inline-asm.c
index e38d37cd345e26..a3b47700dc30bb 100644
--- a/clang/test/CodeGen/SystemZ/systemz-inline-asm.c
+++ b/clang/test/CodeGen/SystemZ/systemz-inline-asm.c
@@ -134,12 +134,25 @@ long double test_f128(long double f, long double g) {
 int test_physregs(void) {
   // CHECK-LABEL: define{{.*}} signext i32 @test_physregs()
   register int l __asm__("r7") = 0;
+  int m = 0;
 
   // CHECK: call i32 asm "lr $0, $1", "={r7},{r7}"
-  __asm__("lr %0, %1" : "+r"(l));
+  __asm__("lr %0, %1"
+          : "+r"(l));
 
   // CHECK: call i32 asm "$0 $1 $2", "={r7},{r7},{r7}"
-  __asm__("%0 %1 %2" : "+r"(l) : "r"(l));
+  __asm__("%0 %1 %2"
+          : "+r"(l)
+          : "r"(l));
 
-  return l;
+  // CHECK: call i32 asm "lr $0, $1", "={r6},{r6}"
+  __asm__("lr %0, %1"
+          : "+{r6}"(m));
+
+  // CHECK: call i32 asm "$0 $1 $2", "={r6},{r6},{r6}"
+  __asm__("%0 %1 %2"
+          : "+{r6}"(m)
+          : "{r6}"(m));
+
+  return l + m;
 }
diff --git a/clang/test/CodeGen/aarch64-inline-asm.c b/clang/test/CodeGen/aarch64-inline-asm.c
index 8ddee560b11da4..860cc858275ea6 100644
--- a/clang/test/CodeGen/aarch64-inline-asm.c
+++ b/clang/test/CodeGen/aarch64-inline-asm.c
@@ -77,7 +77,15 @@ void test_gcc_registers(void) {
 
 void test_tied_earlyclobber(void) {
   register int a asm("x1");
-  asm("" : "+&r"(a));
+  asm(""
+      : "+&r"(a));
+  // CHECK: call i32 asm "", "=&{x1},0"(i32 %0)
+}
+
+void test_tied_earlyclobber2(void) {
+  int a;
+  asm(""
+      : "+&{x1}"(a));
   // CHECK: call i32 asm "", "=&{x1},0"(i32 %0)
 }
 
@@ -102,4 +110,4 @@ void test_sme_constraints(){
 
   asm("movt zt0[3, mul vl], z0" : : : "zt0");
 // CHECK: call void asm sideeffect "movt zt0[3, mul vl], z0", "~{zt0}"()
-}
\ No newline at end of file
+}
diff --git a/clang/test/CodeGen/asm-goto.c b/clang/test/CodeGen/asm-goto.c
index 4037c1b2a3d7a2..77bd77615f2998 100644
--- a/clang/test/CodeGen/asm-goto.c
+++ b/clang/test/CodeGen/asm-goto.c
@@ -55,14 +55,14 @@ int test3(int out1, int out2) {
 
 int test4(int out1, int out2) {
   // CHECK-LABEL: define{{.*}} i32 @test4(
-  // CHECK: callbr { i32, i32 } asm sideeffect "jne ${5:l}", "={si},={di},r,0,1,!i,!i
+  // CHECK: callbr { i32, i32 } asm sideeffect "jne ${5:l}", "={si},={di},r,{si},{di},!i,!i
   // CHECK: to label %asm.fallthrough [label %label_true.split, label %loop.split]
   // CHECK-LABEL: asm.fallthrough:
   if (out1 < out2)
     asm volatile goto("jne %l5" : "+S"(out1), "+D"(out2) : "r"(out1) :: label_true, loop);
   else
     asm volatile goto("jne %l7" : "+S"(out1), "+D"(out2) : "r"(out1), "r"(out2) :: label_true, loop);
-  // CHECK: callbr { i32, i32 } asm sideeffect "jne ${7:l}", "={si},={di},r,r,0,1,!i,!i
+  // CHECK: callbr { i32, i32 } asm sideeffect "jne ${7:l}", "={si},={di},r,r,{si},{di},!i,!i
   // CHECK: to label %asm.fallthrough6 [label %label_true.split11, label %loop.split14]
   // CHECK-LABEL: asm.fallthrough6:
   return out1 + out2;
@@ -92,7 +92,7 @@ int test5(int addr, int size, int limit) {
 
 int test6(int out1) {
   // CHECK-LABEL: define{{.*}} i32 @test6(
-  // CHECK: callbr i32 asm sideeffect "testl $0, $0; testl $1, $1; jne ${3:l}", "={si},r,0,!i,!i,{{.*}}
+  // CHECK: callbr i32 asm sideeffect "testl $0, $0; testl $1, $1; jne ${3:l}", "={si},r,{si},!i,!i,{{.*}}
   // CHECK: to label %asm.fallthrough [label %label_true.split, label %landing.split]
   // CHECK-LABEL: asm.fallthrough:
   // CHECK-LABEL: landing:
diff --git a/clang/test/CodeGen/ms-intrinsics.c b/clang/test/CodeGen/ms-intrinsics.c
index 5bb003d1f91fc0..375258ca609675 100644
--- a/clang/test/CodeGen/ms-intrinsics.c
+++ b/clang/test/CodeGen/ms-intrinsics.c
@@ -36,12 +36,12 @@ void test__movsb(unsigned char *Dest, unsigned char *Src, size_t Count) {
   return __movsb(Dest, Src, Count);
 }
 // CHECK-I386-LABEL: define{{.*}} void @test__movsb
-// CHECK-I386:   tail call { ptr, ptr, i32 } asm sideeffect "xchg $(%esi, $1$|$1, esi$)\0Arep movsb\0Axchg ...
[truncated]

@llvmbot
Copy link
Member

llvmbot commented Mar 19, 2024

@llvm/pr-subscribers-backend-arm

Author: None (tltao)

Changes

Recreate the hard register PR created on phabricator with some minor additions: https://reviews.llvm.org/D105142

The main idea is to allow Clang to support the ability to specify specific hardware registers in inline assembly constraints via the use of curly braces {}. As such, this is mainly a Clang change.

Relevant RFCs posted here:

https://lists.llvm.org/pipermail/llvm-dev/2021-June/151370.html
https://gcc.gnu.org/pipermail/gcc/2021-June/236269.html

Copying the Summary from the phabricator patch:

  • This is put up as an RFC patch to get feedback about the introduction of a new inline asm constraint which supports hard register operands
  • This is mostly a clang change, since the LLVM IR for the inline assembly already supports the {...} syntax which the backend recognizes. This change merely completes the loop in terms of introducing a user facing constraint which maps to it.

The following design decisions were taken for this patch:

For the Sema side:

  • We validate the "{" constraint using a two-phase validation approach. Firstly, we check if there is a target dependent implementation to handle the {....} constraint. If it fails, then we move on to a target agnostic check where we parse and validate the {....} constraint.
  • Why do we do this? Well, there are some targets which already process the {...} as a user facing constraint. For example the AMDGPU target. It supports syntax of the form {register-name} as well as {register-name[...]}. Moving this implementation to the target agnostic side seems to set a precedent, that we can keep extending the target agnostic implementation based on new cases for certain targets, which would be better served of moving it to the respective target.
  • In terms of the target agnostic validation, we simply check for the following syntax {.*}. The parsed out content within the two curly braces is checked to see whether its a "valid GCC register".

For the Clang CodeGen side:

  • Most of the work is done in the AddVariableConstraints function in CGStmt.cpp, which is responsible for emitting the LLVM IR corresponding to the usage of an actual register that the backend can use. Coincidentally, the LLVM IR is also {...}. As mentioned above, this is essentially mapping the LLVM Inline Assembly IR back to a user facing inline asm constraint.
  • Within this function we add in logic to check if the constraint is of the form [&]{...} in addition to the "register asm" construct.
  • A scenario where it was applicable to apply both the "register asm" construct and the "hard register inline asm constraint" was diagnosed as an unsupported error because there's no way the compiler will know which register the user meant. The safest option here is to error out explicitly, and put onus back on the user.
  • To achieve this, and refactor it in a nice way, most of the logic pertaining to the "register asm" construct has been moved into a separate function called ShouldApplyRegisterVariableConstraint which deduces whether to apply the "register asm" construct or not.
  • Furthermore, the GCCReg field is set with the "Register" only if the register is validated to be a "valid GCC Register" type. To me it doesn't make a lot of sense to set GCC Reg to something that might not necessarily be a "valid GCC register" as defined by respective targets. Would we have a case where we could have a {.*} constraint where the contents inside the curly brace is not a valid register? Yes. For example, The x86 target supports the "@cca" constraint, which is validated on the Sema side as a valid constraint, and before processing is converted to "{@cca}" (via the convertAsmConstraint function. "@cca" is not a valid "GCC register name". So in these case, we'll simply emit the constraint without setting GCCReg (with the assumption that the respective backends deal with it appropriately)

Tests:

  • Various tests were updated to account for the new behaviour.
  • I added a few SystemZ tests because we work on the Z backend, but I have no issues adding testing for multiple targets.
  • There were a few mentions of "waiting" for the GCC implementation of the same to land before the Clang side lands. As mentioned above, the intent to implement it on the GCC side has already been put forward via the RFC. I have no issues "parking" this implementation until its ready to be merged in. However, it might be good to hash out any open questions/concerns in the interim.

Patch is 68.16 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/85846.diff

19 Files Affected:

  • (modified) clang/docs/LanguageExtensions.rst (+43)
  • (modified) clang/include/clang/Basic/DiagnosticSemaKinds.td (+2)
  • (modified) clang/include/clang/Basic/TargetInfo.h (+17-1)
  • (modified) clang/lib/Basic/TargetInfo.cpp (+54)
  • (modified) clang/lib/Basic/Targets/AArch64.h (-5)
  • (modified) clang/lib/Basic/Targets/ARM.h (-5)
  • (modified) clang/lib/Basic/Targets/RISCV.h (-5)
  • (modified) clang/lib/Basic/Targets/X86.h (+1-1)
  • (modified) clang/lib/CodeGen/CGStmt.cpp (+83-18)
  • (modified) clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c (+8-2)
  • (modified) clang/test/CodeGen/SystemZ/systemz-inline-asm.c (+16-3)
  • (modified) clang/test/CodeGen/aarch64-inline-asm.c (+10-2)
  • (modified) clang/test/CodeGen/asm-goto.c (+3-3)
  • (modified) clang/test/CodeGen/ms-intrinsics.c (+16-16)
  • (added) clang/test/CodeGen/ms-intrinsics.ll (+533)
  • (added) clang/test/CodeGen/x86-asm-register-constraint-mix.c (+62)
  • (added) clang/test/CodeGen/z-hard-register-inline-asm.c (+52)
  • (added) clang/test/Sema/z-hard-register-inline-asm.c (+50)
  • (added) llvm/test/CodeGen/SystemZ/zos-inline-asm.ll (+250)
diff --git a/clang/docs/LanguageExtensions.rst b/clang/docs/LanguageExtensions.rst
index 13d7261d83d7f1..39d1981a068992 100644
--- a/clang/docs/LanguageExtensions.rst
+++ b/clang/docs/LanguageExtensions.rst
@@ -1760,6 +1760,49 @@ references can be used instead of numeric references.
       return -1;
   }
 
+Hard Register Operands for ASM Constraints
+==========================================
+
+Clang supports the ability to specify specific hardware registers in inline
+assembly constraints via the use of curly braces ``{}``.
+
+Prior to clang-19, the only way to associate an inline assembly constraint
+with a specific register is via the local register variable feature (`GCC
+Specifying Registers for Local Variables <https://gcc.gnu.org/onlinedocs/gcc-6.5.0/gcc/Local-Register-Variables.html>`_). However, the local register variable association lasts for the entire
+scope of the variable.
+
+Hard register operands will instead only apply to the specific inline ASM
+statement which improves readability and solves a few other issues experienced
+by local register variables, such as:
+
+* function calls might clobber register variables
+* the constraints for the register operands are superfluous
+* one register variable cannot be used for 2 different inline
+  assemblies if the value is expected in different hard regs
+
+The code below is an example of an inline assembly statement using local
+register variables.
+
+.. code-block:: c++
+
+  void foo() {
+    register int *p1 asm ("r0") = bar();
+    register int *p2 asm ("r1") = bar();
+    register int *result asm ("r0");
+    asm ("sysint" : "=r" (result) : "0" (p1), "r" (p2));
+  }
+Below is the same code but using hard register operands.
+
+.. code-block:: c++
+
+  void foo() {
+    int *p1 = bar();
+    int *p2 = bar();
+    int *result;
+    asm ("sysint" : "={r0}" (result) : "0" (p1), "{r1}" (p2));
+  }
+
+
 Objective-C Features
 ====================
 
diff --git a/clang/include/clang/Basic/DiagnosticSemaKinds.td b/clang/include/clang/Basic/DiagnosticSemaKinds.td
index 8e97902564af08..09672eb3865742 100644
--- a/clang/include/clang/Basic/DiagnosticSemaKinds.td
+++ b/clang/include/clang/Basic/DiagnosticSemaKinds.td
@@ -9235,6 +9235,8 @@ let CategoryName = "Inline Assembly Issue" in {
     "more than one input constraint matches the same output '%0'">;
   def err_store_value_to_reg : Error<
     "impossible constraint in asm: can't store value into a register">;
+  def err_asm_hard_reg_variable_duplicate : Error<
+    "hard register operand already defined as register variable">;
 
   def warn_asm_label_on_auto_decl : Warning<
     "ignored asm label '%0' on automatic variable">;
diff --git a/clang/include/clang/Basic/TargetInfo.h b/clang/include/clang/Basic/TargetInfo.h
index 374595edd2ce4a..01d43b838414b7 100644
--- a/clang/include/clang/Basic/TargetInfo.h
+++ b/clang/include/clang/Basic/TargetInfo.h
@@ -1043,9 +1043,17 @@ class TargetInfo : public TransferrableTargetInfo,
   ///
   /// This function is used by Sema in order to diagnose conflicts between
   /// the clobber list and the input/output lists.
+  /// The constraint should already by validated in validateHardRegisterAsmConstraint
+  /// so just do some basic checking
   virtual StringRef getConstraintRegister(StringRef Constraint,
                                           StringRef Expression) const {
-    return "";
+    StringRef Reg = Expression;
+    size_t Start = Constraint.find('{');
+    size_t End = Constraint.find('}');
+    if (Start != StringRef::npos && End != StringRef::npos && End > Start)
+      Reg = Constraint.substr(Start + 1, End - Start - 1);
+
+    return Reg;
   }
 
   struct ConstraintInfo {
@@ -1187,6 +1195,14 @@ class TargetInfo : public TransferrableTargetInfo,
   validateAsmConstraint(const char *&Name,
                         TargetInfo::ConstraintInfo &info) const = 0;
 
+  // Validate the "hard register" inline asm constraint. This constraint is
+  // of the form {<reg-name>}. This constraint is meant to be used
+  // as an alternative for the "register asm" construct to put inline
+  // asm operands into specific registers.
+  bool
+  validateHardRegisterAsmConstraint(const char *&Name,
+                                    TargetInfo::ConstraintInfo &info) const;
+
   bool resolveSymbolicName(const char *&Name,
                            ArrayRef<ConstraintInfo> OutputConstraints,
                            unsigned &Index) const;
diff --git a/clang/lib/Basic/TargetInfo.cpp b/clang/lib/Basic/TargetInfo.cpp
index 5d9055174c089a..2190a8b8bb246d 100644
--- a/clang/lib/Basic/TargetInfo.cpp
+++ b/clang/lib/Basic/TargetInfo.cpp
@@ -770,6 +770,18 @@ bool TargetInfo::validateOutputConstraint(ConstraintInfo &Info) const {
     case 'E':
     case 'F':
       break;  // Pass them.
+    case '{': {
+      // First, check the target parser in case it validates
+      // the {...} constraint differently.
+      if (validateAsmConstraint(Name, Info))
+        return true;
+
+      // If not, that's okay, we will try to validate it
+      // using a target agnostic implementation.
+      if (!validateHardRegisterAsmConstraint(Name, Info))
+        return false;
+      break;
+    }
     }
 
     Name++;
@@ -785,6 +797,36 @@ bool TargetInfo::validateOutputConstraint(ConstraintInfo &Info) const {
   return Info.allowsMemory() || Info.allowsRegister();
 }
 
+bool TargetInfo::validateHardRegisterAsmConstraint(
+    const char *&Name, TargetInfo::ConstraintInfo &Info) const {
+  // First, swallow the '{'.
+  Name++;
+
+  // Mark the start of the possible register name.
+  const char *Start = Name;
+
+  // Loop through rest of "Name".
+  // In this loop, we check whether we have a closing curly brace which
+  // validates the constraint. Also, this allows us to get the correct bounds to
+  // set our register name.
+  while (*Name && *Name != '}')
+    Name++;
+
+  // Missing '}' or if there is anything after '}', return false.
+  if (!*Name || *(Name + 1))
+    return false;
+
+  // Now we set the register name.
+  std::string Register(Start, Name - Start);
+
+  // We validate whether its a valid register to be used.
+  if (!isValidGCCRegisterName(Register))
+    return false;
+
+  Info.setAllowsRegister();
+  return true;
+}
+
 bool TargetInfo::resolveSymbolicName(const char *&Name,
                                      ArrayRef<ConstraintInfo> OutputConstraints,
                                      unsigned &Index) const {
@@ -917,6 +959,18 @@ bool TargetInfo::validateInputConstraint(
     case '!': // Disparage severely.
     case '*': // Ignore for choosing register preferences.
       break;  // Pass them.
+    case '{': {
+      // First, check the target parser in case it validates
+      // the {...} constraint differently.
+      if (validateAsmConstraint(Name, Info))
+        return true;
+
+      // If not, that's okay, we will try to validate it
+      // using a target agnostic implementation.
+      if (!validateHardRegisterAsmConstraint(Name, Info))
+        return false;
+      break;
+    }
     }
 
     Name++;
diff --git a/clang/lib/Basic/Targets/AArch64.h b/clang/lib/Basic/Targets/AArch64.h
index 2dd6b2181e87df..c40ef2a3c13e94 100644
--- a/clang/lib/Basic/Targets/AArch64.h
+++ b/clang/lib/Basic/Targets/AArch64.h
@@ -188,11 +188,6 @@ class LLVM_LIBRARY_VISIBILITY AArch64TargetInfo : public TargetInfo {
                              std::string &SuggestedModifier) const override;
   std::string_view getClobbers() const override;
 
-  StringRef getConstraintRegister(StringRef Constraint,
-                                  StringRef Expression) const override {
-    return Expression;
-  }
-
   int getEHDataRegisterNumber(unsigned RegNo) const override;
 
   bool validatePointerAuthKey(const llvm::APSInt &value) const override;
diff --git a/clang/lib/Basic/Targets/ARM.h b/clang/lib/Basic/Targets/ARM.h
index 71322a094f5edb..9ed452163c6048 100644
--- a/clang/lib/Basic/Targets/ARM.h
+++ b/clang/lib/Basic/Targets/ARM.h
@@ -213,11 +213,6 @@ class LLVM_LIBRARY_VISIBILITY ARMTargetInfo : public TargetInfo {
                              std::string &SuggestedModifier) const override;
   std::string_view getClobbers() const override;
 
-  StringRef getConstraintRegister(StringRef Constraint,
-                                  StringRef Expression) const override {
-    return Expression;
-  }
-
   CallingConvCheckResult checkCallingConvention(CallingConv CC) const override;
 
   int getEHDataRegisterNumber(unsigned RegNo) const override;
diff --git a/clang/lib/Basic/Targets/RISCV.h b/clang/lib/Basic/Targets/RISCV.h
index bfbdafb682c851..89071da7a42776 100644
--- a/clang/lib/Basic/Targets/RISCV.h
+++ b/clang/lib/Basic/Targets/RISCV.h
@@ -70,11 +70,6 @@ class RISCVTargetInfo : public TargetInfo {
 
   std::string_view getClobbers() const override { return ""; }
 
-  StringRef getConstraintRegister(StringRef Constraint,
-                                  StringRef Expression) const override {
-    return Expression;
-  }
-
   ArrayRef<const char *> getGCCRegNames() const override;
 
   int getEHDataRegisterNumber(unsigned RegNo) const override {
diff --git a/clang/lib/Basic/Targets/X86.h b/clang/lib/Basic/Targets/X86.h
index d2232c7d5275ab..e7fb2706d02590 100644
--- a/clang/lib/Basic/Targets/X86.h
+++ b/clang/lib/Basic/Targets/X86.h
@@ -307,7 +307,7 @@ class LLVM_LIBRARY_VISIBILITY X86TargetInfo : public TargetInfo {
       return "di";
     // In case the constraint is 'r' we need to return Expression
     case 'r':
-      return Expression;
+      return TargetInfo::getConstraintRegister(Constraint, Expression);
     // Double letters Y<x> constraints
     case 'Y':
       if ((++I != E) && ((*I == '0') || (*I == 'z')))
diff --git a/clang/lib/CodeGen/CGStmt.cpp b/clang/lib/CodeGen/CGStmt.cpp
index 8898e3f22a7df6..97f61eea620b2a 100644
--- a/clang/lib/CodeGen/CGStmt.cpp
+++ b/clang/lib/CodeGen/CGStmt.cpp
@@ -2183,9 +2183,17 @@ void CodeGenFunction::EmitSwitchStmt(const SwitchStmt &S) {
   CaseRangeBlock = SavedCRBlock;
 }
 
-static std::string
-SimplifyConstraint(const char *Constraint, const TargetInfo &Target,
-                 SmallVectorImpl<TargetInfo::ConstraintInfo> *OutCons=nullptr) {
+static std::string SimplifyConstraint(
+    const char *Constraint, const TargetInfo &Target,
+    SmallVectorImpl<TargetInfo::ConstraintInfo> *OutCons = nullptr) {
+  // If we have only the {...} constraint, do not do any simplifications. This
+  // already maps to the lower level LLVM inline assembly IR that tells the
+  // backend to allocate a specific register. Any validations would have already
+  // been done in the Sema stage or will be done in the AddVariableConstraints
+  // function.
+  if (Constraint[0] == '{' || (Constraint[0] == '&' && Constraint[1] == '{'))
+    return std::string(Constraint);
+
   std::string Result;
 
   while (*Constraint) {
@@ -2232,37 +2240,94 @@ SimplifyConstraint(const char *Constraint, const TargetInfo &Target,
 
   return Result;
 }
+/// Is it valid to apply a register constraint for a variable marked with
+/// the "register asm" construct?
+/// Optionally, if it is determined that we can, we set "Register" to the
+/// regiser name.
+static bool
+ShouldApplyRegisterVariableConstraint(const Expr &AsmExpr,
+                                      std::string *Register = nullptr) {
 
-/// AddVariableConstraints - Look at AsmExpr and if it is a variable declared
-/// as using a particular register add that as a constraint that will be used
-/// in this asm stmt.
-static std::string
-AddVariableConstraints(const std::string &Constraint, const Expr &AsmExpr,
-                       const TargetInfo &Target, CodeGenModule &CGM,
-                       const AsmStmt &Stmt, const bool EarlyClobber,
-                       std::string *GCCReg = nullptr) {
   const DeclRefExpr *AsmDeclRef = dyn_cast<DeclRefExpr>(&AsmExpr);
   if (!AsmDeclRef)
-    return Constraint;
+    return false;
   const ValueDecl &Value = *AsmDeclRef->getDecl();
   const VarDecl *Variable = dyn_cast<VarDecl>(&Value);
   if (!Variable)
-    return Constraint;
+    return false;
   if (Variable->getStorageClass() != SC_Register)
-    return Constraint;
+    return false;
   AsmLabelAttr *Attr = Variable->getAttr<AsmLabelAttr>();
   if (!Attr)
+    return false;
+
+  if (Register != nullptr)
+    // Set the register to return from Attr.
+    *Register = Attr->getLabel().str();
+  return true;
+}
+
+/// AddVariableConstraints:
+/// Look at AsmExpr and if it is a variable declared as using a particular
+/// register add that as a constraint that will be used in this asm stmt.
+/// Whether it can be used or not is dependent on querying
+/// ShouldApplyRegisterVariableConstraint() Also check whether the "hard
+/// register" inline asm constraint (i.e. "{reg-name}") is specified. If so, add
+/// that as a constraint that will be used in this asm stmt.
+static std::string
+AddVariableConstraints(const std::string &Constraint, const Expr &AsmExpr,
+                       const TargetInfo &Target, CodeGenModule &CGM,
+                       const AsmStmt &Stmt, const bool EarlyClobber,
+                       std::string *GCCReg = nullptr) {
+
+  // Do we have the "hard register" inline asm constraint.
+  bool ApplyHardRegisterConstraint =
+      Constraint[0] == '{' || (EarlyClobber && Constraint[1] == '{');
+
+  // Do we have "register asm" on a variable.
+  std::string Reg = "";
+  bool ApplyRegisterVariableConstraint =
+      ShouldApplyRegisterVariableConstraint(AsmExpr, &Reg);
+
+  // Diagnose the scenario where we apply both the register variable constraint
+  // and a hard register variable constraint as an unsupported error.
+  // Why? Because we could have a situation where the register passed in through
+  // {...} and the register passed in through the "register asm" construct could
+  // be different, and in this case, there's no way for the compiler to know
+  // which one to emit.
+  if (ApplyHardRegisterConstraint && ApplyRegisterVariableConstraint) {
+    CGM.getDiags().Report(AsmExpr.getExprLoc(),
+                          diag::err_asm_hard_reg_variable_duplicate);
     return Constraint;
-  StringRef Register = Attr->getLabel();
-  assert(Target.isValidGCCRegisterName(Register));
+  }
+
+  if (!ApplyHardRegisterConstraint && !ApplyRegisterVariableConstraint)
+    return Constraint;
+
   // We're using validateOutputConstraint here because we only care if
   // this is a register constraint.
   TargetInfo::ConstraintInfo Info(Constraint, "");
-  if (Target.validateOutputConstraint(Info) &&
-      !Info.allowsRegister()) {
+  if (Target.validateOutputConstraint(Info) && !Info.allowsRegister()) {
     CGM.ErrorUnsupported(&Stmt, "__asm__");
     return Constraint;
   }
+
+  if (ApplyHardRegisterConstraint) {
+    int Start = EarlyClobber ? 2 : 1;
+    int End = Constraint.find('}');
+    Reg = Constraint.substr(Start, End - Start);
+    // If we don't have a valid register name, simply return the constraint.
+    // For example: There are some targets like X86 that use a constraint such
+    // as "@cca", which is validated and then converted into {@cca}. Now this
+    // isn't necessarily a "GCC Register", but in terms of emission, it is
+    // valid since it lowered appropriately in the X86 backend. For the {..}
+    // constraint, we shouldn't be too strict and error out if the register
+    // itself isn't a valid "GCC register".
+    if (!Target.isValidGCCRegisterName(Reg))
+      return Constraint;
+  }
+
+  StringRef Register(Reg);
   // Canonicalize the register here before returning it.
   Register = Target.getNormalizedGCCRegisterName(Register);
   if (GCCReg != nullptr)
diff --git a/clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c b/clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c
index 754d7e66f04b24..60237c81fd7298 100644
--- a/clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c
+++ b/clang/test/CodeGen/SystemZ/systemz-inline-asm-02.c
@@ -5,9 +5,15 @@
 // Test that an error is given if a physreg is defined by multiple operands.
 int test_physreg_defs(void) {
   register int l __asm__("r7") = 0;
+  int m;
 
   // CHECK: error: multiple outputs to hard register: r7
-  __asm__("" : "+r"(l), "=r"(l));
+  __asm__(""
+          : "+r"(l), "=r"(l));
 
-  return l;
+  // CHECK: error: multiple outputs to hard register: r6
+  __asm__(""
+          : "+{r6}"(m), "={r6}"(m));
+
+  return l + m;
 }
diff --git a/clang/test/CodeGen/SystemZ/systemz-inline-asm.c b/clang/test/CodeGen/SystemZ/systemz-inline-asm.c
index e38d37cd345e26..a3b47700dc30bb 100644
--- a/clang/test/CodeGen/SystemZ/systemz-inline-asm.c
+++ b/clang/test/CodeGen/SystemZ/systemz-inline-asm.c
@@ -134,12 +134,25 @@ long double test_f128(long double f, long double g) {
 int test_physregs(void) {
   // CHECK-LABEL: define{{.*}} signext i32 @test_physregs()
   register int l __asm__("r7") = 0;
+  int m = 0;
 
   // CHECK: call i32 asm "lr $0, $1", "={r7},{r7}"
-  __asm__("lr %0, %1" : "+r"(l));
+  __asm__("lr %0, %1"
+          : "+r"(l));
 
   // CHECK: call i32 asm "$0 $1 $2", "={r7},{r7},{r7}"
-  __asm__("%0 %1 %2" : "+r"(l) : "r"(l));
+  __asm__("%0 %1 %2"
+          : "+r"(l)
+          : "r"(l));
 
-  return l;
+  // CHECK: call i32 asm "lr $0, $1", "={r6},{r6}"
+  __asm__("lr %0, %1"
+          : "+{r6}"(m));
+
+  // CHECK: call i32 asm "$0 $1 $2", "={r6},{r6},{r6}"
+  __asm__("%0 %1 %2"
+          : "+{r6}"(m)
+          : "{r6}"(m));
+
+  return l + m;
 }
diff --git a/clang/test/CodeGen/aarch64-inline-asm.c b/clang/test/CodeGen/aarch64-inline-asm.c
index 8ddee560b11da4..860cc858275ea6 100644
--- a/clang/test/CodeGen/aarch64-inline-asm.c
+++ b/clang/test/CodeGen/aarch64-inline-asm.c
@@ -77,7 +77,15 @@ void test_gcc_registers(void) {
 
 void test_tied_earlyclobber(void) {
   register int a asm("x1");
-  asm("" : "+&r"(a));
+  asm(""
+      : "+&r"(a));
+  // CHECK: call i32 asm "", "=&{x1},0"(i32 %0)
+}
+
+void test_tied_earlyclobber2(void) {
+  int a;
+  asm(""
+      : "+&{x1}"(a));
   // CHECK: call i32 asm "", "=&{x1},0"(i32 %0)
 }
 
@@ -102,4 +110,4 @@ void test_sme_constraints(){
 
   asm("movt zt0[3, mul vl], z0" : : : "zt0");
 // CHECK: call void asm sideeffect "movt zt0[3, mul vl], z0", "~{zt0}"()
-}
\ No newline at end of file
+}
diff --git a/clang/test/CodeGen/asm-goto.c b/clang/test/CodeGen/asm-goto.c
index 4037c1b2a3d7a2..77bd77615f2998 100644
--- a/clang/test/CodeGen/asm-goto.c
+++ b/clang/test/CodeGen/asm-goto.c
@@ -55,14 +55,14 @@ int test3(int out1, int out2) {
 
 int test4(int out1, int out2) {
   // CHECK-LABEL: define{{.*}} i32 @test4(
-  // CHECK: callbr { i32, i32 } asm sideeffect "jne ${5:l}", "={si},={di},r,0,1,!i,!i
+  // CHECK: callbr { i32, i32 } asm sideeffect "jne ${5:l}", "={si},={di},r,{si},{di},!i,!i
   // CHECK: to label %asm.fallthrough [label %label_true.split, label %loop.split]
   // CHECK-LABEL: asm.fallthrough:
   if (out1 < out2)
     asm volatile goto("jne %l5" : "+S"(out1), "+D"(out2) : "r"(out1) :: label_true, loop);
   else
     asm volatile goto("jne %l7" : "+S"(out1), "+D"(out2) : "r"(out1), "r"(out2) :: label_true, loop);
-  // CHECK: callbr { i32, i32 } asm sideeffect "jne ${7:l}", "={si},={di},r,r,0,1,!i,!i
+  // CHECK: callbr { i32, i32 } asm sideeffect "jne ${7:l}", "={si},={di},r,r,{si},{di},!i,!i
   // CHECK: to label %asm.fallthrough6 [label %label_true.split11, label %loop.split14]
   // CHECK-LABEL: asm.fallthrough6:
   return out1 + out2;
@@ -92,7 +92,7 @@ int test5(int addr, int size, int limit) {
 
 int test6(int out1) {
   // CHECK-LABEL: define{{.*}} i32 @test6(
-  // CHECK: callbr i32 asm sideeffect "testl $0, $0; testl $1, $1; jne ${3:l}", "={si},r,0,!i,!i,{{.*}}
+  // CHECK: callbr i32 asm sideeffect "testl $0, $0; testl $1, $1; jne ${3:l}", "={si},r,{si},!i,!i,{{.*}}
   // CHECK: to label %asm.fallthrough [label %label_true.split, label %landing.split]
   // CHECK-LABEL: asm.fallthrough:
   // CHECK-LABEL: landing:
diff --git a/clang/test/CodeGen/ms-intrinsics.c b/clang/test/CodeGen/ms-intrinsics.c
index 5bb003d1f91fc0..375258ca609675 100644
--- a/clang/test/CodeGen/ms-intrinsics.c
+++ b/clang/test/CodeGen/ms-intrinsics.c
@@ -36,12 +36,12 @@ void test__movsb(unsigned char *Dest, unsigned char *Src, size_t Count) {
   return __movsb(Dest, Src, Count);
 }
 // CHECK-I386-LABEL: define{{.*}} void @test__movsb
-// CHECK-I386:   tail call { ptr, ptr, i32 } asm sideeffect "xchg $(%esi, $1$|$1, esi$)\0Arep movsb\0Axchg ...
[truncated]

@github-actions
Copy link

github-actions bot commented Mar 19, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@tltao
Copy link
Contributor Author

tltao commented Mar 19, 2024

@jyknight
Copy link
Member

Thanks for reviving the change!

I'm definitely in favor of this, but since it adds no new functionality (only better usability), the value proposition is significantly lessened (IMO, to the point where it may not be be worthwhile to do) if it gets implemented ONLY in clang. So I'd love to see some movement on the GCC side, too. Maybe revive the existing discussion thread?

statement which improves readability and solves a few other issues experienced
by local register variables, such as:

* function calls might clobber register variables
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This isn't actually an issue (at least in LLVM), since local register variables are not actually assigned to the named register except when passed/returned from inline-asm statements.

case 'E':
case 'F':
break; // Pass them.
case '{': {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's unclear to me whether this uses the same register parsing logic as the existing named-register asm on variables. It looks like it's going to do something different, which worries me.

I think we ought to be accepting the exact same names in both syntaxes, on all platforms. Can you confirm if that's actually the case with this PR?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, we should be using the same names. The register asm variable is parsed in clang/lib/Sema/SemaDecl.cpp using:

case SC_Register:
  // Local Named register
  if (!Context.getTargetInfo().isValidGCCRegisterName(Label) &&
      DeclAttrsMatchCUDAMode(getLangOpts(), getCurFunctionDecl()))
    Diag(E->getExprLoc(), diag::err_asm_unknown_register_name) << Label;
  break;

Tony Tao added 6 commits December 3, 2025 15:05
https://reviews.llvm.org/D105142

The main idea is to allow Clang to support the ability to
specify specific hardware registers in inline assembly
constraints via the use of curly braces ``{}``. As such,
this is mainly a Clang change.

Relevant RFCs posted here

https://lists.llvm.org/pipermail/llvm-dev/2021-June/151370.html
https://gcc.gnu.org/pipermail/gcc/2021-June/236269.html
@tltao
Copy link
Contributor Author

tltao commented Dec 5, 2025

Ping. Restarting this topic as GCC have implemented the same feature. https://gcc.gnu.org/onlinedocs/gcc/Hard-Register-Constraints.html

@tltao tltao requested a review from perry-ca December 5, 2025 21:37
Copy link
Collaborator

@efriedma-quic efriedma-quic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add tests for multiple constraints, like void f() { asm(""::"i{x0}"(0)); }.

@tltao tltao self-assigned this Dec 8, 2025
@tltao
Copy link
Contributor Author

tltao commented Dec 8, 2025

Please add tests for multiple constraints, like void f() { asm(""::"i{x0}"(0)); }.

I think this raises an interesting question of how we want to resolve this pattern. If a variable named in this constraint is also set via register asm. E.g.

register int y asm ("r1") = ...
asm ("..." : "rm" (y));

Clang currently overwrites rm with {r1}. However, we currently have no mechanism to do this with hard registers, so rm{r1} would be passed down to InlineAsmLowering. At this point, TargetLowering::getConstraintPreferences follows getConstraintPiority which places a specific register at a lower priority than a register class, so we end up picking any generic register of class r, instead of r1.

IMO, this goes against the intended use of hard registers, so I think we should disallow this in the front end, and error out if it detects {} in the constraint string with other constraints. But I'm open to other ideas.

@efriedma-quic
Copy link
Collaborator

IMO, this goes against the intended use of hard registers, so I think we should disallow this in the front end, and error out if it detects {} in the constraint string with other constraints. But I'm open to other ideas.

If we can get away with rejecting it, sure, fine. gcc accepts it, but not sure if anyone is actually using it that way.

If you do want to reject, please make sure we reject gracefully at both the clang level and the LLVM IR level.

@JonPsson1 JonPsson1 removed their request for review December 8, 2025 22:14
@tltao
Copy link
Contributor Author

tltao commented Dec 8, 2025

If you do want to reject, please make sure we reject gracefully at both the clang level and the LLVM IR level.

Is there a reason we also need to check it in the LLVM IR level if we are already checking for it in clang? This PR only affects the Clang output so I don't know for sure if there's another path to generate the LLVM IR pattern.

@efriedma-quic
Copy link
Collaborator

Is there a reason we also need to check it in the LLVM IR level if we are already checking for it in clang?

We want to be friendly to people writing non-clang frontends.

@uweigand
Copy link
Member

uweigand commented Dec 9, 2025

IMO, this goes against the intended use of hard registers, so I think we should disallow this in the front end, and error out if it detects {} in the constraint string with other constraints. But I'm open to other ideas.

Not sure why we have to fully reject this. In GCC, using a register asm in an operand with hard register constraints is allowed if the register asm is compatible with the constraint. When there are multiple constraints, I think it could work the same way - if the register asm is compatible with at least one of the constraints, it is OK (and we effectively force the register from the register asm), otherwise we should error out.

@stefan-sf-ibm , the GCC docs do not specifically go into this case. Can you confirm how GCC handles register asm in combination with multiple constraints including hard register constraints?

@stefan-sf-ibm
Copy link
Contributor

Currently I've implemented/documented constraints consisting of single hard register constraints only since this is the most basic feature and I'm still running into RA problems in certain cases for those. Once those are fixed I would like to move on and have a look at how to implement multiple constraints involving hard register constraints. I haven't fully wrapped my head around those. There are certainly cases like constraint {r5}r should be equivalent to r, if register r5 is contained in the register class associated with constraint r which is more obvious than maybe other cases. In general, having the possibility to come up with "ad-hoc" register classes via hard register constraints as e.g. {r3}{r4}{r7} is valuable (one could also think of syntactic sugar like having a set-like notation {r3,r4,r7}).

This brings me to register asm and hard register constraints. If the constraint is a single hard register constraint and the operand a register asm object and both coincide in the register, then I don't see a reason to reject this. That being said, currently I accept this in GCC and as you said, if multiple constraints are involved and at least one is compatible with the register asm, then this could work, too. Though, this is where it becomes a bit more complicated since GCC and Clang do not always coincide here. Assume f5 is not contained in the register class associated with constraint r, then for

register long x asm ("f5");
asm ("" : "=r" (x));

we have that GCC assigns to the operand some register which is compatible with constraint r and therefore is different than f5 coming from register asm, whereas Clang assigns register f5 coming from register asm and ignores the constraint. That being said I haven't come up with a complete set of rules here but I certainly would love to see GCC and Clang being compatible here!

@tltao
Copy link
Contributor Author

tltao commented Dec 9, 2025

There are certainly cases like constraint {r5}r should be equivalent to r, if register r5 is contained in the register class associated with constraint r which is more obvious than maybe other cases.

Is there an example of a use case that looks something like {r5}r? And would we lose any functionality if we enforced that hard registers cannot be combined with other constraints?

To me, the constraint {r5}r does not make sense. If we are telling the compiler, through the use of inline asm hard register, that we know exactly what register we want to generate for a specific instruction operand, then why would we want the compiler to override that decision? If {r5}r is equivalent to r, then what is the purpose of accepting {r5}? I have no insight into the GCC implementation/history, but one guess that perhaps this was allowed prior to hard registers because there's simply no way to re-associate a variable with a different register if it's already attached to a specific register via register asm?

With hard register however, we are able to fine tune inline asm much more precisely, so to me, allowing something like r{r5} can cause more confusion, since you would expect {r5} to have some meaning, but it is instead quietly discarded?

@efriedma-quic
Copy link
Collaborator

efriedma-quic commented Dec 9, 2025

An example that might be helpful (using x86 asm):

int shift(int x, char amt) {
    asm("shll %1, %0" : "+r"(x) : "{cl}i"(amt));
    return x;
}
int a(int x) {
    return shift(x, 1);
}

Here, you want one of two choices: either an immediate, or the register "cl".

I can't think of any cases where you'd want to specify both a specific register and a generic register class, though.

@uweigand
Copy link
Member

I can image cases where multiple register constraints may still be useful; e.g. to express that we can use any one of several hard-coded registers, or we can use a hard-coded register or some register class that doesn't contain that hard register.

It is of course superfluous to specific both a hard register and a register class containing this register. But that's really the same issue as specifying two classes where one is a superclass of the other - and we don't explicitly forbid this either.

@tltao
Copy link
Contributor Author

tltao commented Dec 10, 2025

Thanks for all the comments. I think from @stefan-sf-ibm's response, it sounds like the GCC side is not fully decided on what to do with multiple constraints involving hard registers, so we can wait for the details to be hashed out further there.

In the meantime, I plan on making a few changes:

  • Modify the current register asm implementation in Clang to append any constraints to the register asm register. This follows the GCC implementation. One caveat is there's no type checking in Clang for constraints, so we have no way of excluding the register asm register if there's a mismatch. However, because TargetLowering::C_RegisterClass has higher priority than TargetLowering::C_Register, we will pick the register class constraint instead of the register asm constraint in TargetLowering.
  • Allow Clang to accept multiple constraints involving hard registers. As mentioned before, there is no real logic in TargetLowering to pick the "best" choice. It simply selects based on a fixed priority given to the different constraint types, and if all the constraints are the same priority, then it just picks the first one. I think it's outside the scope of this PR to do any work here, so I want to leave it as is.
  • Add tests for these scenarios.

Does this sound reasonable?

@efriedma-quic
Copy link
Collaborator

Modify the current register asm implementation in Clang to append any constraints to the register asm register. This follows the GCC implementation. One caveat is there's no type checking in Clang for constraints, so we have no way of excluding the register asm register if there's a mismatch. However, because TargetLowering::C_RegisterClass has higher priority than TargetLowering::C_Register, we will pick the register class constraint instead of the register asm constraint in TargetLowering.

I'm not sure I follow what you mean here; does this mean that if you have a register asm variable, it might not end up in that register? That seems like a regression for existing code.


Other bits seem fine.

@tltao
Copy link
Contributor Author

tltao commented Dec 10, 2025

I'm not sure I follow what you mean here; does this mean that if you have a register asm variable, it might not end up in that register? That seems like a regression for existing code.

Ah, good point. I was thinking more in line of mismatching register classes, but you're right, with our current TargetLowering logic, it's not possible to combine register asm with any other constraints.

@jyknight
Copy link
Member

A register-asm variable declaration passed to an asm statement should definitely override the constraints listed on the asm statement itself, forcing the use of the specified register. Doing otherwise is extremely surprising behavior! That said, it'd (probably) be OK to report a compiler error if the constraints listed on the asm statement would not otherwise permit the listed register. (But such a behavior change is pretty much unrelated to this PR)

Agreed that we should support multiple hard register constraints, or mixed hard and other register constraints specified on the asm statement. And, yes, we should support that at the same level as we support multiple constraints today (which is to say "we adhere to the specified constraints, but don't necessarily make GOOD choices").

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backend:AArch64 backend:ARM backend:RISC-V backend:SystemZ clang:codegen IR generation bugs: mangling, exceptions, etc. clang:frontend Language frontend issues, e.g. anything involving "Sema" clang Clang issues not falling into any other category

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants