Skip to content

Conversation

@ericastor
Copy link
Contributor

MASM allows no-op casts on register operands; for example, DWORD PTR eax is a legal expression. Any other cast is an error.

Fixes #132084

MASM allows no-op casts on register operands; for example, `DWORD PTR eax` is a legal expression. Any other cast is an error.

Fixes llvm#132084
@llvmbot
Copy link
Member

llvmbot commented Mar 24, 2025

@llvm/pr-subscribers-backend-x86

Author: Eric Astor (ericastor)

Changes

MASM allows no-op casts on register operands; for example, DWORD PTR eax is a legal expression. Any other cast is an error.

Fixes #132084


Full diff: https://github.com/llvm/llvm-project/pull/132751.diff

3 Files Affected:

  • (modified) llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp (+50-6)
  • (added) llvm/test/tools/llvm-ml/cast.asm (+25)
  • (added) llvm/test/tools/llvm-ml/cast_errors.asm (+41)
diff --git a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
index a6285a55f4155..475b0016ace68 100644
--- a/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
+++ b/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp
@@ -14,6 +14,9 @@
 #include "MCTargetDesc/X86TargetStreamer.h"
 #include "TargetInfo/X86TargetInfo.h"
 #include "X86Operand.h"
+#include "third_party/llvm/llvm-project/llvm/include/llvm/ADT/StringRef.h"
+#include "third_party/llvm/llvm-project/llvm/include/llvm/MC/MCRegister.h"
+#include "third_party/llvm/llvm-project/llvm/lib/Target/X86/X86RegisterInfo.h"
 #include "llvm/ADT/STLExtras.h"
 #include "llvm/ADT/SmallString.h"
 #include "llvm/ADT/SmallVector.h"
@@ -38,6 +41,7 @@
 #include "llvm/Support/SourceMgr.h"
 #include "llvm/Support/raw_ostream.h"
 #include <algorithm>
+#include <cstdint>
 #include <memory>
 
 using namespace llvm;
@@ -1150,7 +1154,7 @@ class X86AsmParser : public MCTargetAsmParser {
 
   X86::CondCode ParseConditionCode(StringRef CCode);
 
-  bool ParseIntelMemoryOperandSize(unsigned &Size);
+  bool ParseIntelMemoryOperandSize(unsigned &Size, StringRef *SizeStr);
   bool CreateMemForMSInlineAsm(MCRegister SegReg, const MCExpr *Disp,
                                MCRegister BaseReg, MCRegister IndexReg,
                                unsigned Scale, bool NonAbsMem, SMLoc Start,
@@ -2551,7 +2555,8 @@ bool X86AsmParser::ParseMasmOperator(unsigned OpKind, int64_t &Val) {
   return false;
 }
 
-bool X86AsmParser::ParseIntelMemoryOperandSize(unsigned &Size) {
+bool X86AsmParser::ParseIntelMemoryOperandSize(unsigned &Size,
+                                               StringRef *SizeStr) {
   Size = StringSwitch<unsigned>(getTok().getString())
     .Cases("BYTE", "byte", 8)
     .Cases("WORD", "word", 16)
@@ -2569,6 +2574,8 @@ bool X86AsmParser::ParseIntelMemoryOperandSize(unsigned &Size) {
     .Cases("ZMMWORD", "zmmword", 512)
     .Default(0);
   if (Size) {
+    if (SizeStr)
+      *SizeStr = getTok().getString();
     const AsmToken &Tok = Lex(); // Eat operand size (e.g., byte, word).
     if (!(Tok.getString() == "PTR" || Tok.getString() == "ptr"))
       return Error(Tok.getLoc(), "Expected 'PTR' or 'ptr' token!");
@@ -2577,6 +2584,31 @@ bool X86AsmParser::ParseIntelMemoryOperandSize(unsigned &Size) {
   return false;
 }
 
+uint16_t RegSizeInBits(MCRegister RegNo, const MCRegisterInfo &MRI) {
+  uint16_t Size = 0;
+  if (X86MCRegisterClasses[X86::GR8RegClassID].contains(RegNo))
+    Size = 8;
+  else if (X86MCRegisterClasses[X86::GR16RegClassID].contains(RegNo))
+    Size = 16;
+  else if (X86MCRegisterClasses[X86::GR32RegClassID].contains(RegNo))
+    Size = 32;
+  else if (X86MCRegisterClasses[X86::GR64RegClassID].contains(RegNo))
+    Size = 64;
+  else if (X86MCRegisterClasses[X86::RFP80RegClassID].contains(RegNo))
+    Size = 80;
+  else if (X86MCRegisterClasses[X86::VR128RegClassID].contains(RegNo))
+    Size = 128;
+  else if (X86MCRegisterClasses[X86::VR128XRegClassID].contains(RegNo))
+    Size = 128;
+  else if (X86MCRegisterClasses[X86::VR256XRegClassID].contains(RegNo))
+    Size = 256;
+  else if (X86MCRegisterClasses[X86::VR512RegClassID].contains(RegNo))
+    Size = 512;
+  else
+    llvm_unreachable("Register without known register class");
+  return Size;
+}
+
 bool X86AsmParser::parseIntelOperand(OperandVector &Operands, StringRef Name) {
   MCAsmParser &Parser = getParser();
   const AsmToken &Tok = Parser.getTok();
@@ -2584,7 +2616,8 @@ bool X86AsmParser::parseIntelOperand(OperandVector &Operands, StringRef Name) {
 
   // Parse optional Size directive.
   unsigned Size;
-  if (ParseIntelMemoryOperandSize(Size))
+  StringRef SizeStr;
+  if (ParseIntelMemoryOperandSize(Size, &SizeStr))
     return true;
   bool PtrInOperand = bool(Size);
 
@@ -2601,9 +2634,20 @@ bool X86AsmParser::parseIntelOperand(OperandVector &Operands, StringRef Name) {
       return Error(Start, "rip can only be used as a base register");
     // A Register followed by ':' is considered a segment override
     if (Tok.isNot(AsmToken::Colon)) {
-      if (PtrInOperand)
-        return Error(Start, "expected memory operand after 'ptr', "
-                            "found register operand instead");
+      if (PtrInOperand) {
+        if (!Parser.isParsingMasm())
+          return Error(Start, "expected memory operand after 'ptr', "
+                              "found register operand instead");
+
+        // If we are parsing MASM, we are allowed to cast registers to their own
+        // sizes, but not to other types.
+        if (RegSizeInBits(RegNo, *getContext().getRegisterInfo()) != Size)
+          return Error(
+              Start,
+              "cannot cast register '" +
+                  StringRef(getContext().getRegisterInfo()->getName(RegNo)) +
+                  "' to '" + SizeStr + "'; size does not match");
+      }
       Operands.push_back(X86Operand::CreateReg(RegNo, Start, End));
       return false;
     }
diff --git a/llvm/test/tools/llvm-ml/cast.asm b/llvm/test/tools/llvm-ml/cast.asm
new file mode 100644
index 0000000000000..2b4aaae88866e
--- /dev/null
+++ b/llvm/test/tools/llvm-ml/cast.asm
@@ -0,0 +1,25 @@
+; RUN: llvm-ml -m64 -filetype=s %s /Fo - | FileCheck %s
+
+.code
+
+mov byte ptr al, al
+mov al, byte ptr al
+; CHECK: mov al, al
+; CHECK-NEXT: mov al, al
+
+mov word ptr ax, ax
+mov ax, word ptr ax
+; CHECK: mov ax, ax
+; CHECK-NEXT: mov ax, ax
+
+mov dword ptr eax, eax
+mov eax, dword ptr eax
+; CHECK: mov eax, eax
+; CHECK-NEXT: mov eax, eax
+
+mov qword ptr rax, rax
+mov rax, qword ptr rax
+; CHECK: mov rax, rax
+; CHECK-NEXT: mov rax, rax
+
+END
diff --git a/llvm/test/tools/llvm-ml/cast_errors.asm b/llvm/test/tools/llvm-ml/cast_errors.asm
new file mode 100644
index 0000000000000..60cd9a4454ed8
--- /dev/null
+++ b/llvm/test/tools/llvm-ml/cast_errors.asm
@@ -0,0 +1,41 @@
+; RUN: not llvm-ml -m64 -filetype=s %s /Fo /dev/null 2>&1 | FileCheck %s
+
+.code
+
+mov word ptr al, ax
+; CHECK: error: cannot cast register 'AL' to 'word'; size does not match
+
+mov dword ptr al, eax
+; CHECK: error: cannot cast register 'AL' to 'dword'; size does not match
+
+mov qword ptr al, rax
+; CHECK: error: cannot cast register 'AL' to 'qword'; size does not match
+
+mov byte ptr ax, al
+; CHECK: error: cannot cast register 'AX' to 'byte'; size does not match
+
+mov dword ptr ax, eax
+; CHECK: error: cannot cast register 'AX' to 'dword'; size does not match
+
+mov qword ptr ax, rax
+; CHECK: error: cannot cast register 'AX' to 'qword'; size does not match
+
+mov byte ptr eax, al
+; CHECK: error: cannot cast register 'EAX' to 'byte'; size does not match
+
+mov word ptr eax, ax
+; CHECK: error: cannot cast register 'EAX' to 'word'; size does not match
+
+mov qword ptr eax, rax
+; CHECK: error: cannot cast register 'EAX' to 'qword'; size does not match
+
+mov byte ptr rax, al
+; CHECK: error: cannot cast register 'RAX' to 'byte'; size does not match
+
+mov word ptr rax, ax
+; CHECK: error: cannot cast register 'RAX' to 'word'; size does not match
+
+mov dword ptr rax, eax
+; CHECK: error: cannot cast register 'RAX' to 'dword'; size does not match
+
+END

@ericastor ericastor marked this pull request as draft March 24, 2025 14:54
@github-actions
Copy link

github-actions bot commented Mar 24, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

@ericastor ericastor marked this pull request as ready for review March 24, 2025 18:27
@ericastor ericastor requested review from MaskRay, compnerd and rnk March 24, 2025 18:27
@compnerd
Copy link
Member

This seems pretty close! I think that once we get another pair of eyes on the diagnostics, this should be ready.

Copy link
Collaborator

@AaronBallman AaronBallman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW, the diagnostic text looks reasonable to me.

@MisterDA
Copy link

MisterDA commented Sep 6, 2025

Gentle ping! IIUC this PR has been reviewed and approved, is it just a rebase away from being merged?

@ericastor ericastor enabled auto-merge (squash) September 8, 2025 14:07
@ericastor ericastor merged commit 344708c into llvm:main Sep 8, 2025
9 checks passed
@nikic
Copy link
Contributor

nikic commented Sep 8, 2025

@ericastor
Copy link
Contributor Author

Not sure what's going on here, but this change increases the ThinLTO link time for clang by 2.5% (https://llvm-compile-time-tracker.com/compare_clang.php?from=40edc34782b3a804639042c2981c3c4ae6661a92&to=344708c2a3de005060242a77ccfea753406af8d0&stat=instructions%3Au&sortBy=interestingness).

All I can think is that we brought in "X86RegisterInfo.h"? But that doesn't seem like it should be dramatic.

@nikic
Copy link
Contributor

nikic commented Sep 9, 2025

Not sure what's going on here, but this change increases the ThinLTO link time for clang by 2.5% (https://llvm-compile-time-tracker.com/compare_clang.php?from=40edc34782b3a804639042c2981c3c4ae6661a92&to=344708c2a3de005060242a77ccfea753406af8d0&stat=instructions%3Au&sortBy=interestingness).

All I can think is that we brought in "X86RegisterInfo.h"? But that doesn't seem like it should be dramatic.

I've checked that it's not the header. I expect that something in this change is exposing pathological behavior in LLVM.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ms] [llvm-ml] Allow size directives on register operands

8 participants