@eugeneepshteyn
Contributor

When REAL types are constant folded, the underlying implementation uses arrays of integers. Ensure that these arrays are properly aligned.

This matters when building flang with clang. In some cases, the code generated for the flang compiler itself used SSE2 aligned load instructions for REAL(16) constant folding on x86_64, and these instructions require that their operands be loaded from aligned addresses.
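
To illustrate the alignment issue described above, here is a minimal sketch. It is not code from flang: `Parts128`, `AlignedParts128`, `isAligned16`, and `demo` are hypothetical names. It assumes a 128-bit value stored as four 32-bit parts and the 16-byte alignment that aligned SSE2 loads such as `movdqa` require. By default such a struct is only as aligned as its element type, so nothing guarantees a 16-byte boundary unless one is requested with `alignas`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a 128-bit value held as an array of 32-bit parts.
// Its alignment is just that of std::uint32_t, so an instance may sit at an
// address that is not a multiple of 16.
struct Parts128 {
  std::uint32_t part[4];
};

// Same layout, but with the 16-byte alignment requested explicitly.
struct alignas(16) AlignedParts128 {
  std::uint32_t part[4];
};

static_assert(sizeof(Parts128) == 16);
static_assert(alignof(Parts128) == alignof(std::uint32_t));
static_assert(alignof(AlignedParts128) == 16);

// True when the address is a multiple of 16 bytes.
bool isAligned16(const void *p) {
  return reinterpret_cast<std::uintptr_t>(p) % 16 == 0;
}

// An over-aligned object is always placed on a 16-byte boundary.
void demo() {
  AlignedParts128 value{};
  assert(isAligned16(&value));
}
```

Whether a compiler actually emits aligned loads for the unaligned variant depends on the compiler, the optimization level, and the surrounding code, which is consistent with the "in some cases" wording above.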

@eugeneepshteyn eugeneepshteyn marked this pull request as ready for review July 17, 2025 23:15
@llvmbot llvmbot added flang Flang issues not falling into any other category flang:semantics labels Jul 17, 2025
@llvmbot
Member

llvmbot commented Jul 17, 2025

@llvm/pr-subscribers-flang-semantics

Author: Eugene Epshteyn (eugeneepshteyn)

Changes

When REAL types are constant folded, the underlying implementation uses arrays of integers. Ensure that these arrays are properly aligned.

This matters when building flang with clang. In some cases, the code generated for the flang compiler itself used SSE2 aligned load instructions for REAL(16) constant folding on x86_64, and these instructions require that their operands be loaded from aligned addresses.


Full diff: https://github.com/llvm/llvm-project/pull/149381.diff

2 Files Affected:

  • (modified) flang/include/flang/Evaluate/integer.h (+1)
  • (modified) flang/include/flang/Evaluate/real.h (+4-1)
diff --git a/flang/include/flang/Evaluate/integer.h b/flang/include/flang/Evaluate/integer.h
index fccc2ad774a8f..5953fc81cb111 100644
--- a/flang/include/flang/Evaluate/integer.h
+++ b/flang/include/flang/Evaluate/integer.h
@@ -74,6 +74,7 @@ class Integer {
   static_assert(std::is_unsigned_v<BigPart>);
   static_assert(CHAR_BIT * sizeof(BigPart) >= 2 * partBits);
   static constexpr bool littleEndian{IS_LITTLE_ENDIAN};
+  static constexpr int alignment{ALIGNMENT};
 
 private:
   static constexpr int maxPartBits{CHAR_BIT * sizeof(Part)};
diff --git a/flang/include/flang/Evaluate/real.h b/flang/include/flang/Evaluate/real.h
index 03294881850a1..76d25d9fe2670 100644
--- a/flang/include/flang/Evaluate/real.h
+++ b/flang/include/flang/Evaluate/real.h
@@ -490,7 +490,10 @@ template <typename WORD, int PREC> class Real {
       bool isNegative, int exponent, const Fraction &, Rounding, RoundingBits,
       bool multiply = false);
 
-  Word word_{}; // an Integer<>
+  // Require alignment, in case code generation on x86_64 decides that our
+  // Real object is suitable for SSE2 instructions and then gets surprised
+  // by unaligned address.
+  alignas(Word::alignment / 8) Word word_{}; // an Integer<>
 };
 
 extern template class Real<Integer<16>, 11>; // IEEE half format
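
The pattern in the diff can be sketched in isolation as follows. This is a simplified illustration, not the flang source: `IntegerSketch`, `RealSketch`, `Quad`, and `demo` are hypothetical names, and the default `ALIGNMENT = BITS` is an assumption made for the sketch. The division by 8 in the diff suggests that `alignment` is expressed in bits, so the sketch converts it to bytes in the same way.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for flang's Integer<>: the value is an
// array of 32-bit parts, and the class publishes its alignment in bits.
template <int BITS, int ALIGNMENT = BITS> class IntegerSketch {
public:
  static constexpr int alignment{ALIGNMENT}; // in bits (assumed)
  constexpr std::uint32_t part(int i) const { return part_[i]; }

private:
  std::uint32_t part_[(BITS + 31) / 32]{};
};

// Hypothetical stand-in for Real<>: the data member is over-aligned using
// the bit count published by the word type, converted to bytes.
template <typename WORD> class RealSketch {
public:
  using Word = WORD;
  const Word &word() const { return word_; }

private:
  alignas(Word::alignment / 8) Word word_{};
};

// A 128-bit word, as used for REAL(16), now forces 16-byte alignment on the
// enclosing object even though its parts are only 4-byte aligned.
using Quad = RealSketch<IntegerSketch<128>>;
static_assert(alignof(IntegerSketch<128>) == alignof(std::uint32_t));
static_assert(alignof(Quad) == 16);
static_assert(sizeof(Quad) == 16);

void demo() {
  Quad q{};
  assert(reinterpret_cast<std::uintptr_t>(&q) % 16 == 0);
  assert(q.word().part(0) == 0u);
}
```

Placing `alignas` on the data member rather than on the class keeps the requirement next to the storage it protects, and the enclosing class inherits the stricter alignment automatically. Note that `alignas` may not request less than a type's natural alignment, so this sketch is only instantiated with a 128-bit word.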

@eugeneepshteyn eugeneepshteyn merged commit 45a6c02 into llvm:main Jul 21, 2025
9 checks passed
@eugeneepshteyn eugeneepshteyn deleted the fold-real-align branch July 21, 2025 20:51
mahesh-attarde pushed a commit to mahesh-attarde/llvm-project that referenced this pull request Jul 28, 2025