Skip to content

Commit 757ad43

Browse files
SC llvm teamSC llvm team
authored andcommitted
Merge upstream LLVM into amd-gfx12
2 parents 2019829 + 6ebc423 commit 757ad43

File tree

20 files changed

+485
-103
lines changed

20 files changed

+485
-103
lines changed

clang/docs/LanguageExtensions.rst

Lines changed: 9 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -138,7 +138,7 @@ for support for non-standardized features, i.e. features not prefixed ``c_``,
138138
``cxx_`` or ``objc_``.
139139

140140
Another use of ``__has_feature`` is to check for compiler features not related
141-
to the language standard, such as e.g. :doc:`AddressSanitizer
141+
to the language standard, such as :doc:`AddressSanitizer
142142
<AddressSanitizer>`.
143143

144144
If the ``-pedantic-errors`` option is given, ``__has_extension`` is equivalent
@@ -377,8 +377,8 @@ Builtin Macros
377377

378378
``__FILE_NAME__``
379379
Clang-specific extension that functions similar to ``__FILE__`` but only
380-
renders the last path component (the filename) instead of an invocation
381-
dependent full path to that file.
380+
renders the last path component (the filename) instead of an
381+
invocation-dependent full path to that file.
382382

383383
``__COUNTER__``
384384
Defined to an integer value that starts at zero and is incremented each time
@@ -716,7 +716,7 @@ See also :ref:`langext-__builtin_shufflevector`, :ref:`langext-__builtin_convert
716716
a NEON vector or an SVE vector, it's only available in C++ and uses normal bool
717717
conversions (that is, != 0).
718718
If it's an extension (OpenCL) vector, it's only available in C and OpenCL C.
719-
And it selects base on signedness of the condition operands (OpenCL v1.1 s6.3.9).
719+
And it selects based on signedness of the condition operands (OpenCL v1.1 s6.3.9).
720720
.. [#] sizeof can only be used on vector length specific SVE types.
721721
.. [#] Clang does not allow the address of an element to be taken while GCC
722722
allows this. This is intentional for vectors with a boolean element type and
@@ -857,7 +857,7 @@ Each builtin returns a scalar equivalent to applying the specified
857857
operation(x, y) as recursive even-odd pairwise reduction to all vector
858858
elements. ``operation(x, y)`` is repeatedly applied to each non-overlapping
859859
even-odd element pair with indices ``i * 2`` and ``i * 2 + 1`` with
860-
``i in [0, Number of elements / 2)``. If the numbers of elements is not a
860+
``i in [0, Number of elements / 2)``. If the number of elements is not a
861861
power of 2, the vector is widened with neutral elements for the reduction
862862
at the end to the next power of 2.
863863

@@ -1491,7 +1491,7 @@ C++14 digit separators
14911491

14921492
Use ``__cpp_digit_separators`` to determine if support for digit separators
14931493
using single quotes (for instance, ``10'000``) is enabled. At this time, there
1494-
is no corresponding ``__has_feature`` name
1494+
is no corresponding ``__has_feature`` name.
14951495

14961496
C++14 generalized lambda capture
14971497
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@@ -1545,7 +1545,7 @@ C++ type aware allocators
15451545
^^^^^^^^^^^^^^^^^^^^^^^^^
15461546

15471547
Use ``__has_extension(cxx_type_aware_allocators)`` to determine the existence of
1548-
support for the future C++2d type aware allocator feature. For full details see
1548+
support for the future C++2d type aware allocator feature. For full details, see
15491549
:doc:`C++ Type Aware Allocators <CXXTypeAwareAllocators>` for additional details.
15501550

15511551
C11
@@ -1643,7 +1643,7 @@ Modules
16431643
Use ``__has_feature(modules)`` to determine if Modules have been enabled.
16441644
For example, compiling code with ``-fmodules`` enables the use of Modules.
16451645

1646-
More information could be found `here <https://clang.llvm.org/docs/Modules.html>`_.
1646+
More information can be found `here <https://clang.llvm.org/docs/Modules.html>`_.
16471647

16481648
Language Extensions Back-ported to Previous Standards
16491649
=====================================================
@@ -1878,7 +1878,7 @@ The following type trait primitives are supported by Clang. Those traits marked
18781878
C++26 relocatable types, and types which
18791879
were made trivially relocatable via the ``clang::trivial_abi`` attribute.
18801880
This trait is deprecated and should be replaced by
1881-
``__builtin_is_cpp_trivially_relocatable``. Note however that it is generally
1881+
``__builtin_is_cpp_trivially_relocatable``. Note, however, that it is generally
18821882
unsafe to relocate a C++-relocatable type with ``memcpy`` or ``memmove``;
18831883
use ``__builtin_trivially_relocate``.
18841884
* ``__builtin_is_cpp_trivially_relocatable`` (C++): Returns true if an object

clang/unittests/Format/FormatTest.cpp

Lines changed: 10 additions & 10 deletions
Original file line numberDiff line numberDiff line change
@@ -3185,7 +3185,7 @@ TEST_F(FormatTest, FormatsLabels) {
31853185
// The opening brace may either be on the same unwrapped line as the colon or
31863186
// on a separate one. The formatter should recognize both.
31873187
Style = getLLVMStyle();
3188-
Style.BreakBeforeBraces = FormatStyle::BraceBreakingStyle::BS_Allman;
3188+
Style.BreakBeforeBraces = FormatStyle::BS_Allman;
31893189
verifyFormat("{\n"
31903190
" some_code();\n"
31913191
"test_label:\n"
@@ -3206,7 +3206,7 @@ TEST_F(FormatTest, FormatsLabels) {
32063206

32073207
TEST_F(FormatTest, MultiLineControlStatements) {
32083208
FormatStyle Style = getLLVMStyleWithColumns(20);
3209-
Style.BreakBeforeBraces = FormatStyle::BraceBreakingStyle::BS_Custom;
3209+
Style.BreakBeforeBraces = FormatStyle::BS_Custom;
32103210
Style.BraceWrapping.AfterControlStatement = FormatStyle::BWACS_MultiLine;
32113211
// Short lines should keep opening brace on same line.
32123212
verifyFormat("if (foo) {\n"
@@ -3441,7 +3441,7 @@ TEST_F(FormatTest, MultiLineControlStatements) {
34413441

34423442
TEST_F(FormatTest, BeforeWhile) {
34433443
FormatStyle Style = getLLVMStyle();
3444-
Style.BreakBeforeBraces = FormatStyle::BraceBreakingStyle::BS_Custom;
3444+
Style.BreakBeforeBraces = FormatStyle::BS_Custom;
34453445

34463446
verifyFormat("do {\n"
34473447
" foo();\n"
@@ -23779,7 +23779,7 @@ TEST_F(FormatTest, FormatsLambdas) {
2377923779
LLVMWithBeforeLambdaBody.BreakBeforeBraces = FormatStyle::BS_Custom;
2378023780
LLVMWithBeforeLambdaBody.BraceWrapping.BeforeLambdaBody = true;
2378123781
LLVMWithBeforeLambdaBody.AllowShortLambdasOnASingleLine =
23782-
FormatStyle::ShortLambdaStyle::SLS_None;
23782+
FormatStyle::SLS_None;
2378323783
verifyFormat("FctWithOneNestedLambdaInline_SLS_None(\n"
2378423784
" []()\n"
2378523785
" {\n"
@@ -23815,7 +23815,7 @@ TEST_F(FormatTest, FormatsLambdas) {
2381523815
LLVMWithBeforeLambdaBody);
2381623816

2381723817
LLVMWithBeforeLambdaBody.AllowShortLambdasOnASingleLine =
23818-
FormatStyle::ShortLambdaStyle::SLS_Empty;
23818+
FormatStyle::SLS_Empty;
2381923819
verifyFormat("FctWithOneNestedLambdaInline_SLS_Empty(\n"
2382023820
" []()\n"
2382123821
" {\n"
@@ -23862,7 +23862,7 @@ TEST_F(FormatTest, FormatsLambdas) {
2386223862
LLVMWithBeforeLambdaBody);
2386323863

2386423864
LLVMWithBeforeLambdaBody.AllowShortLambdasOnASingleLine =
23865-
FormatStyle::ShortLambdaStyle::SLS_Inline;
23865+
FormatStyle::SLS_Inline;
2386623866
verifyFormat("FctWithOneNestedLambdaInline_SLS_Inline([]() { return 17; });",
2386723867
LLVMWithBeforeLambdaBody);
2386823868
verifyFormat("FctWithOneNestedLambdaEmpty_SLS_Inline([]() {});",
@@ -23893,7 +23893,7 @@ TEST_F(FormatTest, FormatsLambdas) {
2389323893
LLVMWithBeforeLambdaBody);
2389423894

2389523895
LLVMWithBeforeLambdaBody.AllowShortLambdasOnASingleLine =
23896-
FormatStyle::ShortLambdaStyle::SLS_All;
23896+
FormatStyle::SLS_All;
2389723897
verifyFormat("FctWithOneNestedLambdaInline_SLS_All([]() { return 17; });",
2389823898
LLVMWithBeforeLambdaBody);
2389923899
verifyFormat("FctWithOneNestedLambdaEmpty_SLS_All([]() {});",
@@ -24025,7 +24025,7 @@ TEST_F(FormatTest, FormatsLambdas) {
2402524025
LLVMWithBeforeLambdaBody);
2402624026

2402724027
LLVMWithBeforeLambdaBody.AllowShortLambdasOnASingleLine =
24028-
FormatStyle::ShortLambdaStyle::SLS_None;
24028+
FormatStyle::SLS_None;
2402924029

2403024030
verifyFormat("auto select = [this]() -> const Library::Object *\n"
2403124031
"{\n"
@@ -24273,7 +24273,7 @@ TEST_F(FormatTest, LambdaWithLineComments) {
2427324273
LLVMWithBeforeLambdaBody.BreakBeforeBraces = FormatStyle::BS_Custom;
2427424274
LLVMWithBeforeLambdaBody.BraceWrapping.BeforeLambdaBody = true;
2427524275
LLVMWithBeforeLambdaBody.AllowShortLambdasOnASingleLine =
24276-
FormatStyle::ShortLambdaStyle::SLS_All;
24276+
FormatStyle::SLS_All;
2427724277

2427824278
verifyFormat("auto k = []() { return; }", LLVMWithBeforeLambdaBody);
2427924279
verifyFormat("auto k = []() // comment\n"
@@ -28371,7 +28371,7 @@ TEST_F(FormatTest, BreakAfterAttributes) {
2837128371
"Foo &operator-(Foo &);",
2837228372
Style);
2837328373

28374-
Style.ReferenceAlignment = FormatStyle::ReferenceAlignmentStyle::RAS_Left;
28374+
Style.ReferenceAlignment = FormatStyle::RAS_Left;
2837528375
verifyFormat("[[nodiscard]]\n"
2837628376
"Foo& operator-(Foo&);",
2837728377
Style);

clang/unittests/Format/FormatTestSelective.cpp

Lines changed: 2 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -689,8 +689,7 @@ TEST_F(FormatTestSelective, FormatMacroRegardlessOfPreviousIndent) {
689689
" }};", // Ditto: Bug?
690690
format(Code, 57, 0));
691691

692-
Style.IndentPPDirectives =
693-
FormatStyle::PPDirectiveIndentStyle::PPDIS_BeforeHash;
692+
Style.IndentPPDirectives = FormatStyle::PPDIS_BeforeHash;
694693
EXPECT_EQ(" class Foo {\n"
695694
" void test() {\n"
696695
" #ifdef 1\n"
@@ -699,8 +698,7 @@ TEST_F(FormatTestSelective, FormatMacroRegardlessOfPreviousIndent) {
699698
" }};",
700699
format(Code, 57, 0));
701700

702-
Style.IndentPPDirectives =
703-
FormatStyle::PPDirectiveIndentStyle::PPDIS_AfterHash;
701+
Style.IndentPPDirectives = FormatStyle::PPDIS_AfterHash;
704702
EXPECT_EQ(" class Foo {\n"
705703
" void test() {\n"
706704
" #ifdef 1\n"

llvm/include/llvm/IR/IntrinsicsNVVM.td

Lines changed: 65 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -331,6 +331,11 @@ class WMMA_REGS<string Geom, string Frag, string PtxEltType> {
331331
!eq(gf,"m8n16:x2") : !listsplat(llvm_i32_ty, 2),
332332
!eq(gf,"m8n16:x4") : !listsplat(llvm_i32_ty, 4),
333333

334+
// stmatrix b8 -> s32 @ m16n8
335+
!eq(gf,"m16n8:x1") : !listsplat(llvm_i32_ty, 1),
336+
!eq(gf,"m16n8:x2") : !listsplat(llvm_i32_ty, 2),
337+
!eq(gf,"m16n8:x4") : !listsplat(llvm_i32_ty, 4),
338+
334339
);
335340
}
336341

@@ -403,6 +408,17 @@ class LDMATRIX_NAME<WMMA_REGS Frag, int Trans> {
403408
!subst("llvm.", "int_", intr));
404409
}
405410

411+
class STMATRIX_NAME<WMMA_REGS Frag, int Trans> {
412+
string intr = "llvm.nvvm.stmatrix.sync.aligned"
413+
# "." # Frag.geom
414+
# "." # Frag.frag
415+
# !if(Trans, ".trans", "")
416+
# "." # Frag.ptx_elt_type
417+
;
418+
string record = !subst(".", "_",
419+
!subst("llvm.", "int_", intr));
420+
}
421+
406422
// Generates list of 4-tuples of WMMA_REGS representing a valid MMA op.
407423
// Geom: list of supported geometries.
408424
// TypeN: PTX type of the corresponding fragment's element.
@@ -443,6 +459,16 @@ class LDMATRIX_OPS<list<string> Geom, list<string> Frags, list<string> Types> {
443459
list<string> ops = !foreach(x, ret, x.gft);
444460
}
445461

462+
class STMATRIX_OPS<list<string> Geom, list<string> Frags, list<string> Types> {
463+
list<WMMA_REGS> ret =
464+
!foldl([]<WMMA_REGS>, Geom, t1, geom, !listconcat(t1,
465+
!foldl([]<WMMA_REGS>, Frags, t2, frag, !listconcat(t2,
466+
!foldl([]<WMMA_REGS>, Types, t3, type, !listconcat(t3,
467+
[WMMA_REGS<geom, frag, type>]))))));
468+
// Debugging aid for readable representation of the list above.
469+
list<string> ops = !foreach(x, ret, x.gft);
470+
}
471+
446472
// Creates list of valid combinations of fragments. This is the main list that
447473
// drives generation of corresponding intrinsics and instructions.
448474
class NVVM_MMA_OPS {
@@ -537,9 +563,18 @@ class NVVM_MMA_OPS {
537563
list<WMMA_REGS> ldmatrix_geom_m8n16_ops = LDMATRIX_OPS<
538564
["m8n16"], ["x1", "x2", "x4"], ["b8x16.b6x16_p32", "b8x16.b4x16_p64"]>.ret;
539565

566+
list<WMMA_REGS> stmatrix_b16_ops = STMATRIX_OPS<
567+
["m8n8"], ["x1", "x2", "x4"], ["b16"]>.ret;
568+
569+
list<WMMA_REGS> stmatrix_b8_ops = STMATRIX_OPS<
570+
["m16n8"], ["x1", "x2", "x4"], ["b8"]>.ret;
571+
540572
list<WMMA_REGS> all_ldmatrix_ops = !listconcat(ldmatrix_b16_ops,
541573
ldmatrix_geom_m16n16_ops,
542574
ldmatrix_geom_m8n16_ops);
575+
576+
list<WMMA_REGS> all_stmatrix_ops = !listconcat(stmatrix_b16_ops,
577+
stmatrix_b8_ops);
543578
}
544579

545580
def NVVM_MMA_OPS : NVVM_MMA_OPS;
@@ -680,6 +715,19 @@ class NVVM_LDMATRIX_SUPPORTED<WMMA_REGS frag, bit trans> {
680715
);
681716
}
682717

718+
// Returns true if the fragment is valid for stmatrix ops is supported;
719+
// false otherwise.
720+
class NVVM_STMATRIX_SUPPORTED<WMMA_REGS frag, bit trans> {
721+
string g = frag.geom;
722+
string t = frag.ptx_elt_type;
723+
724+
bit ret = !cond(
725+
!and(!eq(g, "m8n8"), !eq(t, "b16")): true,
726+
!and(!eq(g, "m16n8"), !eq(t, "b8"), !eq(trans, 1)): true,
727+
true: false
728+
);
729+
}
730+
683731
class SHFL_INFO<bit sync, string mode, string type, bit return_pred> {
684732
string Suffix = !if(sync, "sync_", "")
685733
# mode # "_"
@@ -1969,6 +2017,23 @@ foreach transposed = [0, 1] in {
19692017
}
19702018
}
19712019

2020+
// STMATRIX
2021+
class NVVM_STMATRIX<WMMA_REGS Frag, int Transposed>
2022+
: Intrinsic<[],
2023+
!listconcat([llvm_anyptr_ty], Frag.regs),
2024+
[IntrWriteMem, IntrArgMemOnly, IntrNoCallback,
2025+
WriteOnly<ArgIndex<0>>, NoCapture<ArgIndex<0>>],
2026+
STMATRIX_NAME<Frag, Transposed>.intr>;
2027+
2028+
foreach transposed = [0, 1] in {
2029+
foreach frag = NVVM_MMA_OPS.all_stmatrix_ops in {
2030+
if NVVM_STMATRIX_SUPPORTED<frag, transposed>.ret then {
2031+
def STMATRIX_NAME<frag, transposed>.record
2032+
: NVVM_STMATRIX<frag, transposed>;
2033+
}
2034+
}
2035+
}
2036+
19722037
// MAPA
19732038
let IntrProperties = [IntrNoMem, IntrSpeculatable, NoCapture<ArgIndex<0>>] in {
19742039
def int_nvvm_mapa

llvm/include/llvm/MC/MCObjectStreamer.h

Lines changed: 7 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -80,6 +80,13 @@ class MCObjectStreamer : public MCStreamer {
8080
/// \name MCStreamer Interface
8181
/// @{
8282

83+
// Add a fragment with a variable-size tail and start a new empty fragment.
84+
void insert(MCFragment *F);
85+
86+
void addFixup(const MCExpr *Value, MCFixupKind Kind, uint32_t Offset = 0);
87+
// Add a new fragment to the current section without a variable-size tail.
88+
void newFragment();
89+
8390
void emitLabel(MCSymbol *Symbol, SMLoc Loc = SMLoc()) override;
8491
virtual void emitLabelAtPos(MCSymbol *Symbol, SMLoc Loc, MCFragment &F,
8592
uint64_t Offset);

llvm/include/llvm/MC/MCStreamer.h

Lines changed: 3 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -259,6 +259,8 @@ class LLVM_ABI MCStreamer {
259259
bool AllowAutoPadding = false;
260260

261261
protected:
262+
bool IsObj = false;
263+
262264
// Symbol of the current epilog for which we are processing SEH directives.
263265
WinEH::FrameInfo::Epilog *CurrentWinEpilog = nullptr;
264266

@@ -310,6 +312,7 @@ class LLVM_ABI MCStreamer {
310312
virtual void reset();
311313

312314
MCContext &getContext() const { return Context; }
315+
bool isObj() const { return IsObj; }
313316

314317
// MCObjectStreamer has an MCAssembler and allows more expression folding at
315318
// parse time.
@@ -462,11 +465,6 @@ class LLVM_ABI MCStreamer {
462465

463466
MCSymbol *endSection(MCSection *Section);
464467

465-
/// Add a new fragment to the current section without a variable-size tail.
466-
void newFragment();
467-
/// Add a fragment with a variable-size tail and start a new empty fragment.
468-
void insert(MCFragment *F);
469-
470468
/// Returns the mnemonic for \p MI, if the streamer has access to a
471469
/// instruction printer and returns an empty string otherwise.
472470
virtual StringRef getMnemonic(const MCInst &MI) const { return ""; }

llvm/lib/MC/MCObjectStreamer.cpp

Lines changed: 18 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,7 @@ MCObjectStreamer::MCObjectStreamer(MCContext &Context,
3333
Context, std::move(TAB), std::move(Emitter), std::move(OW))),
3434
EmitEHFrame(true), EmitDebugFrame(false) {
3535
assert(Assembler->getBackendPtr() && Assembler->getEmitterPtr());
36+
IsObj = true;
3637
setAllowAutoPadding(Assembler->getBackend().allowAutoPadding());
3738
if (Context.getTargetOptions() && Context.getTargetOptions()->MCRelaxAll)
3839
Assembler->setRelaxAll(true);
@@ -46,6 +47,23 @@ MCAssembler *MCObjectStreamer::getAssemblerPtr() {
4647
return nullptr;
4748
}
4849

50+
void MCObjectStreamer::newFragment() {
51+
addFragment(getContext().allocFragment<MCFragment>());
52+
}
53+
54+
void MCObjectStreamer::insert(MCFragment *F) {
55+
assert(F->getKind() != MCFragment::FT_Data &&
56+
"F should have a variable-size tail");
57+
addFragment(F);
58+
newFragment();
59+
}
60+
61+
void MCObjectStreamer::addFixup(const MCExpr *Value, MCFixupKind Kind,
62+
uint32_t Offset) {
63+
CurFrag->addFixup(
64+
MCFixup::create(CurFrag->getFixedSize() + Offset, Value, Kind));
65+
}
66+
4967
// As a compile-time optimization, avoid allocating and evaluating an MCExpr
5068
// tree for (Hi - Lo) when Hi and Lo are offsets into the same fragment's fixed
5169
// part.

llvm/lib/MC/MCParser/MCTargetAsmParser.cpp

Lines changed: 4 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -8,8 +8,8 @@
88

99
#include "llvm/MC/MCParser/MCTargetAsmParser.h"
1010
#include "llvm/MC/MCContext.h"
11+
#include "llvm/MC/MCObjectStreamer.h"
1112
#include "llvm/MC/MCRegister.h"
12-
#include "llvm/MC/MCStreamer.h"
1313

1414
using namespace llvm;
1515

@@ -25,8 +25,9 @@ MCSubtargetInfo &MCTargetAsmParser::copySTI() {
2525
STI = &STICopy;
2626
// The returned STI will likely be modified. Create a new fragment to prevent
2727
// mixing STI values within a fragment.
28-
if (getStreamer().getCurrentFragment())
29-
getStreamer().newFragment();
28+
auto &S = getStreamer();
29+
if (S.isObj() && S.getCurrentFragment())
30+
static_cast<MCObjectStreamer &>(S).newFragment();
3031
return STICopy;
3132
}
3233

0 commit comments

Comments
 (0)