Closed
Changes from 3 commits
113 commits
f186964
[CIR][X86] Implement lowering for AVX512 mask builtins (kadd, kand, k…
GeneraluseAI Nov 22, 2025
6fe75b0
Merge branch 'main' into cir_x86_avx512_mask_builtin_lowering
GeneraluseAI Nov 26, 2025
472e6d1
Merge branch 'main' into cir_x86_avx512_mask_builtin_lowering
GeneraluseAI Nov 26, 2025
4cdfc08
[Flang-rt] Remove COMPILE_ONLY from flang-rt CMake file. (#169534)
DominikAdamski Nov 26, 2025
637840d
MC: Remove unneeded parameter `MCAsmBackend *`. NFC
MaskRay Nov 26, 2025
1277c9a
[clang][bytecode][NFC] Make Program::getNativePointer() const (#169502)
tbaederr Nov 26, 2025
66ef94b
[lldb][NFC] Fix incorrect comments in TestArm64InstEmulation
felipepiovezan Nov 25, 2025
0b0378b
[RISCV] Remove intrinsic declarations in tests, NFC (#167474)
jacquesguan Nov 26, 2025
eed70d5
[AArch64] Add vector tests for add(trunc(shift))
davemgreen Nov 26, 2025
e4fac9d
[mlir][tensor] Add new builders for insert_slice/extract_slice Ops (n…
banach-space Nov 26, 2025
fbb8328
[clang][Sema] Merge Check[Sizeless]VectorConditionalTypes implementat…
MacDue Nov 26, 2025
a727af6
[clang][bytecode][NFC] Remove unused Integral range functions (#169508)
tbaederr Nov 26, 2025
6de0b1e
[OpenMP][flang] Add initial support for by-ref reductions on the GPU …
ergawy Nov 26, 2025
e04c71f
[LifetimeSafety] Move GSL pointer/owner type detection to LifetimeAnn…
usx95 Nov 26, 2025
54dc073
[LoopCacheAnalysis] Fix crash after #164798 (#169486)
kasuga-fj Nov 26, 2025
434ca33
[LV][NFC] Remove remaining uses of undef in tests (#169357)
david-arm Nov 26, 2025
588973b
[SPIRV] Improve Logical SPIR-V Pointer Access and GEP Legalization (#…
s-perron Nov 26, 2025
a80de09
[VPlan] Use DL index type consistently for GEPs (#169396)
artagnon Nov 26, 2025
9d0ec0b
[clang][DebugInfo] Add call site debug info flag (#169574)
jryans Nov 26, 2025
5d621f1
Reland "[clang] Refactor to remove clangDriver dependency from clangF…
naveen-seth Nov 26, 2025
748f2be
[dwarf] make dwarf fission compatible with RISCV relaxations 2/2 (#16…
daniilavdeev Nov 26, 2025
ac4827e
Reland "[clang][Driver] Support for the SPIR-V backend when compiling…
mgcarrasco Nov 26, 2025
b8f4874
[Delinearization] Remove tryDelinearizeFixedSizeImpl (#169046)
kasuga-fj Nov 26, 2025
d6c9cd1
Reland: [GPUToXeVMPipeline][Pipeline] Modify pipeline to add `convert…
mshahneo Nov 26, 2025
e0fa1d5
[HIP] Perform implicit pointer cast when compiling HIP, not when -fcu…
jmmartinez Nov 26, 2025
fed8dc7
[gn build] Port d090311aa7df
llvmgnsyncbot Nov 26, 2025
7b40dfc
[VPlan] Hoist predicated loads with complementary masks. (#168373)
fhahn Nov 26, 2025
1576ca9
[Support] Add getAllocTokenModeAsString() helper (#169650)
melver Nov 26, 2025
570edcd
[AArch64] Combine vector add(trunc(shift)) (#169523)
davemgreen Nov 26, 2025
8496184
[CIR] Add missing switch cases for AO__scoped_atomic_uinc/udec_wrap i…
wenju-he Nov 26, 2025
1fce2e2
[BOLT] Fix assertion test (#169635)
bgergely0 Nov 26, 2025
f3d904e
[OpenMP] Add docs for fb_nullify/fb_preserve (#169558)
zahiraam Nov 26, 2025
50e54e2
opt: Try to respect target-abi command line option (#169604)
arsenm Nov 26, 2025
b85b5fa
[SPIRV] Enable DCE in instruction selection and update tests (#168428)
s-perron Nov 26, 2025
76bf8f2
CodeGen: Make all targets override pseudos with pointers (#159881)
arsenm Nov 26, 2025
d737da4
[tysan] Type Sanitizer support for SystemZ (#162396)
anoopkg6 Nov 26, 2025
e09df95
[SPIRV] Support Peeled Array Layouts for HLSL CBuffers (#169078)
s-perron Nov 26, 2025
508251f
[SPIRV] Use OpCopyMemory for logical SPIRV memcpy (#169348)
s-perron Nov 26, 2025
5563c93
CodeGen: Make target overrides of PointerLikeRegClass mandatory (#159…
arsenm Nov 26, 2025
e2e7dcd
[scudo] Add scudo_standalone support for SystemZ (#166187)
anoopkg6 Nov 26, 2025
93ea77d
[mlir][xegpu] Add layout based SIMT distribution support for `vector.…
charithaintc Nov 26, 2025
0beb70a
[MC][RISCV] Add missing Predicates for NDS_FMV_BF16_X (#169662)
sunshaoce Nov 26, 2025
a06d561
[LV] Use VPReductionRecipe for partial reductions (#147513)
SamTebbs33 Nov 26, 2025
c78c419
CodeGen: Remove PointerLikeRegClass handling from codegen (#159883)
arsenm Nov 26, 2025
18dc623
RuntimeLibcalls: Add malloc and free entries (#167081)
arsenm Nov 26, 2025
505a47b
RuntimeLibcalls: Add more function entries from TargetLibraryInfo (#1…
arsenm Nov 26, 2025
6f0158e
RuntimeLibcalls: Add memset_pattern* calls to darwin systems (#167083)
arsenm Nov 26, 2025
d3dea6e
Revert [Driver] Error for -gsplit-dwarf with RISC-V linker relaxation…
daniilavdeev Nov 26, 2025
d49fdc9
[AArch64] Enable maximising scalable vector bandwidth (#166748)
SamTebbs33 Nov 26, 2025
583ef8c
[HLSL] Remove `faceforward` SPIRV fast path (#169547)
kmpeng Nov 26, 2025
8cb359f
[BOLT][BTI] Add MCPlusBuilder::updateBTIVariant (#167308)
bgergely0 Nov 26, 2025
3008fa9
[NFC][PowerPC] Merge ppc64 encoding error tests (#169669)
lei137 Nov 26, 2025
3f61fe2
[IndVarSimplify] Fix `IndVarSimplify` to skip unfolding predicates wh…
luciechoi Nov 26, 2025
e80166c
[OpenMP][clang] Register vtables on device for indirect calls runtim…
Jason-VanBeusekom Nov 26, 2025
abb73c2
[Clang] Allow AVX/AVX512 subvector shuffles in constexpr (#168700)
mooori Nov 26, 2025
43d0f31
[CIR] Upstream Builtin Exp2Op (#169152)
FantasqueX Nov 26, 2025
4ff219b
Move static test variable into the #if that uses it (#169695)
bogner Nov 26, 2025
46e7381
[Clang] Fix false positive -Wignored-qualifiers (#169664)
cor3ntin Nov 26, 2025
24df068
[CIR] CountOf VLA with Array element type (#169404)
AmrDeveloper Nov 26, 2025
439150a
[CIR][NFC] Cleanup builtin helper function interfaces (#169586)
andykaylor Nov 26, 2025
944f54e
[lldb-dap] Add multi-session support with shared debugger instances (…
qxy11 Nov 26, 2025
f77aa7f
[flang][OpenMP] Remove unused #include "dump-parse-tree.h", NFC (#169…
kparzysz Nov 26, 2025
b1a649f
[lldb] Fix reading 32-bit signed integers (#169150)
igorkudrin Nov 26, 2025
5e9bc54
[SLP][NFC]Add a test with single op inst, used in many nodes, NFC.
alexey-bataev Nov 26, 2025
a7dcceb
[ROCDL] Added missing `cluster.load.async.to.lds` op (gfx1250) (#169042)
ravil-mobile Nov 26, 2025
dc8951c
Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-…
fhahn Nov 26, 2025
a09fb96
[lldb] [test-suite] fix typo in variable in darwin builder (#169254)
n2h9 Nov 26, 2025
f0ef0f6
[lldb] [scripting bridge] 167388 chore: add api to return arch name f…
n2h9 Nov 26, 2025
27232f4
Revert "Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs wh…
fhahn Nov 26, 2025
c3e9988
[mlir][amdgpu] Add make_dma_base operation (#169086)
amd-eochoalo Nov 26, 2025
091929c
[CIR][NFC] Fix build problem inside an assert (#169715)
andykaylor Nov 26, 2025
7359388
Reapply "[LV] Use ExtractLane(LastActiveLane, V) live outs when tail-…
fhahn Nov 26, 2025
dc0d1e6
[X86] addcarry.ll - add test coverage for #169691 (#169716)
RKSimon Nov 26, 2025
a62ea2c
[libc++][flat_map] Applied `[[nodiscard]]` (#169453)
H-G-Hristov Nov 26, 2025
0fb94f6
[libc++] Applied `[[nodiscard]]` to Language Support (partially) (#16…
H-G-Hristov Nov 26, 2025
313e5ea
[clang-format] Add xxxMaxDigitsNoSeparator (#164286)
HazardyKnusperkeks Nov 26, 2025
2afb9dd
CodeGen: Optionally emit PAuth relocations as IRELATIVE relocations.
pcc Nov 26, 2025
d06c53f
Add IR and codegen support for deactivation symbols.
pcc Nov 26, 2025
48423ee
Add deactivation symbol operand to ConstantPtrAuth.
pcc Nov 26, 2025
4ce0ec7
Revert "[tysan] Type Sanitizer support for SystemZ" (#169726)
uweigand Nov 26, 2025
b398d0a
[bazel] Fix build after #169086 (#169725)
boomanaiden154 Nov 26, 2025
30809a7
[flang][cuda][rt] Add entry point to get the allocation stream (#169608)
clementval Nov 26, 2025
a47812e
[SystemZ] Emit optional argument area length field (#169679)
redstar Nov 26, 2025
dba25ec
[SPIRV] Fix a warning
kazutakahirata Nov 26, 2025
607872a
[mlir][acc] Introduce ACCImplicitDeclare pass for globals handling (#…
razvanlupusoru Nov 26, 2025
13b2094
[clang][Driver] Use -no-canonical-prefixes in hip-spirv-backend-opt t…
boomanaiden154 Nov 26, 2025
7498a2b
[libc++] Applied `[[nodiscard]]` to concurrency (partially) (#169463)
H-G-Hristov Nov 26, 2025
f4afbcf
[SLP][NFC]Add another test with the user with multiple copyable opera…
alexey-bataev Nov 26, 2025
44ac252
[CIR] Add undef handling to enable global lambdas (#169721)
andykaylor Nov 26, 2025
3049ce7
Fix sanitizer failure introduced by #133537
pcc Nov 26, 2025
034bbe0
[CIR][NFC] Move builtin tests to their own directory (#169737)
andykaylor Nov 26, 2025
5b32908
[lld][MachO] Follow-up to use madvise() for threaded file page-in. (#…
johnno1962 Nov 26, 2025
faa9601
github-upload-release.py: add requirements and lock files for install…
nightlark Nov 26, 2025
18ef269
[llvm-objdump] Optimize live element tracking (#158763)
gulfemsavrun Nov 27, 2025
1fb5be9
[ORC] Clear stale ElemToPendingSN entries in WaitingOnGraph. (#169747)
lhames Nov 27, 2025
58393bc
[lldb] Use InlHostByteOrder in RegisterValue::SetValueFromData (#169624)
sedymrak Nov 27, 2025
e626624
[UBSan] Use -fsanitize-handler-preserve-all-regs in codegen
fmayer Nov 20, 2025
464163d
[AMDGPU] Remove unused functions isSigned. NFC (#169750)
tclin914 Nov 27, 2025
fcf60ef
[mlir][dataflow] Add arguemnt print for test-liveness-analysis (#169625)
linuxlonelyeagle Nov 27, 2025
fe467e6
[LoadStoreVectorizer] Fix one-element vector handling (#169671)
cmc-rep Nov 27, 2025
bc09d91
[libc++][queue] Applied `[[nodiscard]]` (#169469)
H-G-Hristov Nov 27, 2025
bb8b3bf
[flang] Use default constructor for FIRToSCF pass (#169741)
clementval Nov 27, 2025
c8f6bd3
[mlir][Transforms] Dialect conversion: Add support for `replaceUsesWi…
matthias-springer Nov 27, 2025
dea6173
[libc++][mdspan] Applied `[[nodiscard]]` (#169326)
H-G-Hristov Nov 27, 2025
0521408
[lldb-dap] Add breakpoints after debugger initialization in DExTer (#…
qxy11 Nov 27, 2025
eef0b21
Add missing freeConstants() call for ConstantPtrAuths.
pcc Nov 27, 2025
ed9c9aa
[ReplaceConstant] Don't create instructions for the same constant mul…
shiltian Nov 27, 2025
d96c019
[MLIR][NVVM] Add missing rounding modes in fp16x2 conversions (#169005)
Wolfram70 Nov 27, 2025
c67bb87
[MLIR][Intrinsics] Add new MLIR API to automatically resolve overload…
rajatbajpai Nov 27, 2025
efee37a
[clang][Tooling] Fix `getFileRange` returning a range spanning across…
tJener Nov 27, 2025
b212448
[mlir][LLVMIR] Handle missing functions in CGProfile module flags (#1…
Men-cotton Nov 27, 2025
1354f82
[clang][bytecode] Remove double diagnostic emission (#169658)
tbaederr Nov 27, 2025
926e017
[CIR][X86] Implement lowering for AVX512 mask builtins
GeneraluseAI Nov 27, 2025
61 changes: 59 additions & 2 deletions clang/lib/CIR/CodeGen/CIRGenBuiltinX86.cpp
@@ -90,6 +90,37 @@ static mlir::Value getMaskVecValue(CIRGenFunction &cgf, const CallExpr *expr,
return maskVec;
}

static mlir::Value emitX86MaskAddLogic(CIRGenFunction &cgf,
const CallExpr *expr,
Contributor


Suggested change
-static mlir::Value emitX86MaskAddLogic(CIRGenFunction &cgf,
-                                       const CallExpr *expr,
+static mlir::Value emitX86MaskAddLogic(CIRGenBuilderTy &builder,
+                                       mlir::Location loc,

I've updated the signatures of the helper functions so we don't need to pass cgf and expr around so many places.

                                       const std::string &intrinsicName,
                                       SmallVectorImpl<mlir::Value> &ops) {
  CIRGenBuilderTy &builder = cgf.getBuilder();
  auto intTy = cast<cir::IntType>(ops[0].getType());
  unsigned numElts = intTy.getWidth();
  mlir::Value lhsVec = getMaskVecValue(cgf, expr, ops[0], numElts);
  mlir::Value rhsVec = getMaskVecValue(cgf, expr, ops[1], numElts);
  mlir::Type vecTy = lhsVec.getType();
  mlir::Value resVec = emitIntrinsicCallOp(cgf, expr, intrinsicName, vecTy,
                                           mlir::ValueRange{lhsVec, rhsVec});
  return builder.createBitcast(resVec, ops[0].getType());
}

static mlir::Value emitX86MaskLogic(CIRGenFunction &cgf, const CallExpr *expr,
                                    cir::BinOpKind binOpKind,
                                    SmallVectorImpl<mlir::Value> &ops,
                                    bool invertLHS = false) {
  CIRGenBuilderTy &builder = cgf.getBuilder();
  unsigned numElts = cast<cir::IntType>(ops[0].getType()).getWidth();
  mlir::Value lhs = getMaskVecValue(cgf, expr, ops[0], numElts);
  mlir::Value rhs = getMaskVecValue(cgf, expr, ops[1], numElts);

  if (invertLHS)
    lhs = builder.createNot(lhs);
  return builder.createBitcast(
      builder.createBinop(cgf.getLoc(expr->getExprLoc()), lhs, binOpKind, rhs),
      ops[0].getType());
}

mlir::Value CIRGenFunction::emitX86BuiltinExpr(unsigned builtinID,
const CallExpr *expr) {
if (builtinID == Builtin::BI__builtin_cpu_is) {
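
The two helpers added in the hunk above can be read against a scalar model. The following is a minimal sketch, not the CIR lowering itself: it assumes that bitcasting a mask to one-bit lanes and back leaves the bits unchanged, so each builtin reduces to plain integer arithmetic on the mask. `MaskOp`, `maskLogic`, and `maskAdd` are hypothetical names introduced only for this illustration, and the sketch is fixed to 16-bit masks (the `hi` variants).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for cir::BinOpKind (illustration only).
enum class MaskOp { And, Or, Xor };

// Scalar model of emitX86MaskLogic on a 16-bit mask: optionally invert the
// left operand, then apply the binary operation bitwise.
inline std::uint16_t maskLogic(MaskOp op, std::uint16_t lhs, std::uint16_t rhs,
                               bool invertLHS = false) {
  if (invertLHS)
    lhs = static_cast<std::uint16_t>(~lhs);
  switch (op) {
  case MaskOp::And:
    return static_cast<std::uint16_t>(lhs & rhs);
  case MaskOp::Or:
    return static_cast<std::uint16_t>(lhs | rhs);
  case MaskOp::Xor:
    return static_cast<std::uint16_t>(lhs ^ rhs);
  }
  return 0; // Unreachable for valid MaskOp values.
}

// Scalar model of kadd on a 16-bit mask: addition modulo 2^16.
inline std::uint16_t maskAdd(std::uint16_t lhs, std::uint16_t rhs) {
  return static_cast<std::uint16_t>(lhs + rhs);
}
```

Under this model, `kandn` corresponds to `maskLogic(MaskOp::And, a, b, true)` and `kxnor` to `maskLogic(MaskOp::Xor, a, b, true)`. Inverting only the left operand is enough for `kxnor` because `~a ^ b` equals `~(a ^ b)`.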
@@ -743,38 +774,64 @@ mlir::Value CIRGenFunction::emitX86BuiltinExpr(unsigned builtinID,
case X86::BI__builtin_ia32_ktestzsi:
case X86::BI__builtin_ia32_ktestcdi:
case X86::BI__builtin_ia32_ktestzdi:
cgm.errorNYI(expr->getSourceRange(),
std::string("unimplemented X86 builtin call: ") +
getContext().BuiltinInfo.getName(builtinID));
return {};
case X86::BI__builtin_ia32_kaddqi:
return emitX86MaskAddLogic(*this, expr, "x86.avx512.kadd.b", ops);
case X86::BI__builtin_ia32_kaddhi:
return emitX86MaskAddLogic(*this, expr, "x86.avx512.kadd.w", ops);
case X86::BI__builtin_ia32_kaddsi:
return emitX86MaskAddLogic(*this, expr, "x86.avx512.kadd.d", ops);
case X86::BI__builtin_ia32_kadddi:
return emitX86MaskAddLogic(*this, expr, "x86.avx512.kadd.q", ops);
case X86::BI__builtin_ia32_kandqi:
case X86::BI__builtin_ia32_kandhi:
case X86::BI__builtin_ia32_kandsi:
case X86::BI__builtin_ia32_kanddi:
return emitX86MaskLogic(*this, expr, cir::BinOpKind::And, ops);
case X86::BI__builtin_ia32_kandnqi:
case X86::BI__builtin_ia32_kandnhi:
case X86::BI__builtin_ia32_kandnsi:
case X86::BI__builtin_ia32_kandndi:
return emitX86MaskLogic(*this, expr, cir::BinOpKind::And, ops, true);
case X86::BI__builtin_ia32_korqi:
case X86::BI__builtin_ia32_korhi:
case X86::BI__builtin_ia32_korsi:
case X86::BI__builtin_ia32_kordi:
return emitX86MaskLogic(*this, expr, cir::BinOpKind::Or, ops);
case X86::BI__builtin_ia32_kxnorqi:
case X86::BI__builtin_ia32_kxnorhi:
case X86::BI__builtin_ia32_kxnorsi:
case X86::BI__builtin_ia32_kxnordi:
return emitX86MaskLogic(*this, expr, cir::BinOpKind::Xor, ops, true);
case X86::BI__builtin_ia32_kxorqi:
case X86::BI__builtin_ia32_kxorhi:
case X86::BI__builtin_ia32_kxorsi:
case X86::BI__builtin_ia32_kxordi:
return emitX86MaskLogic(*this, expr, cir::BinOpKind::Xor, ops);
case X86::BI__builtin_ia32_knotqi:
case X86::BI__builtin_ia32_knothi:
case X86::BI__builtin_ia32_knotsi:
-case X86::BI__builtin_ia32_knotdi:
+case X86::BI__builtin_ia32_knotdi: {
cir::IntType intTy = cast<cir::IntType>(ops[0].getType());
unsigned numElts = intTy.getWidth();
mlir::Value resVec = getMaskVecValue(*this, expr, ops[0], numElts);
return builder.createBitcast(builder.createNot(resVec), ops[0].getType());
}
case X86::BI__builtin_ia32_kmovb:
case X86::BI__builtin_ia32_kmovw:
case X86::BI__builtin_ia32_kmovd:
-case X86::BI__builtin_ia32_kmovq:
+case X86::BI__builtin_ia32_kmovq: {
// Bitcast to vXi1 type and then back to integer. This gets the mask
// register type into the IR, but might be optimized out depending on
// what's around it.
cir::IntType intTy = cast<cir::IntType>(ops[0].getType());
unsigned numElts = intTy.getWidth();
mlir::Value resVec = getMaskVecValue(*this, expr, ops[0], numElts);
return builder.createBitcast(resVec, ops[0].getType());
}
case X86::BI__builtin_ia32_kunpckdi:
case X86::BI__builtin_ia32_kunpcksi:
case X86::BI__builtin_ia32_kunpckhi:
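
The `knot` and `kmov` cases in this hunk both go through a round trip from integer to a vector of one-bit lanes and back. As a rough sketch of that idea outside of CIR, the code below uses `std::bitset<8>` as a stand-in for the one-bit-lane vector of an 8-bit mask. It assumes `getMaskVecValue` behaves like a plain bitcast for these widths, as the comment in the `kmov` case suggests. `toMaskVec`, `fromMaskVec`, `kmovModel`, and `knotModel` are hypothetical names used only here.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>

// View an 8-bit mask as eight one-bit lanes (model of the integer-to-vector
// bitcast).
inline std::bitset<8> toMaskVec(std::uint8_t mask) {
  return std::bitset<8>(mask);
}

// Pack the lanes back into an integer (model of the vector-to-integer
// bitcast).
inline std::uint8_t fromMaskVec(const std::bitset<8> &vec) {
  return static_cast<std::uint8_t>(vec.to_ulong());
}

// kmov model: the round trip alone, which leaves the value unchanged.
inline std::uint8_t kmovModel(std::uint8_t mask) {
  return fromMaskVec(toMaskVec(mask));
}

// knot model: invert every lane between the two bitcasts.
inline std::uint8_t knotModel(std::uint8_t mask) {
  return fromMaskVec(~toMaskVec(mask));
}
```

In this model the `kmov` round trip is an identity on the value, which matches the diff's remark that the bitcast pair "might be optimized out depending on what's around it".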