Skip to content

[HLSL] [DXIL] Implement the AddUint64 HLSL function and the UAddc DXIL op #125319

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1,959 commits into from
Closed
Show file tree
Hide file tree
Changes from 8 commits
Commits
Show all changes
1959 commits
Select commit Hold shift + click to select a range
a1ee186
[Clang] Disable failing offload test on darwin
jhuber6 Feb 9, 2025
daa1f12
[mlir][vector] Remove references to non-existing patterns (nfc)
banach-space Feb 9, 2025
06a77df
[NFC][libc++] Fixes minor issues in the synopsis.
mordante Feb 9, 2025
b819c13
[libc++][doc] Updates format status.
mordante Feb 9, 2025
11ff86e
[LV][X86] Regenerate interleaved load/store costs. NFC.
RKSimon Feb 9, 2025
1b12151
[OpenMP] Replace use of target address space with <gpuintrin.h> local…
jhuber6 Feb 9, 2025
3119116
[AST] Avoid repeated hash lookups (NFC) (#126400)
kazutakahirata Feb 9, 2025
ffcbceb
[CodeGen] Avoid repeated hash lookups (NFC) (#126403)
kazutakahirata Feb 9, 2025
fc14861
[Passes] Avoid repeated hash lookups (NFC) (#126404)
kazutakahirata Feb 9, 2025
99b2303
[TableGen] Avoid repeated hash lookups (NFC) (#126405)
kazutakahirata Feb 9, 2025
96119a9
[X86] lowerV4F64Shuffle - prefer lowerShuffleAsDecomposedShuffleMerge…
RKSimon Feb 9, 2025
138fce3
[ValueTracking] Test for not in dominating condition. (NFC)
andjo403 Feb 9, 2025
2bf83e1
[lldb] Merge TestSBCommandReturnObject tests
JDevlieghere Feb 9, 2025
4629505
Reland "[LV]: Teach LV to recursively (de)interleave." (#125094)
hassnaaHamdi Feb 9, 2025
3cb7c29
[CSKY] Default to unsigned char
arichardson Feb 9, 2025
68c434c
[Analysis] Avoid repeated hash lookups (NFC) (#126402)
kazutakahirata Feb 9, 2025
684b1f0
[Sema] Avoid repeated hash lookups (NFC) (#126428)
kazutakahirata Feb 9, 2025
ed3b459
[Serialization] Avoid repeated hash lookups (NFC) (#126429)
kazutakahirata Feb 9, 2025
de6164c
[AMDGPU] Avoid repeated hash lookups (NFC) (#126430)
kazutakahirata Feb 9, 2025
b11c69e
[AsmPrinter] Avoid repeated map lookups (NFC) (#126431)
kazutakahirata Feb 9, 2025
07fe365
[Coroutines] Avoid repeated hash lookups (NFC) (#126432)
kazutakahirata Feb 9, 2025
564a401
[TableGen] Avoid repeated hash lookups (NFC) (#126433)
kazutakahirata Feb 9, 2025
ef61517
[TableGen] Remove recursive walk of linked list from ContractNodes. NFC
topperc Feb 9, 2025
5427cc5
[NFC][AArch64] move AArch64 non auto-generated tests to static file (…
jthackray Feb 10, 2025
4e7205c
[LV] Prevent query the computeCost() when VF=1 in emitInvalidCostRema…
ElvisWang123 Feb 10, 2025
3bd3131
[AMDGPU] Don't unify divergent exit nodes with `musttail` calls (#126…
shiltian Feb 10, 2025
63c24fc
[AVX10.2] Fix wrong intrinsic names after rename (#126390)
mikolaj-pirog Feb 10, 2025
3cdbcb8
[benchmark] Sync a few commits from upstream to help with CPU count (…
brad0 Feb 10, 2025
bbb418d
[GlobalISel] Check whether `G_CTLZ` is legal in `matchUMulHToLShr` (#…
shiltian Feb 10, 2025
f54efa9
[RISCV] Improve Errors for X1/X5/X1X5 Reg Classes (#126184)
lenary Feb 10, 2025
35d2363
[BoundsSafety][doc] Fix a typo (#126247)
pfusik Feb 10, 2025
a19bd2e
[AMDGPU] - Fix non-deterministic compile issue (#126271)
dstutt Feb 10, 2025
09623c1
Revert "[mlir] Python: Parse ModuleOp from file path" (#126482)
joker-eph Feb 10, 2025
5cc672e
[AArch64] Add MATCH loops to LoopIdiomVectorizePass (#101976)
rj-jesus Feb 10, 2025
ea411bc
[Driver][HIP] Do not pass -dependency-file flag for HIP Device offloa…
lalaniket8 Feb 10, 2025
e9409c0
[LoongArch] Pre-commit tests for tls-le merge base offset. NFC (#122998)
zhaoqi5 Feb 10, 2025
32c3da0
[openmp] Fix for 32-bit PowerPC (#126412)
brad0 Feb 10, 2025
fe5259f
[ScalarEvolution] Handle addrec incoming value in isImpliedViaMerge()…
nikic Feb 10, 2025
81499c1
[AArch64] Add SUBHN patterns for xor variant (#126100)
davemgreen Feb 10, 2025
3edfa00
[SDAG] Precommit tests for #126207 (NFC) (#126208)
c-rhodes Feb 10, 2025
ad9b3ad
[DSE] Don't use initializes on byval argument (#126259)
nikic Feb 10, 2025
1d5e5f1
[mlir][scf]: Add value bound for the computed upper bound of for loop…
amirBish Feb 10, 2025
0033cd2
Revert "[LinkerWrapper] Clean up options after proper forwarding" (#1…
jplehr Feb 10, 2025
cc2e0a5
[libclc] Have all targets build all CLC functions (#124779)
frasercrmck Feb 10, 2025
0fd992f
[Clang][Driver][HIP] Do not specify explicit target cpu in host compi…
lalaniket8 Feb 10, 2025
b6f01ad
[llvm][Docs] Explain how to handle excessive formatting changes (#126…
DavidSpickett Feb 10, 2025
8179d64
[X86] LowerSelect - use BLENDV for scalar selection on all SSE41+ tar…
RKSimon Feb 10, 2025
4c3549e
[X86] canonicalizeShuffleWithOp - pull out repeated flag settings to …
RKSimon Feb 10, 2025
862cf2d
[libc][math][c23] Add asinf16() function (#124212)
wldfngrs Feb 10, 2025
03fe9b9
InstSimplify: improve computePointerICmp (NFC) (#126255)
artagnon Feb 10, 2025
2415638
[LoongArch] Merge base and offset for tls-le code sequence (#122999)
zhaoqi5 Feb 10, 2025
796b754
[RISCV][VLOPT] Precommit tests for opt info on passthrus. NFC
lukel97 Feb 10, 2025
eb62d36
[RISCV][VLOPT] Add support for Widening Floating-Point Fused Multiply…
lukel97 Feb 10, 2025
bdabea2
[MLIR][Linalg] Expose linalg.matmul and linalg.contract via Python AP…
rolfmorel Feb 10, 2025
47ecb7a
[analyzer][NFC] Remove "V2" from ArrayBoundCheckerV2.cpp (#126094)
NagyDonat Feb 10, 2025
85c6b03
[RISCV] Add cost model for fma (#126076)
mikhailramalho Feb 10, 2025
d400713
[X86] IsElementEquivalent - pull out repeated getValueType calls. NFC.
RKSimon Feb 10, 2025
8cfcb74
[AMDGPU] Only run `AMDGPUPrintfRuntimeBindingPass` at non-prelink pha…
shiltian Feb 10, 2025
c471e51
[clang][bytecode] Support partial initializers for CXXNewExprs (#126494)
tbaederr Feb 10, 2025
9423ff8
[RISCV][VLOPT] Fix passthru operand info for mixed-width instructions…
lukel97 Feb 10, 2025
67bb421
Revert "SCEV: teach isImpliedViaOperations about samesign" (#126506)
artagnon Feb 10, 2025
53f7dea
[clang] Expose -f(no-)strict-overflow as a clang-cl option (#126512)
nico Feb 10, 2025
aedb2ad
[clang] CTAD alias: Respect explicit deduction guides defined after t…
hokein Feb 10, 2025
58fbcc6
[clang][bytecode][NFC] Discard all CastExprs uniformly (#126511)
tbaederr Feb 10, 2025
2b52e4f
[XCOFF][llvm-readobj] Print symbol value kind when dumping symbols (#…
diggerlin Feb 10, 2025
9489ea4
SCEV: thread samesign in isBasicBlockEntryGuardedByCond (NFC) (#125840)
artagnon Feb 10, 2025
0b96c6b
[RISCV][VLOPT] Add support for Vector Fixed-Point Arithmetic Instruct…
lukel97 Feb 10, 2025
9d9d77c
[AST] Avoid repeated hash lookups (NFC) (#126461)
kazutakahirata Feb 10, 2025
63ab8be
[Lex] Avoid repeated hash lookups (NFC) (#126462)
kazutakahirata Feb 10, 2025
6893320
[TableGen] Avoid repeated hash lookups (NFC) (#126464)
kazutakahirata Feb 10, 2025
91b3b7b
[Analysis] Avoid repeated hash lookups (NFC) (#126465)
kazutakahirata Feb 10, 2025
dfe35d9
[Coroutines] Avoid repeated hash lookups (NFC) (#126466)
kazutakahirata Feb 10, 2025
61d3277
[llvm-profgen] Avoid repeated hash lookups (NFC) (#126467)
kazutakahirata Feb 10, 2025
c533cf9
[clang] Handle f(no-)strict-overflow, f(no-)wrapv, f(no-)wrapv-pointe…
nico Feb 10, 2025
9aa5b6f
[llvm][docs] Tweak backporting instructions a bit (#126519)
nico Feb 10, 2025
a305150
[acc][mlir] Add functionality for categorizing OpenACC variable types…
razvanlupusoru Feb 10, 2025
4acc5f4
[NFC][LoopVectorize] Add more partial reduction tests (#126525)
david-arm Feb 10, 2025
81710a2
[llvm][lit] Update regexes in Xunit test (#126527)
DavidSpickett Feb 10, 2025
99547a2
[Driver][ROCm][OpenMP] Fix default ockl linking for OpenMP. (#126186)
ampandey-1995 Feb 10, 2025
90d1c36
[NFC][StructurizeCFG] Add a test that can crash StructurizeCFG pass (…
shiltian Feb 10, 2025
8f61e7d
[TableGen][InstrInfo] Cull mapping that have not been enabled/not nee…
jurahul Feb 10, 2025
338e786
[OpenMP][OpenMPIRBuilder] Add initial changes for SPIR-V target front…
sarnex Feb 10, 2025
d9d3492
[flang] Correctly handle `!dir$ unroll` with unrolling factors of 0 a…
ashermancinelli Feb 10, 2025
1770d07
[mlir][vector]add extractInsertFoldConstantOp fold function and apply…
linuxlonelyeagle Feb 10, 2025
708065e
[LV] Forget LCSSA phi with new pred before other SCEV invalidation. (…
fhahn Feb 10, 2025
ae9586a
[mlir][tosa] Fix conv op build functions (#126321)
Tai78641 Feb 10, 2025
741b2b0
[libc][docs] Add sys/statvfs to documentation and YAML definitions (#…
StarOne01 Feb 10, 2025
da916af
[libc++] Improves type-safety in generator script. (#101880)
mordante Feb 10, 2025
3287c94
[RISCV] Improve Errors for GPRNoX0X2/SP Reg Classes (#126394)
lenary Feb 10, 2025
83bcb8d
[libc++][CI] Updates Clang HEAD version in Docker. (#126419)
mordante Feb 10, 2025
ce65cee
[NFC][TableGen] Delete `getLogicalOperandType` from InstrInfoEmitter …
jurahul Feb 10, 2025
863f4aa
[RISCV] Match widening fp instructions with same fpext used in multip…
lukel97 Feb 10, 2025
9985834
[ValueTracking] Handle not in dominating condition. (#126423)
andjo403 Feb 10, 2025
6fd2e76
[ELF] --package-metadata: support %[0-9a-fA-F][0-9a-fA-F]
MaskRay Feb 10, 2025
2e079b2
MachineCopyPropagation: Do not remove copies preserved by regmask (#1…
jsji Feb 10, 2025
4cd98d1
[ARM] Move MCStreamer::emitThumbFunc to ARMTargetStreamer
MaskRay Feb 10, 2025
cfcdc10
[MLIR][Math] Add fine-grained populate-patterns functions for math fu…
bjacob Feb 10, 2025
624c312
[clang-tidy] Address false positives in misc-redundant-expression che…
earnol Feb 10, 2025
1b5dc8d
[-Wunsafe-buffer-usage] Fix assert when constexpr size passed to snpr…
tsepez Feb 10, 2025
bc4f49b
[lld] Remove usage of `%T` in `lld/test` (#126133)
DataCorrupted Feb 10, 2025
d36e0df
[compiler-rt][windows] Test fixups for MSVC. (#109887)
barcharcraz Feb 10, 2025
2ce713c
[Clang][NFC] clang-format __has_builtin implementation (#126571)
sarnex Feb 10, 2025
f74c91c
[libc++][Github] Remove workflow-scoped write permissions (#126447)
boomanaiden154 Feb 10, 2025
46dd60d
[OMPIRBuilder][debug] Fix debug info for variables in target region. …
abidh Feb 10, 2025
08d98ee
[ELF] Add support for CREL to getSectionAndRelocations
boomanaiden154 Feb 10, 2025
3cbed80
[ELF] Add support for CREL locations for SHT_LLVM_BB_ADDR_MAP
boomanaiden154 Feb 10, 2025
282e356
[lldb][telemetry] Implement LLDB Telemetry (part 1) (#119716)
oontvoo Feb 10, 2025
6725ebc
Modify dwarfdump verification to allow sub-category counts (#125062)
youngd007 Feb 10, 2025
22beca5
[Offload] Fix the offload cache file triggering libc++ / libstdc++ mi…
jhuber6 Feb 10, 2025
82546b0
[Offload][NFC] Rename `src/` -> `libomptarget/` (#126573)
jhuber6 Feb 10, 2025
4c86e20
[Fuchsia] Disable building runtimes with LTO (#126306)
ilovepi Feb 10, 2025
5f86f9f
[MLIR] Fix LLVMIRTransforms build failure (#125485)
RoboTux Feb 10, 2025
432fc7c
[X86] Use StackArgTokenFactor for all stores when setting up tail cal…
fhahn Feb 10, 2025
3c1b8aa
Revert "Modify dwarfdump verification to allow sub-category counts (…
nico Feb 10, 2025
0b925b9
[libc++] Extract destroy algorithms into separate headers (#126449)
ldionne Feb 10, 2025
460d1b5
[NVPTX] Make ctor/dtor lowering always enabled in NVPTX (#126544)
jhuber6 Feb 10, 2025
eb530f7
[gn build] Port f332455dd9a2
llvmgnsyncbot Feb 10, 2025
46f8146
[LLD][COFF] Add support for -includeoptional on ARM64X (#126300)
cjacek Feb 10, 2025
a7b9026
[flang][acc] Ensure data exit action is generated for present & nocre…
razvanlupusoru Feb 10, 2025
f5fbf54
[HLSL] Desugar type when converting from a ConstantArrayType to an Ar…
spall Feb 10, 2025
67353dc
[llvm][GitHub] Move PR project status to Done once backport PR is mad…
tstellar Feb 10, 2025
6c963c7
[LLD][COFF] Fix a typo in REQUIRES directive (NFC)
cjacek Feb 10, 2025
d055209
Delete erroneously test not updated with dwarf verification JSON chan…
youngd007 Feb 10, 2025
23a961e
[LLD][MinGW] Add support for wrapped symbols on ARM64X (#126296)
cjacek Feb 10, 2025
3f8bc23
Rough attempt to fix lldb bazel BUILD file
dwblaikie Feb 10, 2025
5536cdd
[analyzer] Remove some false negatives in StackAddrEscapeChecker (#12…
Flandini Feb 10, 2025
24a7921
[lldb][sbapi] Namespace CommandReturnObjectCallbackResult in SBDefine…
chelcassanova Feb 10, 2025
1fe820c
[NVPTX] Remove unused static functions following #126544
jhuber6 Feb 10, 2025
b95bd97
Revert "[analyzer] Remove some false negatives in StackAddrEscapeChec…
Xazax-hun Feb 10, 2025
cd799e6
[lldb] Add a test for terminal dimensions (#126598)
JDevlieghere Feb 10, 2025
9dd96df
[lldb] [darwin] Upstream a few DriverKit cases (#126604)
jasonmolenda Feb 10, 2025
43f3479
[NFC][LLVM] Remove unused `TargetIntrinsicInfo` class (#126003)
jurahul Feb 10, 2025
56dbc47
[clang] Disaqble test/Analysis/live-stmts.cpp on aarch64
nico Feb 10, 2025
0d7ea81
[lldb] Fix two old UUID method calls in ObjectFileMachO
jasonmolenda Feb 10, 2025
169ae3a
[lldb] Assert on invalid default {S,U}Int64 (NFC) (#126590)
JDevlieghere Feb 10, 2025
1b2a822
[lldb-dap] Silence Wunused-result warning (#126580)
keith Feb 10, 2025
a6231bc
[flang] Propagate fast-math flags to FIROpBuilder. (#126316)
vzakhari Feb 10, 2025
67334a3
[lldb] Fix a warning
kazutakahirata Feb 10, 2025
755e4d1
[libc++][test] Fixes for `hash<Emplaceable>` and value discarding (#1…
frederick-vs-ja Feb 10, 2025
df03536
[MLIR][Affine] Make affine fusion MDG API const correct (#125994)
bondhugula Feb 10, 2025
4ac3b7f
Revert "[Driver][ROCm][OpenMP] Fix default ockl linking for OpenMP." …
fmayer Feb 11, 2025
2c905f1
[Clang] disallow attributes on void parameters (#124920)
a-tarasyuk Feb 11, 2025
e9cf570
[FIX] Add `REQUIRES: asserts` to `llvm/test/Transforms/StructurizeCFG…
shiltian Feb 11, 2025
2b8ea55
[Mips] Support llvm.readcyclecounter intrinsic (#114953)
yingopq Feb 11, 2025
8623bb8
[clang-tidy] Added support for 3-argument std::string ctor in bugpron…
vbvictor Feb 11, 2025
504f7d9
[RISCV] Improve Errors for GPRNoX0 Reg Class (#126397)
lenary Feb 11, 2025
be0beb1
[compiler-rt][Mips] Fix mips SP register definition (#124493)
Gelbpunkt Feb 11, 2025
f9294d1
[WebKit checkers] Treat an implicit value initialization as trivial (…
rniwa Feb 11, 2025
64c2536
[AVX10.2] Fix wrong mask casting in some convert intrinsics (#126627)
mikolaj-pirog Feb 11, 2025
aa1bb43
[X86] Generate cvtpd2dq for (v2i32 lrint(v2f64)) (#126508)
phoebewang Feb 11, 2025
af76105
[libc++][NFC] Run the container tests through clang-format (#126499)
ldionne Feb 11, 2025
aabc8f6
[HLSL] Constant buffer layout struct update (#124840)
hekota Feb 11, 2025
ce9917b
[AMDGPU][NewPM] Port "GCNPreRAOptimizations" pass to NPM (#126040)
vikramRH Feb 11, 2025
3555b1a
[LLVM][CMake][MSVC] Install PDBs alongside executables (#120683)
mayanez Feb 11, 2025
4dd0923
[Hexagon] Fix typos discovered by codespell (NFC) (#126233)
svs-quic Feb 11, 2025
9fc81c4
[clang][Sema] Emit warnings about incorrect AVR interrupt/signal hand…
benshi001 Feb 11, 2025
fb54089
[C++20] [Modules] Don't diagnose duplicated declarations in different…
ChuanqiXu9 Feb 11, 2025
93c0c41
[mlir]linalg][NFC]-Add lit test for tile and fuse transformation (#12…
amirBish Feb 11, 2025
b241f1f
Revert "[Mips] Support llvm.readcyclecounter intrinsic (#114953)"
wzssyqa Feb 11, 2025
f023dd7
[mlir][transforms] Process RegionBranchOp with empty region (#123895)
CoTinker Feb 11, 2025
c8006b3
Fix false positive of [[clang::require_explicit_initialization]] on c…
higher-performance Feb 11, 2025
6bcaf3a
[RISCV][compiler-rt] drop __riscv_vendor_feature_bits (#126460)
BeMg Feb 11, 2025
8e9f5c5
[clang][bytecode][NFC] Add failing memmove testcase (#126682)
tbaederr Feb 11, 2025
5f9038e
[libc] Don't manually override the optimization level for math (#126322)
petrhosek Feb 11, 2025
38eade4
[sanitizer_common][test] Remove second SanitizerCommon.ReportFile tem…
rorth Feb 11, 2025
253a950
[clang][HeuristicResolver] Additional hardening against an infinite l…
HighCommander4 Feb 11, 2025
c6808c1
[mlir][LLVM] handle argument and result attributes in llvm.call and l…
jeanPerier Feb 11, 2025
f7a2bdf
[NVPTX] Add intrinsics for prefetch.* (#125887)
abhilash1910 Feb 11, 2025
1abbb1d
[IR] Add llvm.sincospi intrinsic (#125873)
MacDue Feb 11, 2025
d324e38
[mlir][xegpu] Improve scatter attribute definition (#126540)
adam-smnk Feb 11, 2025
4770a65
[compiler-rt] Fix tests of __aeabi_(idivmod|uidivmod|uldivmod) to sup…
vhscampos Feb 11, 2025
f13ee84
[GitHub] Correct word in commit access request greeting
DavidSpickett Feb 11, 2025
ddc1394
[clang][analyzer][NFC] Fix typos in comments (#126676)
benshi001 Feb 11, 2025
04554c6
[flang][NFC] fix rewrite-out_of_range.F90 tests (#126699)
jeanPerier Feb 11, 2025
2b40642
[mlir][tosa] Use generic QuantizedType in Conv verifiers (#126275)
Tai78641 Feb 11, 2025
c25e4e7
[libclc] Move sign to the CLC builtins library (#115699)
frasercrmck Feb 11, 2025
bbef9c7
[SPIRV] Add support for `cl_khr_extended_bit_ops` (#120571)
maarquitos14 Feb 11, 2025
3a39d6b
[RTLIB] Rename getFSINCOS() to getSINCOS (NFC) (#126705)
MacDue Feb 11, 2025
88d78c6
[mlir][bufferization] Canonicalize to_memref(to_tensor(x)) to a CopyO…
amrami Feb 11, 2025
cd9ade0
[mlir][target][nvvm] Perf by stage and store into properties (#126178)
MikaOvO Feb 11, 2025
4a583f2
[VPlan] Only skip expansion for SCEVUnknown if it isn't an instructio…
fhahn Feb 11, 2025
2516012
[analyzer] Reapply recent stack addr escape checker changes + buildbo…
Flandini Feb 11, 2025
f096256
[NFC][analyzer] OOB test consolidation I: no-outofbounds.c (#126539)
NagyDonat Feb 11, 2025
a40b0c4
[mlir][cmake] Do not export MLIR_MAIN_SRC_DIR and MLIR_INCLUDE_DIR (#…
nikic Feb 11, 2025
c328c5d
[clang] Force AttributedStmtClass to not be scope parents (#125370)
YutongZhuu Feb 11, 2025
e639ffa
[NFC][AMDGPU] Rename test (#126725)
jmmartinez Feb 11, 2025
61b708e
[lldb] Use preprocessor guard for `LLVM_BUILD_TELEMETRY` (#126715)
mgorny Feb 11, 2025
f830a19
[gn build] Port bf2d4eb7030b
llvmgnsyncbot Feb 11, 2025
168164e
[NFC][TableGen] Code cleanup in InstrInfoEmitter.cpp (#126578)
jurahul Feb 11, 2025
37a1156
[AArch64] Improve getPartialReductionCost for fixed-width VFs (#126538)
david-arm Feb 11, 2025
a3384b5
[SPIR-V] Add SPIR-V Linker (#126319)
sarnex Feb 11, 2025
ae2c603
[Offload] Treat an empty packager architecture as 'generic' (#126655)
jhuber6 Feb 11, 2025
45e24db
AMDGPU: Handle gfx950 XDL-write-VGPR-Overlap-Src-AB wait state (#126732)
VigneshwarJ Feb 11, 2025
c6246c0
[clang][bytecode] Fix diagnosing replaceable global allocator functio…
tbaederr Feb 11, 2025
4efac9b
[mlir][tosa] Change ClampOp's min/max attributes (#125197)
Hsiangkai Feb 11, 2025
a6eb9e4
[lldb] Fix ubsan violation with plugin loading (#126652)
keith Feb 11, 2025
975d42c
[AMDGPU][NFC] Remove an unneeded return value. (#126739)
kosarev Feb 11, 2025
65ebe3a
[llvm] [cmake] Expose `LLVM_BUILD_TELEMETRY` in `LLVMConfig.cmake` (#…
mgorny Feb 11, 2025
fabf269
[AMDGPU] Create new directive .amdhsa_inst_pref_size (#126622)
rampitec Feb 11, 2025
b690a76
[Docs] Fix typo in TypeSanitizer.rst "tale" -> "table" (NFC) (#126721)
sitrin Feb 11, 2025
1c1bc96
[NFC] [clang] simplify isDesignatorAtObjectEnd (#126658)
fmayer Feb 11, 2025
8a7a66d
[CodeGen] Avoid repeated hash lookups (NFC) (#126672)
kazutakahirata Feb 11, 2025
203992f
[Sema] Avoid repeated hash lookups (NFC) (#126674)
kazutakahirata Feb 11, 2025
4e5f70b
[clang-installapi] Avoid repeated hash lookups (NFC) (#126677)
kazutakahirata Feb 11, 2025
5cd3a55
[Analysis] Avoid repeated hash lookups (NFC) (#126678)
kazutakahirata Feb 11, 2025
461c4c9
[SystemZ] Avoid repeated hash lookups (NFC) (#126679)
kazutakahirata Feb 11, 2025
ffa5907
[Vectorize] Avoid repeated hash lookups (NFC) (#126681)
kazutakahirata Feb 11, 2025
df2fe27
[ValueTracking] Handle trunc to i1 as condition in dominating conditi…
andjo403 Feb 11, 2025
37c4aaf
[Clang][Driver][HIP] Fix test for HIP as it was failing (#126585)
Sharjeel-Khan Feb 11, 2025
1dcb996
[flang][acc] Fill-in name for privatized loop iv (#126601)
razvanlupusoru Feb 11, 2025
e3304f6
[clang][Sema] Move computing enum bits into a separate function (#126…
kuilpd Feb 11, 2025
2360295
[GitHub] Add aaronmondal to Bazel codeowners (#126760)
aaronmondal Feb 11, 2025
fd5cd01
[HLSL] Appropriately set function attribute optnone (#125937)
bharadwajy Feb 11, 2025
18d6cd3
[libc][test][stdbit] fix -Wimplicit-int-conversion (#126616)
nickdesaulniers Feb 11, 2025
4c3a9ae
[Release Notes] Mention -fprofile-continuous in release notes
Feb 11, 2025
66de8d7
[analyzer][docs] Document how to use perf and uftrace to debug perfor…
steakhal Feb 11, 2025
7de9daa
[GitHub] Skip undefcheck if no relevant files changed (#126749)
aaronmondal Feb 11, 2025
1e12d17
[lldb] Devirtualize GetValueProperties (NFC) (#126583)
JDevlieghere Feb 11, 2025
1f0b48d
[WebKit Checkers] Treat const Objective-C ivar as a safe origin (#126…
rniwa Feb 11, 2025
11bf20b
AMDGPU: Handle gfx950 XDL-write-VGPR-VALU-Mem-Exp wait state change (…
VigneshwarJ Feb 11, 2025
1c7600c
AMDGPU: Handle gfx950 XDL Write-VGPR-VALU-WAW wait state change (#126…
VigneshwarJ Feb 11, 2025
4948689
[llvm] Avoid out-of-order evaluation in DebugInfo (#125116)
elvinw-intel Feb 11, 2025
9ad5843
[flang][NFCI] Stop tracking memory source after a load in a more expl…
Renaud-K Feb 11, 2025
a49ce92
[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#119891)
zahiraam Feb 11, 2025
060f864
[mlir][tosa] Change the shift of mul to be required (#125297)
Tai78641 Feb 11, 2025
8605608
[WebKit Checkers] Allow operator T&() in a const member function (#12…
rniwa Feb 11, 2025
2155741
[libc++] Fixes building with Python 3.8.
mordante Feb 11, 2025
a0d916d
[Hexagon][Disassembler] Set CommentStream of Disassembler (#126766)
quic-areg Feb 11, 2025
283df81
[flang][rt] Add decimal files to device runtime (#126778)
clementval Feb 11, 2025
f05d2d1
[clang-linker-wrapper][lit] Fix SPIR-V ELF test when spirv-tools feat…
sarnex Feb 11, 2025
7127d8a
[MLIR] Make generated markdown doc more consistent (#119926)
GleasonK Feb 11, 2025
81fe251
Revert "[Clang] [OpenMP] Add support for '#pragma omp stripe'. (#1198…
kazutakahirata Feb 11, 2025
e9113ce
[DAG] Use ArrayRef to simplify ShuffleVectorSDNode::isSplatMask
preames Feb 11, 2025
8e640c8
[Offload] Properly guard modifications to the RPC device array (#126790)
jhuber6 Feb 11, 2025
eb6ebf2
[AArch64] Add a phase-order test for dot patterns. NFC
davemgreen Feb 11, 2025
b4edecc
[NFC] [clang] fix unused variable warning (#126796)
fmayer Feb 11, 2025
255f225
[DependenceAnalysis][NFC] Removing PossiblyLoopIndependent parameter …
1997alireza Feb 11, 2025
82e7289
[RISCV] Add coverage for vmerge.vim shuffle lowering
preames Feb 11, 2025
d60c797
[clang] Assert the enum FPOpts and LangOpts fit into the storage (#12…
fmayer Feb 11, 2025
f65eb70
[mlir][vector][nfc] Add clarification on "dim-1" bcast (#125425)
banach-space Feb 11, 2025
2d2fb2b
[SLP]Fix attempt to build the reorder mask for non-adjusted reuse mask
alexey-bataev Feb 11, 2025
166c42f
[NFC] [clang] Use isa instead of dyn_cast
fmayer Feb 11, 2025
5d0ca9d
[SandboxVec][Scheduler] Update ready list comparator (#126160)
vporpo Feb 11, 2025
f0d3036
[mlir] Silence -Wdangling-assignment-gsl in OperationSupport.h (#126140)
smeenai Feb 11, 2025
eb49b1a
[libc][docgen] make note of sys/time.h interfaces removed in POSIX.1-…
nickdesaulniers Feb 11, 2025
d7c0f83
[clang][HeuristicResolver] Track the expression whose type is being s…
HighCommander4 Feb 11, 2025
17c7833
[libc][math] Add float-only option for atan2f. (#122979)
lntue Feb 11, 2025
98aaff0
[AMDGPU][True16][CodeGen] true16 codegen for MadFmaMixPat (#124892)
broxigarchen Feb 11, 2025
e784664
[libc] create TimeReader to look at a struct tm (#126138)
michaelrj-google Feb 11, 2025
7475996
Make AddUint64 use llvm.uadd.with.overflow.v2i32 for uint4 args
Icohedron Feb 11, 2025
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 6 additions & 0 deletions clang/include/clang/Basic/Builtins.td
Original file line number Diff line number Diff line change
Expand Up @@ -4753,6 +4753,12 @@ def GetDeviceSideMangledName : LangBuiltin<"CUDA_LANG"> {
}

// HLSL
def HLSLAddUint64: LangBuiltin<"HLSL_LANG"> {
let Spellings = ["__builtin_hlsl_adduint64"];
let Attributes = [NoThrow, Const];
let Prototype = "void(...)";
}

def HLSLResourceGetPointer : LangBuiltin<"HLSL_LANG"> {
let Spellings = ["__builtin_hlsl_resource_getpointer"];
let Attributes = [NoThrow];
Expand Down
2 changes: 2 additions & 0 deletions clang/include/clang/Basic/DiagnosticSemaKinds.td
Original file line number Diff line number Diff line change
Expand Up @@ -10624,6 +10624,8 @@ def err_second_argument_to_cwsc_not_pointer : Error<

def err_vector_incorrect_num_elements : Error<
"%select{too many|too few}0 elements in vector %select{initialization|operand}3 (expected %1 elements, have %2)">;
def err_invalid_even_odd_vector_element_count : Error<
"invalid element count of %0 in vector %select{initialization|operand}4 (expected an %select{even|odd}3 element count in the range of %1 and %2)">;
def err_altivec_empty_initializer : Error<"expected initializer">;

def err_invalid_neon_type_code : Error<
Expand Down
45 changes: 45 additions & 0 deletions clang/lib/CodeGen/CGBuiltin.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -19105,6 +19105,51 @@ Value *CodeGenFunction::EmitHLSLBuiltinExpr(unsigned BuiltinID,
return nullptr;

switch (BuiltinID) {
case Builtin::BI__builtin_hlsl_adduint64: {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suppose I know why we weren't able to re-use the __builtin_add_c but for other reviewers it would be good to add context as a pr comment here. Maybe they will have suggestions as to how we could use it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see that you have it in the commit notes. I still think it would be worth noting with more context here

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

__builtin_addc was not able to be used to implement AddUint64 in hlsl_intrinsics.h and (by extension) hlsl_detail.h because its carryout argument is a pointer (as documented here).

Since pointers are not supported in HLSL, an error is emitted when running HLSL codegen tests with an example implementation like the following in hlsl_intrinsics.h.

_HLSL_AVAILABILITY(shadermodel, 6.0)
const inline uint32_t2 AddUint64(uint32_t2 a, uint32_t2 b) {
  uint32_t carry;
  uint32_t low_sum = __builtin_addc(a.x, b.x, 0, &carry);
  uint32_t high_sum = __builtin_addc(a.y, b.y, carry, nullptr);
  return uint32_t2(low_sum, high_sum);
}
build/lib/clang/20/include/hlsl/hlsl_intrinsics.h:158:50: error: the '&' operator is unsupported in HLSL
  158 |   uint32_t low_sum = __builtin_addc(a.x, b.x, 0, &carry);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So while HLSL does not support pointers we do have a concept of out args. if you search for EmitHLSLOutArgExpr I think you can find some uses. My thinking is maybe we could do our own builtin like you have done but without the pointer and have an anonymous struct returned. then we could still piggy back off of the code genen for __builtin_addc even if we don't use the builtin itself. Maybe thats more complicated than it has to be, but it could be a way to keep the codegen for the uadd_with_overflow intrinsic in one place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think it is something I should do for this implementation?
Are there other HLSL functions that would benefit from / reuse the new builtin using the out args?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the downside of what I suggested is that it would be a hybrid solution. You are writing the algorithm in HLSL, but you are also massaging the codegen to do out args instead of pointers, and write sema checks because we have to introduce a new builtin.

My thinking was there would be less total codgen if we did it the way I suggested and some of the sema checks would benefit from language rules instead of us having to put a bunch of effort into HLSLSema.cpp. I don't have a strong opinion. So I won't make a requirement here.

Value *OpA = EmitScalarExpr(E->getArg(0));
Value *OpB = EmitScalarExpr(E->getArg(1));
assert(E->getArg(0)->getType()->hasIntegerRepresentation() &&
E->getArg(1)->getType()->hasIntegerRepresentation() &&
"AddUint64 operands must have an integer representation");
assert(((E->getArg(0)->getType()->castAs<VectorType>()->getNumElements() ==
2 &&
E->getArg(1)->getType()->castAs<VectorType>()->getNumElements() ==
2) ||
(E->getArg(0)->getType()->castAs<VectorType>()->getNumElements() ==
4 &&
E->getArg(1)->getType()->castAs<VectorType>()->getNumElements() ==
4)) &&
"input vectors must have 2 or 4 elements each");

llvm::Value *Result = PoisonValue::get(OpA->getType());
uint64_t NumElements =
E->getArg(0)->getType()->castAs<VectorType>()->getNumElements();
for (uint64_t i = 0; i < NumElements / 2; ++i) {

// Obtain low and high words of inputs A and B
llvm::Value *LowA = Builder.CreateExtractElement(OpA, 2 * i + 0);
llvm::Value *HighA = Builder.CreateExtractElement(OpA, 2 * i + 1);
llvm::Value *LowB = Builder.CreateExtractElement(OpB, 2 * i + 0);
llvm::Value *HighB = Builder.CreateExtractElement(OpB, 2 * i + 1);

// Use an uadd_with_overflow to compute the sum of low words and obtain a
// carry value
llvm::Value *Carry;
llvm::Value *LowSum = EmitOverflowIntrinsic(
*this, llvm::Intrinsic::uadd_with_overflow, LowA, LowB, Carry);
llvm::Value *ZExtCarry = Builder.CreateZExt(Carry, HighA->getType());

// Sum the high words and the carry
llvm::Value *HighSum = Builder.CreateAdd(HighA, HighB);
llvm::Value *HighSumPlusCarry = Builder.CreateAdd(HighSum, ZExtCarry);

// Insert the low and high word sums into the result vector
Result = Builder.CreateInsertElement(Result, LowSum, 2 * i + 0);
Result = Builder.CreateInsertElement(Result, HighSumPlusCarry, 2 * i + 1,
"hlsl.AddUint64");
}
return Result;
}
case Builtin::BI__builtin_hlsl_resource_getpointer: {
Value *HandleOp = EmitScalarExpr(E->getArg(0));
Value *IndexOp = EmitScalarExpr(E->getArg(1));
Expand Down
21 changes: 21 additions & 0 deletions clang/lib/Headers/hlsl/hlsl_intrinsics.h
Original file line number Diff line number Diff line change
Expand Up @@ -138,6 +138,27 @@ _HLSL_BUILTIN_ALIAS(__builtin_elementwise_acos)
float4 acos(float4);

//===----------------------------------------------------------------------===//
// AddUint64 builtins
//===----------------------------------------------------------------------===//

/// \fn T AddUint64(T a, T b)
/// \brief Implements unsigned 64-bit integer addition using pairs of unsigned
/// 32-bit integers.
/// \param x [in] The first unsigned 32-bit integer pair(s)
/// \param y [in] The second unsigned 32-bit integer pair(s)
///
/// This function takes one or two pairs (low, high) of unsigned 32-bit integer
/// values and returns pairs (low, high) of unsigned 32-bit integer
/// values representing the result of unsigned 64-bit integer addition.

_HLSL_AVAILABILITY(shadermodel, 6.0)
_HLSL_BUILTIN_ALIAS(__builtin_hlsl_adduint64)
uint32_t2 AddUint64(uint32_t2, uint32_t2);
_HLSL_AVAILABILITY(shadermodel, 6.0)
_HLSL_BUILTIN_ALIAS(__builtin_hlsl_adduint64)
uint32_t4 AddUint64(uint32_t4, uint32_t4);

// //===----------------------------------------------------------------------===//
// all builtins
//===----------------------------------------------------------------------===//

Expand Down
47 changes: 47 additions & 0 deletions clang/lib/Sema/SemaHLSL.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -2023,6 +2023,18 @@ static bool CheckAllArgsHaveFloatRepresentation(Sema *S, CallExpr *TheCall) {
checkAllFloatTypes);
}

static bool CheckUnsignedIntRepresentations(Sema *S, CallExpr *TheCall) {
auto checkUnsignedInteger = [](clang::QualType PassedType) -> bool {
clang::QualType BaseType =
PassedType->isVectorType()
? PassedType->getAs<clang::VectorType>()->getElementType()
: PassedType;
return !BaseType->isUnsignedIntegerType();
};
return CheckAllArgTypesAreCorrect(S, TheCall, S->Context.UnsignedIntTy,
checkUnsignedInteger);
}

static bool CheckFloatOrHalfRepresentations(Sema *S, CallExpr *TheCall) {
auto checkFloatorHalf = [](clang::QualType PassedType) -> bool {
clang::QualType BaseType =
Expand Down Expand Up @@ -2214,6 +2226,41 @@ static bool CheckResourceHandle(
// returning an ExprError
bool SemaHLSL::CheckBuiltinFunctionCall(unsigned BuiltinID, CallExpr *TheCall) {
switch (BuiltinID) {
case Builtin::BI__builtin_hlsl_adduint64: {
if (SemaRef.checkArgCount(TheCall, 2))
return true;
if (CheckVectorElementCallArgs(&SemaRef, TheCall))
return true;
if (CheckUnsignedIntRepresentations(&SemaRef, TheCall))
return true;

// CheckVectorElementCallArgs(...) guarantees both args are the same type.
assert(TheCall->getArg(0)->getType() == TheCall->getArg(1)->getType() &&
"Both args must be of the same type");

// ensure both args are vectors
auto *VTy = TheCall->getArg(0)->getType()->getAs<VectorType>();
if (!VTy) {
SemaRef.Diag(TheCall->getBeginLoc(), diag::err_vec_builtin_non_vector)
<< "AddUint64" << /*all*/ 1;
return true;
}

// ensure both args have 2 elements, or both args have 4 elements
int NumElementsArg = VTy->getNumElements();
if (NumElementsArg != 2 && NumElementsArg != 4) {
SemaRef.Diag(TheCall->getBeginLoc(),
diag::err_invalid_even_odd_vector_element_count)
<< NumElementsArg << 2 << 4 << /*even*/ 0 << /*operand*/ 1;
return true;
}

ExprResult A = TheCall->getArg(0);
QualType ArgTyA = A.get()->getType();
// return type is the same as the input type
TheCall->setType(ArgTyA);
break;
}
case Builtin::BI__builtin_hlsl_resource_getpointer: {
if (SemaRef.checkArgCount(TheCall, 2) ||
CheckResourceHandle(&SemaRef, TheCall, 0) ||
Expand Down
71 changes: 71 additions & 0 deletions clang/test/CodeGenHLSL/builtins/AddUint64.hlsl
Original file line number Diff line number Diff line change
@@ -0,0 +1,71 @@
// NOTE: Assertions have been autogenerated by utils/update_cc_test_checks.py UTC_ARGS: --version 5
// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.3-library %s \
// RUN: -emit-llvm -disable-llvm-passes -o - | \
// RUN: FileCheck %s --check-prefixes=CHECK


// CHECK-LABEL: define noundef <2 x i32> @_Z20test_AddUint64_uint2Dv2_jS_(
// CHECK-SAME: <2 x i32> noundef [[A:%.*]], <2 x i32> noundef [[B:%.*]]) #[[ATTR0:[0-9]+]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: [[A_ADDR:%.*]] = alloca <2 x i32>, align 8
// CHECK-NEXT: [[B_ADDR:%.*]] = alloca <2 x i32>, align 8
// CHECK-NEXT: store <2 x i32> [[A]], ptr [[A_ADDR]], align 8
// CHECK-NEXT: store <2 x i32> [[B]], ptr [[B_ADDR]], align 8
// CHECK-NEXT: [[A_LOAD:%.*]] = load <2 x i32>, ptr [[A_ADDR]], align 8
// CHECK-NEXT: [[B_LOAD:%.*]] = load <2 x i32>, ptr [[B_ADDR]], align 8
// CHECK-NEXT: [[LowA:%.*]] = extractelement <2 x i32> [[A_LOAD]], i64 0
// CHECK-NEXT: [[HighA:%.*]] = extractelement <2 x i32> [[A_LOAD]], i64 1
// CHECK-NEXT: [[LowB:%.*]] = extractelement <2 x i32> [[B_LOAD]], i64 0
// CHECK-NEXT: [[HighB:%.*]] = extractelement <2 x i32> [[B_LOAD]], i64 1
// CHECK-NEXT: [[UAddc:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[LowA]], i32 [[LowB]])
// CHECK-NEXT: [[Carry:%.*]] = extractvalue { i32, i1 } [[UAddc]], 1
// CHECK-NEXT: [[LowSum:%.*]] = extractvalue { i32, i1 } [[UAddc]], 0
// CHECK-NEXT: [[CarryZExt:%.*]] = zext i1 [[Carry]] to i32
// CHECK-NEXT: [[HighSum:%.*]] = add i32 [[HighA]], [[HighB]]
// CHECK-NEXT: [[HighSumPlusCarry:%.*]] = add i32 [[HighSum]], [[CarryZExt]]
// CHECK-NEXT: [[HLSL_ADDUINT64_UPTO0:%.*]] = insertelement <2 x i32> poison, i32 [[LowSum]], i64 0
// CHECK-NEXT: [[HLSL_ADDUINT64:%.*]] = insertelement <2 x i32> [[HLSL_ADDUINT64_UPTO0]], i32 [[HighSumPlusCarry]], i64 1
// CHECK-NEXT: ret <2 x i32> [[HLSL_ADDUINT64]]
//
uint2 test_AddUint64_uint2(uint2 a, uint2 b) {
return AddUint64(a, b);
}

// CHECK-LABEL: define noundef <4 x i32> @_Z20test_AddUint64_uint4Dv4_jS_(
// CHECK-SAME: <4 x i32> noundef [[A:%.*]], <4 x i32> noundef [[B:%.*]]) #[[ATTR0]] {
// CHECK-NEXT: [[ENTRY:.*:]]
// CHECK-NEXT: [[A_ADDR:%.*]] = alloca <4 x i32>, align 16
// CHECK-NEXT: [[B_ADDR:%.*]] = alloca <4 x i32>, align 16
// CHECK-NEXT: store <4 x i32> [[A]], ptr [[A_ADDR]], align 16
// CHECK-NEXT: store <4 x i32> [[B]], ptr [[B_ADDR]], align 16
// CHECK-NEXT: [[A_LOAD:%.*]] = load <4 x i32>, ptr [[A_ADDR]], align 16
// CHECK-NEXT: [[B_LOAD:%.*]] = load <4 x i32>, ptr [[B_ADDR]], align 16
// CHECK-NEXT: [[LowA:%.*]] = extractelement <4 x i32> [[A_LOAD]], i64 0
// CHECK-NEXT: [[HighA:%.*]] = extractelement <4 x i32> [[A_LOAD]], i64 1
// CHECK-NEXT: [[LowB:%.*]] = extractelement <4 x i32> [[B_LOAD]], i64 0
// CHECK-NEXT: [[HighB:%.*]] = extractelement <4 x i32> [[B_LOAD]], i64 1
// CHECK-NEXT: [[UAddc:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[LowA]], i32 [[LowB]])
// CHECK-NEXT: [[Carry:%.*]] = extractvalue { i32, i1 } [[UAddc]], 1
// CHECK-NEXT: [[LowSum:%.*]] = extractvalue { i32, i1 } [[UAddc]], 0
// CHECK-NEXT: [[CarryZExt:%.*]] = zext i1 [[Carry]] to i32
// CHECK-NEXT: [[HighSum:%.*]] = add i32 [[HighA]], [[HighB]]
// CHECK-NEXT: [[HighSumPlusCarry:%.*]] = add i32 [[HighSum]], [[CarryZExt]]
// CHECK-NEXT: [[HLSL_ADDUINT64_UPTO0:%.*]] = insertelement <4 x i32> poison, i32 [[LowSum]], i64 0
// CHECK-NEXT: [[HLSL_ADDUINT64_UPTO1:%.*]] = insertelement <4 x i32> [[HLSL_ADDUINT64_UPTO0]], i32 [[HighSumPlusCarry]], i64 1
// CHECK-NEXT: [[LowA1:%.*]] = extractelement <4 x i32> [[A_LOAD]], i64 2
// CHECK-NEXT: [[HighA1:%.*]] = extractelement <4 x i32> [[A_LOAD]], i64 3
// CHECK-NEXT: [[LowB1:%.*]] = extractelement <4 x i32> [[B_LOAD]], i64 2
// CHECK-NEXT: [[HighB1:%.*]] = extractelement <4 x i32> [[B_LOAD]], i64 3
// CHECK-NEXT: [[UAddc1:%.*]] = call { i32, i1 } @llvm.uadd.with.overflow.i32(i32 [[LowA1]], i32 [[LowB1]])
// CHECK-NEXT: [[Carry1:%.*]] = extractvalue { i32, i1 } [[UAddc1]], 1
// CHECK-NEXT: [[LowSum1:%.*]] = extractvalue { i32, i1 } [[UAddc1]], 0
// CHECK-NEXT: [[CarryZExt1:%.*]] = zext i1 [[Carry1]] to i32
// CHECK-NEXT: [[HighSum1:%.*]] = add i32 [[HighA1]], [[HighB1]]
// CHECK-NEXT: [[HighSumPlusCarry1:%.*]] = add i32 [[HighSum1]], [[CarryZExt1]]
// CHECK-NEXT: [[HLSL_ADDUINT64_UPTO2:%.*]] = insertelement <4 x i32> [[HLSL_ADDUINT64_UPTO1]], i32 [[LowSum1]], i64 2
// CHECK-NEXT: [[HLSL_ADDUINT64:%.*]] = insertelement <4 x i32> [[HLSL_ADDUINT64_UPTO2]], i32 [[HighSumPlusCarry1]], i64 3
// CHECK-NEXT: ret <4 x i32> [[HLSL_ADDUINT64]]
//
uint4 test_AddUint64_uint4(uint4 a, uint4 b) {
return AddUint64(a, b);
}
41 changes: 41 additions & 0 deletions clang/test/SemaHLSL/BuiltIns/AddUint64-errors.hlsl
Original file line number Diff line number Diff line change
@@ -0,0 +1,41 @@
// RUN: %clang_cc1 -finclude-default-header -triple dxil-pc-shadermodel6.6-library %s -fnative-half-type -emit-llvm-only -disable-llvm-passes -verify

uint2 test_too_few_arg() {
return __builtin_hlsl_adduint64();
// expected-error@-1 {{too few arguments to function call, expected 2, have 0}}
}

uint4 test_too_many_arg(uint4 a) {
return __builtin_hlsl_adduint64(a, a, a);
// expected-error@-1 {{too many arguments to function call, expected 2, have 3}}
}

uint2 test_mismatched_arg_types(uint2 a, uint4 b) {
return __builtin_hlsl_adduint64(a, b);
// expected-error@-1 {{all arguments to '__builtin_hlsl_adduint64' must have the same type}}
}

uint2 test_bad_num_arg_elements(uint3 a, uint3 b) {
return __builtin_hlsl_adduint64(a, b);
// expected-error@-1 {{invalid element count of 3 in vector operand (expected an even element count in the range of 2 and 4)}}
}

uint2 test_scalar_arg_type(uint a) {
return __builtin_hlsl_adduint64(a, a);
// expected-error@-1 {{all arguments to AddUint64 must be vectors}}
}

uint2 test_signed_integer_args(int2 a, int2 b) {
return __builtin_hlsl_adduint64(a, b);
// expected-error@-1 {{passing 'int2' (aka 'vector<int, 2>') to parameter of incompatible type '__attribute__((__vector_size__(2 * sizeof(unsigned int)))) unsigned int' (vector of 2 'unsigned int' values)}}
}

struct S {
uint2 a;
};

uint2 test_incorrect_arg_type(S a) {
return __builtin_hlsl_adduint64(a, a);
// expected-error@-1 {{passing 'S' to parameter of incompatible type 'unsigned int'}}
}

13 changes: 13 additions & 0 deletions llvm/lib/Target/DirectX/DXIL.td
Original file line number Diff line number Diff line change
Expand Up @@ -50,6 +50,7 @@ def HandleTy : DXILOpParamType;
def ResBindTy : DXILOpParamType;
def ResPropsTy : DXILOpParamType;
def SplitDoubleTy : DXILOpParamType;
def BinaryWithCarryTy : DXILOpParamType;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is reminding me how much I didn't like how splitdouble ended up. Seems like we need a more generic way to define our anonymous struct return types.


class DXILOpClass;

Expand Down Expand Up @@ -738,6 +739,18 @@ def UMin : DXILOp<40, binary> {
let attributes = [Attributes<DXIL1_0, [ReadNone]>];
}

def UAddc : DXILOp<44, binaryWithCarryOrBorrow > {
let Doc = "Unsigned 32-bit integer arithmetic add with carry. uaddc(a,b) = (a+b, a+b overflowed ? 1 : 0)";
// TODO: This `let intrinsics = ...` line may be uncommented when
// https://github.com/llvm/llvm-project/issues/113192 is fixed
// let intrinsics = [IntrinSelect<int_uadd_with_overflow>];
let arguments = [OverloadTy, OverloadTy];
let result = BinaryWithCarryTy;
let overloads = [Overloads<DXIL1_0, [Int32Ty]>];
let stages = [Stages<DXIL1_0, [all_stages]>];
let attributes = [Attributes<DXIL1_0, [ReadNone]>];
}

def FMad : DXILOp<46, tertiary> {
let Doc = "Floating point arithmetic multiply/add operation. fmad(m,a,b) = m "
"* a + b.";
Expand Down
14 changes: 14 additions & 0 deletions llvm/lib/Target/DirectX/DXILOpBuilder.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -230,6 +230,14 @@ static StructType *getSplitDoubleType(LLVMContext &Context) {
return StructType::create({Int32Ty, Int32Ty}, "dx.types.splitdouble");
}

static StructType *getBinaryWithCarryType(LLVMContext &Context) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is maybe something that we could consider being generated from DXIL.td, alongside the other custom struct types. Maybe not for this pull request, but we can make a note here for the follow up pr: #113192

if (auto *ST = StructType::getTypeByName(Context, "dx.types.i32c"))
return ST;
Type *Int32Ty = Type::getInt32Ty(Context);
Type *Int1Ty = Type::getInt1Ty(Context);
return StructType::create({Int32Ty, Int1Ty}, "dx.types.i32c");
}

static Type *getTypeFromOpParamType(OpParamType Kind, LLVMContext &Ctx,
Type *OverloadTy) {
switch (Kind) {
Expand Down Expand Up @@ -273,6 +281,8 @@ static Type *getTypeFromOpParamType(OpParamType Kind, LLVMContext &Ctx,
return getResPropsType(Ctx);
case OpParamType::SplitDoubleTy:
return getSplitDoubleType(Ctx);
case OpParamType::BinaryWithCarryTy:
return getBinaryWithCarryType(Ctx);
}
llvm_unreachable("Invalid parameter kind");
return nullptr;
Expand Down Expand Up @@ -539,6 +549,10 @@ StructType *DXILOpBuilder::getSplitDoubleType(LLVMContext &Context) {
return ::getSplitDoubleType(Context);
}

StructType *DXILOpBuilder::getBinaryWithCarryType(LLVMContext &Context) {
return ::getBinaryWithCarryType(Context);
}

StructType *DXILOpBuilder::getHandleType() {
return ::getHandleType(IRB.getContext());
}
Expand Down
3 changes: 3 additions & 0 deletions llvm/lib/Target/DirectX/DXILOpBuilder.h
Original file line number Diff line number Diff line change
Expand Up @@ -53,6 +53,9 @@ class DXILOpBuilder {
/// Get the `%dx.types.splitdouble` type.
StructType *getSplitDoubleType(LLVMContext &Context);

/// Get the `%dx.types.i32c` type.
StructType *getBinaryWithCarryType(LLVMContext &Context);

/// Get the `%dx.types.Handle` type.
StructType *getHandleType();

Expand Down
22 changes: 15 additions & 7 deletions llvm/lib/Target/DirectX/DXILOpLowering.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -359,17 +359,16 @@ class OpLowerer {
return lowerToBindAndAnnotateHandle(F);
}

Error replaceSplitDoubleCallUsages(CallInst *Intrin, CallInst *Op) {
Error replaceAggregateTypeOfCallUsages(CallInst *Intrin, CallInst *Op) {
for (Use &U : make_early_inc_range(Intrin->uses())) {
if (auto *EVI = dyn_cast<ExtractValueInst>(U.getUser())) {

if (EVI->getNumIndices() != 1)
return createStringError(std::errc::invalid_argument,
"Splitdouble has only 2 elements");
EVI->setOperand(0, Op);
} else if (auto *IVI = dyn_cast<InsertValueInst>(U.getUser())) {
IVI->setOperand(0, Op);
} else {
return make_error<StringError>(
"Splitdouble use is not ExtractValueInst",
(Intrin->getCalledFunction()->getName() +
" use is not a ExtractValueInst or InsertValueInst"),
inconvertibleErrorCode());
}
}
Expand Down Expand Up @@ -821,7 +820,16 @@ class OpLowerer {
F, OpCode::SplitDouble,
OpBuilder.getSplitDoubleType(M.getContext()),
[&](CallInst *CI, CallInst *Op) {
return replaceSplitDoubleCallUsages(CI, Op);
return replaceAggregateTypeOfCallUsages(CI, Op);
});
break;
// TODO: this can be removed when
// https://github.com/llvm/llvm-project/issues/113192 is fixed
case Intrinsic::uadd_with_overflow:
HasErrors |= replaceFunctionWithNamedStructOp(
F, OpCode::UAddc, OpBuilder.getBinaryWithCarryType(M.getContext()),
[&](CallInst *CI, CallInst *Op) {
return replaceAggregateTypeOfCallUsages(CI, Op);
});
break;
case Intrinsic::ctpop:
Expand Down
Loading