[Backport to LLVM 17] Reland "Remove CoordX and CoordY arguments of OpCooperativeMatrixPrefetchINTEL" (#2560) (#3637) #3646
Closed
obrotowy wants to merge 151 commits into KhronosGroup:main
Specification: KhronosGroup/SPIRV-Registry#216 Cherry-pick of KhronosGroup#2140
Co-authored-by: Stanley Gambarin <stanley.gambarin@intel.com>
…NTEL capability after Headers change (KhronosGroup#2258) (KhronosGroup#2308)
* Bump SPIRV-Headers to 1c6bb2743599e6eb6f37b2969acc0aef812e32e3
* replace internal SPV_INTEL_long_composites ext with the published SPV_INTEL_long_composites
* don't rename extension for now
This closes: KhronosGroup#2261
Co-authored-by: Viktoria Maximova <viktoria.maksimova@intel.com>
Co-authored-by: Wlodarczyk, Bertrand <bertrand.wlodarczyk@intel.com>
…pirv translator (KhronosGroup#2210) (KhronosGroup#2334)
This PR adds f16 type support for atomicrmw in the llvm-spirv translator, following the extension documented in [1]. There are two concerns related to the subject:
- SPIRVAtomicFAddEXTInst::getRequiredExtension() should return a list of required extensions, because both SPV_EXT_shader_atomic_float16_add and SPV_EXT_shader_atomic_float_add must be listed in the module (see the "Extension Name" section of [1]). However, the return type is std::optional<ExtensionID>, and returning a vector would need a bigger rework.
- Including SPV_EXT_shader_atomic_float16_add in the --spirv-ext argument of llvm-spirv doesn't produce the corresponding capability (AtomicFloat16AddEXT) and extension in the SPIR-V output:
    $ llvm-spirv AtomicFAddEXT.ll.tmp.bc --spirv-ext=+SPV_EXT_shader_atomic_float_add,+SPV_EXT_shader_atomic_float16_add -o AtomicFAddEXT.ll.tmp.spv
    $ llvm-spirv -to-text AtomicFAddEXT.ll.tmp.spv -o /dev/stdout
    ...
    2 Capability AtomicFloat32AddEXT
    2 Capability AtomicFloat64AddEXT
    9 Extension "SPV_EXT_shader_atomic_float_add"
    ...
This prevents extending the test case of AtomicFAddEXT.ll in EXT/SPV_EXT_shader_atomic_float.
References: [1] https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/EXT/SPV_EXT_shader_atomic_float16_add.asciidoc
Co-authored-by: Vyacheslav Levytskyy <89994100+VyacheslavLevytskyy@users.noreply.github.com>
…roup#2273) (KhronosGroup#2336) Return type of image read and Texel type of image write builtins may be unsigned. Before this PR, the builtin names in SPIR-V Friendly IR were always mangled with signed type. (cherry picked from commit e9b95fb)
…Expression (KhronosGroup#2326) Ensure that SPIR-V that uses a DebugGlobalVariable's Variable field to hold an Expression can be reverse translated. A Variable field can be used to hold an Expression in order to preserve a DIExpression in a DIGlobalVariableExpression in LLVM IR. Signed-off-by: Lu, John <john.lu@intel.com>
) The SPIR-V Specification allows `OpConstantNull` types to be scalar or vector booleans, integers, or floats. Update an assert for this and add a SPIR-V -> LLVM IR test. (cherry picked from commit 9ec969c)
This commit implements bidirectional translation between the llvm.uadd.with.overflow intrinsic and the IAddCarry instruction. llvm.uadd.with.overflow returns a struct whose second element has type i1, and llvm-spirv translates the LLVM type i1 directly to BoolType. The SPIR-V specification requires both members of the composite returned by IAddCarry to have the same type. As a result, the current implementation is not compliant and should be considered temporary.
This commit implements bidirectional translation between the llvm.usub.with.overflow intrinsic and the ISubBorrow instruction. llvm.usub.with.overflow returns a struct whose second element has type i1, and llvm-spirv translates the LLVM type i1 directly to BoolType. The SPIR-V specification requires both members of the composite returned by ISubBorrow to have the same type. As a result, the current implementation is not compliant and should be considered temporary.
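The type mismatch described in the two commits above can be modelled in a few lines. This is only an illustrative sketch in Python, not translator code: the function names are hypothetical, and a 32-bit operand width is assumed. The LLVM intrinsics return a (value, i1 flag) pair, while IAddCarry and ISubBorrow return two members of the same integer type.

```python
def uadd_with_overflow(a, b, bits=32):
    """Model of llvm.uadd.with.overflow: (wrapped sum, boolean overflow flag)."""
    mask = (1 << bits) - 1
    total = (a & mask) + (b & mask)
    return total & mask, total > mask


def usub_with_overflow(a, b, bits=32):
    """Model of llvm.usub.with.overflow: (wrapped difference, boolean borrow flag)."""
    mask = (1 << bits) - 1
    diff = (a & mask) - (b & mask)
    return diff & mask, diff < 0


def as_same_type_pair(value, flag):
    """Model of the IAddCarry/ISubBorrow result shape: both members are integers.

    The boolean flag is widened to 0 or 1 so that it has the same type as the value.
    """
    return value, int(flag)


print(as_same_type_pair(*uadd_with_overflow(0xFFFFFFFF, 1)))
print(as_same_type_pair(*usub_with_overflow(0, 1)))
```

The widening step in `as_same_type_pair` is the part that a direct i1-to-BoolType mapping leaves out.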
…p#2381) (KhronosGroup#2386) Usually sret parameters are accessed by a memory instruction, which tells SPIRVTypeScavenger which type to use for the function parameter. But if the sret parameter is unused later in the module, the scavenger would fail when attempting to deduce the type from the mangled name. Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
…osGroup#2344) (KhronosGroup#2393) Spec: KhronosGroup/SPIRV-Registry#235 Co-authored-by: Viktoria Maximova <viktoria.maksimova@intel.com>
…hronosGroup#2391) (KhronosGroup#2410) The SPIR-V to LLVM conversion would bail out when encountering an `OpVectorShuffle` whose vector operands differ in size. SPIR-V allows differing vector sizes, but LLVM's `shufflevector` does not. Remove the assert and insert an additional `shufflevector` to align the vector operands when needed. (cherry picked from commit 3df5e38)
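A small model of the idea, assuming the shorter operand is padded with undefined lanes and the component indices are remapped accordingly. This is a Python sketch with hypothetical names, not the code from the commit; `None` stands in for an undefined lane.

```python
UNDEF = 0xFFFFFFFF  # SPIR-V component literal meaning "no source, result undefined"


def spirv_vector_shuffle(v1, v2, components):
    """Reference semantics: components index the concatenation of v1 and v2."""
    concat = v1 + v2
    return [None if c == UNDEF else concat[c] for c in components]


def shuffle_with_equal_widths(v1, v2, components):
    """Same result, but both operands are first widened to a common length,
    as an LLVM-style shufflevector requires."""
    width = max(len(v1), len(v2))
    a = v1 + [None] * (width - len(v1))  # extra shuffle that pads operand 1
    b = v2 + [None] * (width - len(v2))  # extra shuffle that pads operand 2
    mask = []
    for c in components:
        if c == UNDEF:
            mask.append(None)
        elif c < len(v1):
            mask.append(c)                    # still selects from operand 1
        else:
            mask.append(c - len(v1) + width)  # operand 2 now starts at `width`
    concat = a + b
    return [None if m is None else concat[m] for m in mask]


print(shuffle_with_equal_widths([1, 2], [10, 20, 30, 40], [0, 3, 5]))
```

The only subtle step is the index shift for components that select from the second operand, since padding the first operand moves where the second one starts.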
…ing load types (KhronosGroup#2160) (KhronosGroup#2245)
In some cases, we will see IR with the following:
    @__spirv_BuiltInGlobalInvocationId = external dso_local local_unnamed_addr addrspace(1) constant <3 x i64>, align 32
    ...
    %0 = load <6 x i32>, ptr addrspace(1) @__spirv_BuiltInGlobalInvocationId, align 32
    %1 = extractelement <6 x i32> %0, i64 0
Note the global type and load type are different. Change the handling of vector loads from vector globals to reconstruct the global vector type and then bitcast to the load type.
Thanks to @jcranmer-intel for helping me find the simplest solution.
Co-authored-by: Nick Sarnie <sarnex@users.noreply.github.com>
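To see what a bitcast from <3 x i64> to <6 x i32> means for the element values, here is a rough Python model. It assumes a little-endian layout, and the helper name is hypothetical; it only illustrates the reinterpretation, not the translator change itself.

```python
import struct


def bitcast_3xi64_to_6xi32(values):
    """Reinterpret three 64-bit lanes as six 32-bit lanes (little-endian assumed)."""
    raw = struct.pack("<3q", *values)        # the <3 x i64> global, as raw bytes
    return list(struct.unpack("<6i", raw))   # the same bytes viewed as <6 x i32>


# Each i64 lane becomes a (low word, high word) pair of i32 lanes.
print(bitcast_3xi64_to_6xi32([1, 2, 3]))
```

Under this assumption, extracting element 0 of the <6 x i32> value yields the low 32 bits of the first i64 component.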
…or debug types (KhronosGroup#2341) The OpenCL and NonSemantic DebugInfo specifications are flexible in allowing any debug information to be replaced with DebugInfoNone, so various SPIR-V producers follow that and generate it for the base types of several debug instructions, leaving SPIR-V consumers to handle this. By default the translator replaces missing debug info with tag: null, which is correct in most cases. Yet there are situations where that is allowed by neither LLVM nor DWARF; for example, for DW_TAG_array_type the DWARF spec states that the DW_AT_type attribute is mandatory. For such cases a new transNonNullDebugType wrapper function was added to the translator, generating "DIBasicType(tag: DW_TAG_unspecified_type, name: "SPIRV unknown type")" where DebugInfoNone was used as the type. This function doesn't replace all calls to transDebugInst<DIType>, as there are cases where we can generate a null type; for example, DWARF doesn't require it for DW_TAG_typedef, hence I'm not changing the translation flow in this case. In addition, while DWARF requires a type attribute for DW_TAG_pointer_type, LLVM does not, hence I'm not changing the translation flow in this case either. Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
It should have tested DebugInfoNone base type Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
…sGroup#2440) If a global is initialized with a ConstantExpr returning a pointer (getelementptr), the right type has to be deduced instead of defaulting it to i8*.
…2455) The fix adds support for the IR builtin calls __spirv_IAddCarry and __spirv_ISubBorrow. It is also the first part of a fix that removes the noncompliance of the uadd/sub_with_overflow intrinsics. SPIRVUtil changes are needed to support situations where a builtin doesn't have a corresponding store instruction. Co-authored-by: bwlodarcz <bertrand.wlodarczyk@intel.com>
Changes were cherry-picked from the following commit: KhronosGroup@c6fe12b
Also cherry-picked fixes from: KhronosGroup#2208 KhronosGroup#2192
This change adds SPIR-V translator support for the SPIR-V extension documented here: KhronosGroup/SPIRV-Registry#193. This extension adds one decoration to represent the maximum error for FP operations and adds the related Capability.
SPIRV Headers support for representing this in SPIR-V: KhronosGroup/SPIRV-Headers#363
intel/llvm#8134 added a new call-site attribute associated with FP builtin intrinsics. This attribute is named 'fpbuiltin-max-error'.
The following example shows how this extension is supported in the translator. The input LLVM IR uses new LLVM builtin calls to represent FP operations. An attribute named 'fpbuiltin-max-error' is used to represent the max-error allowed in the FP operation.
Example input LLVM:
    %t6 = call float @llvm.fpbuiltin.sin.f32(float %f1) #2
    attributes #2 = { "fpbuiltin-max-error"="2.5" }
This is translated into a SPIR-V instruction (for add/sub/mul/div/rem) or an OpenCL extended instruction (for other operations). A decoration to represent the max-error is attached to the SPIR-V instruction.
SPIR-V code:
    4 Decorate 97 FPMaxErrorDecorationINTEL 1075838976
    6 ExtInst 2 97 1 sin 88
No new support is added for translating this SPIR-V back to LLVM. Existing support is used. The decoration is translated back into named metadata associated with the LLVM instruction. This can be readily consumed by backends. Based on input from @andykaylor, we emit attributes when the FP operation is translated back to a call to a builtin function and emit metadata otherwise.
Translated LLVM code for basic math functions (add/sub/mul/div/rem):
    %t6 = fmul float %f1, %f2, !fpbuiltin-max-error !7
    !7 = !{!"2.500000"}
Translated LLVM code for other math functions:
    %t6 = call spir_func float @_Z3sinf(float %f1) #3
    attributes #3 = { "fpbuiltin-max-error"="4.000000" }
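The literal 1075838976 in the Decorate line above is consistent with the IEEE-754 binary32 bit pattern of the attribute value 2.5. A quick Python check, assuming that is how the decoration operand is encoded (the helper names are hypothetical):

```python
import struct


def float32_to_word(value):
    """Return the 32-bit word holding the IEEE-754 binary32 encoding of value."""
    return struct.unpack("<I", struct.pack("<f", value))[0]


def word_to_float32(word):
    """Inverse of float32_to_word."""
    return struct.unpack("<f", struct.pack("<I", word))[0]


print(float32_to_word(2.5))          # the decoration literal for max-error 2.5
print(f"{word_to_float32(1075838976):.6f}")  # the string form used in the metadata
```

The six-decimal formatting matches the "2.500000" string shown in the translated metadata.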
…hronosGroup#2463) This PR introduces the CooperativeMatrixPrefetchINTEL capability and operation and adds initial support for them in the llvm-spirv translator. Co-authored-by: Vyacheslav Levytskyy <89994100+VyacheslavLevytskyy@users.noreply.github.com>
…perands for both Source and Target (KhronosGroup#2474) Original change: a384e03
…up#2277) Add support for load/store operations for a cooperative matrix such that the original matrix shape is known and implementations are able to reason about how to handle out-of-bounds accesses.
- CapabilityCooperativeMatrixCheckedInstructionsINTEL = 6192
- CooperativeMatrixLoadCheckedINTEL = 6193
- CooperativeMatrixStoreCheckedINTEL = 6194
…eckedINTEL (KhronosGroup#2331) Add support for checked matrix construct instruction. Specification draft: https://github.com/intel/llvm/blob/2fa153ee852ea3d7d64df097f1f494cddacee90e/sycl/doc/design/spirv-extensions/SPV_INTEL_joint_matrix.asciidoc
…p#2174) (KhronosGroup#2492) This change is basically an update of KhronosGroup#1389 for spec changes. The implementation of the feature was based on an Intel extension that was not officially published to Khronos. Now it has been split, updated, and published to Khronos by KhronosGroup/SPIRV-Registry#205
Summary of the things that have changed:
- Capability names have changed and a new capability was added
- Values for decorations have been updated to enums
- Decoration names and IDs have been changed
Specs:
https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/INTEL/SPV_INTEL_global_variable_fpga_decorations.asciidoc
https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/INTEL/SPV_INTEL_global_variable_host_access.asciidoc
Co-authored-by: Viktoria Maximova <viktoria.maksimova@intel.com>
…k with legacy TypeJointMatrixINTEL (KhronosGroup#2318) (KhronosGroup#2518) This PR is to ensure that CooperativeMatrixLoadCheckedINTEL, CooperativeMatrixStoreCheckedINTEL and CooperativeMatrixConstructCheckedINTEL instructions can correctly accept TypeJointMatrixINTEL as an input during both forward and reverse translation. Co-authored-by: Vyacheslav Levytskyy <89994100+VyacheslavLevytskyy@users.noreply.github.com>
…hronosGroup#2490) (KhronosGroup#2523) Fixed ParentIdx, which was mismatched for the DebugTypeInheritance type in the context of NonSemantic.Shader.DebugInfo.100. That led to a heap buffer overflow in the SPIRVExtInst::getExtOp getter when an instruction was incorrectly cast to SPIRVExtInst as the parent of the culprit instruction in SPIRVToLLVMDbgTran::getDIBuilder. The mismatch happened because in the previously used standard, OpenCL.DebugInfo.100, DebugTypeInheritance had the Child field as its zero-indexed argument. In the newer standard that field is removed.
…2529) (KhronosGroup#2538) When a temporary `OpForward` instruction is needed during translation to SPIR-V, do not add the decorations yet, as that would result in duplicate decorations when the actual instruction is visited and the `OpForward` is replaced by a real SPIR-V instruction. The SPIR-V Validator has recently started checking for duplicate decorations; this fixes some but not all issues arising from the new checks. Contributes to KhronosGroup#2509 (cherry picked from commit a278313)
…2537) (KhronosGroup#2550) The SPIR-V Validator has recently started checking for duplicate decorations. This commit fixes duplicate Alignment decorations that affected the `test/read_image.cl` test. Alignment decorations have two potential sources during LLVM to SPIR-V translation: the instruction's alignment property and `spirv.Decorations` metadata. Handle both of these through the `setAlignment` method, so that duplicates can be avoided. Calling `setAlignment` with different alignments for the same entity is probably an error, so add an assert. Contributes to KhronosGroup#2509 (cherry picked from commit 926ca2a)
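The single-entry-point approach can be sketched as follows. This is a Python model with hypothetical names, not the translator's C++: both alignment sources go through one setter, which stores at most one Alignment decoration and asserts when a second call disagrees with the first.

```python
class Entity:
    """Stand-in for a SPIR-V entity that can carry decorations."""

    def __init__(self):
        self.decorations = {}

    def set_alignment(self, alignment):
        existing = self.decorations.get("Alignment")
        # A second call with a different value is treated as a translator bug.
        assert existing is None or existing == alignment, "conflicting alignments"
        self.decorations["Alignment"] = alignment


var = Entity()
var.set_alignment(16)  # from the instruction's alignment property
var.set_alignment(16)  # from spirv.Decorations metadata; no duplicate is stored
print(var.decorations)
```

Keying the decoration by kind is what rules out duplicates here; the assert only guards against the two sources disagreeing.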
…riable extensions (KhronosGroup#2228) (KhronosGroup#2573) This is done to provide a smooth transition for drivers to the official extensions. Also updated tests for new extensions to use new tokens after KhronosGroup/SPIRV-Headers@cca08c6 Co-authored-by: Viktoria Maximova <viktoria.maksimova@intel.com>
Backport of PR KhronosGroup#3355 into `llvm_release_170`. All commits applied cleanly. Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com> Co-authored-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
…Group#3370) (KhronosGroup#3381) This extension adds predicated load and store instructions. Predicated load performs load from memory if predicate is true; otherwise, it uses default_value as a result. Predicated store performs store of value to memory if predicate is true; otherwise, it does nothing. Spec: intel/llvm#20158 Signed-off-by: Zhang, Yixing <yixing.zhang@intel.com>
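The semantics described above fit in a few lines. This Python sketch models memory as a dictionary and uses hypothetical function names; it only mirrors the behaviour stated in the commit message, not the instruction encoding.

```python
def predicated_load(memory, address, predicate, default_value):
    """Load from memory if predicate is true; otherwise return default_value."""
    return memory[address] if predicate else default_value


def predicated_store(memory, address, value, predicate):
    """Store value to memory if predicate is true; otherwise do nothing."""
    if predicate:
        memory[address] = value


mem = {0: 7}
print(predicated_load(mem, 0, True, -1))   # reads memory
print(predicated_load(mem, 1, False, -1))  # address 1 is never touched
predicated_store(mem, 1, 42, False)        # no effect
print(mem)
```

Note that in this model a false predicate means the address is not accessed at all, which is why the second load does not fail on the missing key.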
…builtin ptr arg type. (KhronosGroup#3390) (KhronosGroup#3396) Add missing case so that pointer argument types for `OpAtomicCompareExchangeWeak` are resolved correctly. Without this patch, the newly added test would introduce an invalid bitcast just before the atomic builtin.
…3424) This continues KhronosGroup#3343 and reflects specification update, including extension renaming. Specification: intel/llvm#20009
…R calls (KhronosGroup#3398) Map all cooperative matrix type conversions to SPIR-V friendly IR calls, regardless of the environment specified. In particular, do not attempt to map such conversions to the OpenCL `convert` builtin. The SPIR-V TargetExtType is already encoded in the function suffix, so the previous translation was an odd hybrid between OpenCL and SPIR-V friendly IR. (cherry picked from commit 9d56f01)
…#3374) The way they are implemented is described in: KhronosGroup#3221 The PR also adds the SPV_EXT_float8 extension and packed conversions for SPV_INTEL_int4. Currently only conversion instructions (and internal builtins) are covered. TODO: the Saturation decoration will be added in a following PR. Signed-off-by: Sidorov, Dmitry <dmitry.sidorov@intel.com>
…ons extensions (KhronosGroup#3419) As well as their appropriate conversions via __builtin_spirv mechanism. Specification: intel/llvm#20467 Signed-off-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
…hronosGroup#3488) The extension adds support for the `OpFmaKHR` instruction, which provides a native SPIR-V instruction for fused multiply-add operations as an alternative to using OpenCL.std::Fma extended instruction. Translate both LLVM fma intrinsics as well as OCL builtins to `OpFmaKHR` if the extension is available. Specification: https://github.khronos.org/SPIRV-Registry/extensions/KHR/SPV_KHR_fma.html
…KhronosGroup#3504) Previous check in 199d2e0 translated __spirv_ocl_fmax to FMA. (cherry picked from commit 64b7a07)
KhronosGroup#3512) Instead of the hardcoded `-j2`, use the number of available processing units. Fixes KhronosGroup#3481 (cherry picked from commit f20a37d)
Changed in llvm-project commit f0fa2d7c292853b79b5bcd16be97940859a800ec
Changed in llvm-project commits:
- bb7feae218745c666718b7a16b1f57d0d2165cf1
- 3b01fa264c36c1c7bb293f001579e0f459a92b84
- 61e1c3d80db6e94e8b5b83b3819afefeec4d357b
Changed in llvm-project commit ee19fabc984747b0ce971d1d47662d89b63fa0ab
…d lib" This reverts commit 2fc1351.
This reverts commit 0617edd.
…ositeConstruct"
* Fix reverse translation of non-constant values of OpCompositeConstruct (KhronosGroup#2256)
  This patch introduces a way to use runtime values for structure fields.
  Co-authored-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
* Fix reverse translation of non-constant values of OpCompositeConstruct pt.2 (KhronosGroup#2296)
  This patch introduces a way to use runtime values for array and vector types. It continues KhronosGroup#2256
---------
Co-authored-by: Viktoria Maximova <viktoria.maksimova@intel.com>
Co-authored-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
…loat16ArithmeticINTEL` capability (KhronosGroup#3577) Backport of PR KhronosGroup#3567 into `llvm_release_170`. All commits applied cleanly. Co-authored-by: Viktoria Maximova <viktoria.maksimova@intel.com>
…upMatrixMultiplyAccumulateINTEL (KhronosGroup#3609) (KhronosGroup#3631)
Extend SubgroupMatrixMultiplyAccumulateINTEL to support packed 4-bit and 8-bit floating-point matrix operands by implementing extensions:
- SPV_INTEL_subgroup_matrix_multiply_accumulate_float4
- SPV_INTEL_subgroup_matrix_multiply_accumulate_float8
These extensions add operand flags that interpret packed integer data as FP4/FP8 without requiring actual FP4/FP8 type support added by SPV_INTEL_float4 or SPV_EXT_float8.
FP4 operands:
- `MatrixAPackedFloat4E2M1INTEL` (0x40000) / `MatrixBPackedFloat4E2M1INTEL` (0x80000)
FP8 operands:
- `MatrixAPackedFloat8E4M3INTEL` (0x4000) / `MatrixBPackedFloat8E4M3INTEL` (0x8000)
- `MatrixAPackedFloat8E5M2INTEL` (0x10000) / `MatrixBPackedFloat8E5M2INTEL` (0x20000)
Specs:
https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_subgroup_matrix_multiply_accumulate_float4.asciidoc
https://github.com/intel/llvm/blob/sycl/sycl/doc/design/spirv-extensions/SPV_INTEL_subgroup_matrix_multiply_accumulate_float8.asciidoc
Co-authored-by: Viktoria Maximova <viktoria.maksimova@intel.com>
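The operand flags listed above occupy distinct bits, so they can be combined with a bitwise OR. A small Python sketch using the names and values from the commit message; the `PackedFloatOperands` class itself is hypothetical and exists only for illustration.

```python
from enum import IntFlag


class PackedFloatOperands(IntFlag):
    """Operand flag values as listed in the commit message."""
    MatrixAPackedFloat8E4M3INTEL = 0x4000
    MatrixBPackedFloat8E4M3INTEL = 0x8000
    MatrixAPackedFloat8E5M2INTEL = 0x10000
    MatrixBPackedFloat8E5M2INTEL = 0x20000
    MatrixAPackedFloat4E2M1INTEL = 0x40000
    MatrixBPackedFloat4E2M1INTEL = 0x80000


# Both A and B interpreted as packed FP4 (E2M1) data.
operands = (PackedFloatOperands.MatrixAPackedFloat4E2M1INTEL
            | PackedFloatOperands.MatrixBPackedFloat4E2M1INTEL)
print(hex(int(operands)))
```

A and B flags are independent, so a mixed combination such as FP8 E4M3 for A and FP8 E5M2 for B is expressible as a mask in the same way; whether a given combination is valid is up to the extension specs.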
…e upon mismatch (KhronosGroup#3255) (KhronosGroup#3612)
The source element type used in a GEP may differ from the actual type of the pointer operand (e.g., ptr i8 vs. ptr [N x T]). This mismatch can lead to incorrect address computations during translation to SPIR-V of a GEP used in a constexpr context, which requires that pointer types match the type of the object being accessed.
This patch inserts an explicit bitcast to convert the GEP pointer operand to the expected type, derived from the GEP’s source element type, before emitting a PtrAccessChain. This ensures the resulting SPIR-V instruction has a correctly typed base pointer and produces valid indexing behavior.
For example, before this change, the following GEP was translated incorrectly:
    getelementptr(i8, ptr addrspace(1) @a_var, i64 2)
Whereas this nearly equivalent GEP was handled correctly:
    getelementptr inbounds ([2 x i8], ptr @a_var, i64 0, i64 1)
Previously, the first form was incorrectly interpreted as:
    getelementptr inbounds ([2 x i8], ptr @a_var, i64 0, i64 2)
(cherry picked from commit 1be9366)
Co-authored-by: Karol Zwolak <karolzwolak7@gmail.com>
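Why the source element type matters can be shown with plain arithmetic. The Python sketch below is only an illustration under assumed types: the i32 case is hypothetical and not taken from the commit, and the helper covers only a single-index GEP.

```python
ELEMENT_SIZE_BYTES = {"i8": 1, "i32": 4}  # sizes assumed for this illustration


def single_index_gep_offset(source_element_type, index):
    """Byte offset of a one-index GEP: index scaled by the source element size."""
    return ELEMENT_SIZE_BYTES[source_element_type] * index


# getelementptr(i8, ptr @var, i64 2) addresses byte 2 of the object.
print(single_index_gep_offset("i8", 2))

# If the same index were applied to a base pointer typed as pointing to i32
# elements, it would address byte 8 instead.
print(single_index_gep_offset("i32", 2))
```

Casting the base pointer to the GEP's source element type before indexing, as the patch describes, keeps the index and the element size in agreement.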
…anslation (KhronosGroup#3524) Co-authored-by: Dmitry Sidorov <dmitry.sidorov@intel.com>
…KhronosGroup#3542) Backport of PR KhronosGroup#3529 into `llvm_release_170`. All commits applied cleanly. Co-authored-by: Viktoria Maximova <viktoria.maksimova@intel.com>
…o 1-length arrays in SPIR-V (KhronosGroup#3546) Backport of PR KhronosGroup#2743 into `llvm_release_170`. All commits applied cleanly. Co-authored-by: Lorenc Bushi <lorenc.bushi@intel.com>
This reverts commit 310c785. Previously reverted before implementing fpclass llvm intrinsic lowering.
…td lib" This reverts commit 4619248. Previously reverted before implementing fpclass llvm intrinsic lowering.
…f OpCooperativeMatrixPrefetchINTEL" (KhronosGroup#2560) (KhronosGroup#3637) Original PR: KhronosGroup#2560
…hmetic instructions (KhronosGroup#3624) Backport of PRs: KhronosGroup#2117 KhronosGroup#2156 KhronosGroup#2165 KhronosGroup#2166 to llvm_release_170 authored by @vmaksimo
Original PR: #2560