Commit 648fc67

aarch64: Add support for SVE_B16B16
This patch adds support for the SVE_B16B16 extension, which provides
non-widening BF16 versions of existing instructions. Mostly it's just
a simple extension of iterators. The main complications are:

(1) The new instructions have no immediate forms. This is easy to
    handle for the cond_* patterns (the ones that have an explicit
    else value) since those are already divided into register and
    non-register versions. All we need to do is tighten the predicates.

    However, the @aarch64_pred_<optab><mode> patterns handle the
    immediates directly. Rather than complicate them further, it
    seemed best to add a single @aarch64_pred_<optab><mode> for
    all BF16 arithmetic.

(2) There is no BFSUBR, so the usual method of handling reversed
    operands breaks down. The patch deals with this using some new
    attributes that together disable the "BFSUBR" alternative.

(3) Similarly, there are no BFMAD or BFMSB instructions, so we need
    to disable those forms in the BFMLA and BFMLS patterns.

The patch includes support for generic bf16 vectors too.

It would be possible to use these instructions for scalars, as with
the recent FLOGB patch, but that's left as future work.

gcc/
	* config/aarch64/aarch64-option-extensions.def (sve-b16b16): New
	extension.
	* doc/invoke.texi: Document it.
	* config/aarch64/aarch64.h (TARGET_SME_B16B16, TARGET_SVE2_OR_SME2)
	(TARGET_SSVE_B16B16): New macros.
	* config/aarch64/aarch64-c.cc (aarch64_update_cpp_builtins):
	Conditionally define __ARM_FEATURE_SVE_B16B16.
	* config/aarch64/aarch64-sve-builtins-sve2.def: Add AARCH64_FL_SVE2
	to the SVE2p1 requirements. Add SVE_B16B16 forms of existing
	intrinsics.
	* config/aarch64/aarch64-sve-builtins.cc (type_suffixes): Treat
	bfloat as a floating-point type.
	(TYPES_h_bfloat): New macro.
	* config/aarch64/aarch64.md (is_bf16, is_rev, supports_bf16_rev)
	(mode_enabled): New attributes.
	(enabled): Test mode_enabled.
	* config/aarch64/iterators.md (SVE_FULL_F_BF): New mode iterator.
	(SVE_CLAMP_F): Likewise.
	(SVE_Fx24): Add BF16 modes when TARGET_SSVE_B16B16.
	(sve_lane_con): Handle BF16 modes.
	(b): Handle SF and DF modes.
	(is_bf16): New mode attribute.
	(supports_bf16, supports_bf16_rev): New int attributes.
	* config/aarch64/predicates.md
	(aarch64_sve_float_maxmin_immediate): Reject BF16 modes.
	* config/aarch64/aarch64-sve.md
	(*post_ra_<sve_fp_op><mode>3): Add BF16 support, and likewise
	for the associated define_split.
	(<optab:SVE_COND_FP_BINARY_OPTAB><mode>): Add BF16 support.
	(@cond_<optab:SVE_COND_FP_BINARY><mode>): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_2_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_2_strict): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_3_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_3_strict): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_any_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_BINARY><mode>_any_strict): Likewise.
	(@aarch64_mul_lane_<mode>): Likewise.
	(<optab:SVE_COND_FP_TERNARY><mode>): Likewise.
	(@aarch64_pred_<optab:SVE_COND_FP_TERNARY><mode>): Likewise.
	(@cond_<optab:SVE_COND_FP_TERNARY><mode>): Likewise.
	(*cond_<optab:SVE_COND_FP_TERNARY><mode>_4_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_TERNARY><mode>_4_strict): Likewise.
	(*cond_<optab:SVE_COND_FP_TERNARY><mode>_any_relaxed): Likewise.
	(*cond_<optab:SVE_COND_FP_TERNARY><mode>_any_strict): Likewise.
	(@aarch64_<optab:SVE_FP_TERNARY_LANE>_lane_<mode>): Likewise.
	* config/aarch64/aarch64-sve2.md
	(@aarch64_pred_<optab:SVE_COND_FP_BINARY><mode>): Define BF16 version.
	(@aarch64_sve_fclamp<mode>): Add BF16 support.
	(*aarch64_sve_fclamp<mode>_x): Likewise.
	(*aarch64_sve_<maxmin_uns_op><SVE_Fx24:mode>): Likewise.
	(*aarch64_sve_single_<maxmin_uns_op><SVE_Fx24:mode>): Likewise.
	* config/aarch64/aarch64.cc (aarch64_sve_float_arith_immediate_p)
	(aarch64_sve_float_mul_immediate_p): Return false for BF16 modes.

gcc/testsuite/
	* lib/target-supports.exp: Test the assembler for sve-b16b16 support.
	* gcc.target/aarch64/pragma_cpp_predefs_4.c: Test the new B16B16
	macros.
	* gcc.target/aarch64/sve/fmad_1.c: Test bfloat16 too.
	* gcc.target/aarch64/sve/fmla_1.c: Likewise.
	* gcc.target/aarch64/sve/fmls_1.c: Likewise.
	* gcc.target/aarch64/sve/fmsb_1.c: Likewise.
	* gcc.target/aarch64/sve/cond_mla_9.c: New test.
	* gcc.target/aarch64/sme2/acle-asm/clamp_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/clamp_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/max_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/max_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/maxnm_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/maxnm_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/min_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/min_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/minnm_bf16_x2.c: Likewise.
	* gcc.target/aarch64/sme2/acle-asm/minnm_bf16_x4.c: Likewise.
	* gcc.target/aarch64/sve/bf16_arith_1.c: Likewise.
	* gcc.target/aarch64/sve/bf16_arith_1.h: Likewise.
	* gcc.target/aarch64/sve/bf16_arith_2.c: Likewise.
	* gcc.target/aarch64/sve/bf16_arith_3.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/add_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/clamp_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/max_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/maxnm_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/min_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/minnm_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mla_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mla_lane_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mls_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mls_lane_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mul_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/mul_lane_bf16.c: Likewise.
	* gcc.target/aarch64/sve2/acle/asm/sub_bf16.c: Likewise.

1 parent 164fbe0 commit 648fc67

46 files changed, +5864 -209 lines changed

gcc/config/aarch64/aarch64-c.cc

Lines changed: 3 additions & 0 deletions
@@ -208,6 +208,9 @@ aarch64_update_cpp_builtins (cpp_reader *pfile)
                         "__ARM_FEATURE_SVE_MATMUL_FP32", pfile);
   aarch64_def_or_undef (TARGET_SVE_F64MM,
                         "__ARM_FEATURE_SVE_MATMUL_FP64", pfile);
+  aarch64_def_or_undef (AARCH64_HAVE_ISA (SVE_B16B16)
+                        && (TARGET_SVE2 || TARGET_SME2),
+                        "__ARM_FEATURE_SVE_B16B16", pfile);
   aarch64_def_or_undef (TARGET_SVE2, "__ARM_FEATURE_SVE2", pfile);
   aarch64_def_or_undef (TARGET_SVE2_AES, "__ARM_FEATURE_SVE2_AES", pfile);
   aarch64_def_or_undef (TARGET_SVE2_BITPERM,

gcc/config/aarch64/aarch64-option-extensions.def

Lines changed: 3 additions & 0 deletions
@@ -165,6 +165,9 @@ AARCH64_FMV_FEATURE("rpres", RPRES, ())
 
 AARCH64_OPT_FMV_EXTENSION("sve", SVE, (SIMD, F16), (), (), "sve")
 
+/* This specifically does not imply +sve. */
+AARCH64_OPT_EXTENSION("sve-b16b16", SVE_B16B16, (), (), (), "")
+
 AARCH64_OPT_EXTENSION("f32mm", F32MM, (SVE), (), (), "f32mm")
 
 AARCH64_FMV_FEATURE("f32mm", SVE_F32MM, (F32MM))

gcc/config/aarch64/aarch64-sve-builtins-sve2.def

Lines changed: 27 additions & 0 deletions
@@ -335,3 +335,30 @@ DEF_SVE_FUNCTION_GS (svzipq, unaryxn, all_data, x24, none)
 DEF_SVE_FUNCTION (svamax, binary_opt_single_n, all_float, mxz)
 DEF_SVE_FUNCTION (svamin, binary_opt_single_n, all_float, mxz)
 #undef REQUIRED_EXTENSIONS
+
+#define REQUIRED_EXTENSIONS \
+  sve_and_sme (AARCH64_FL_SVE2 | AARCH64_FL_SVE_B16B16, \
+               AARCH64_FL_SME2 | AARCH64_FL_SVE_B16B16)
+DEF_SVE_FUNCTION (svadd, binary_opt_n, h_bfloat, mxz)
+DEF_SVE_FUNCTION (svclamp, clamp, h_bfloat, none)
+DEF_SVE_FUNCTION (svmax, binary_opt_single_n, h_bfloat, mxz)
+DEF_SVE_FUNCTION (svmaxnm, binary_opt_single_n, h_bfloat, mxz)
+DEF_SVE_FUNCTION (svmla, ternary_opt_n, h_bfloat, mxz)
+DEF_SVE_FUNCTION (svmla_lane, ternary_lane, h_bfloat, none)
+DEF_SVE_FUNCTION (svmls, ternary_opt_n, h_bfloat, mxz)
+DEF_SVE_FUNCTION (svmls_lane, ternary_lane, h_bfloat, none)
+DEF_SVE_FUNCTION (svmin, binary_opt_single_n, h_bfloat, mxz)
+DEF_SVE_FUNCTION (svminnm, binary_opt_single_n, h_bfloat, mxz)
+DEF_SVE_FUNCTION (svmul, binary_opt_n, h_bfloat, mxz)
+DEF_SVE_FUNCTION (svmul_lane, binary_lane, h_bfloat, none)
+DEF_SVE_FUNCTION (svsub, binary_opt_n, h_bfloat, mxz)
+#undef REQUIRED_EXTENSIONS
+
+#define REQUIRED_EXTENSIONS \
+  streaming_only (AARCH64_FL_SME2 | AARCH64_FL_SVE_B16B16)
+DEF_SVE_FUNCTION_GS (svclamp, clamp, h_bfloat, x24, none)
+DEF_SVE_FUNCTION_GS (svmax, binary_opt_single_n, h_bfloat, x24, none)
+DEF_SVE_FUNCTION_GS (svmaxnm, binary_opt_single_n, h_bfloat, x24, none)
+DEF_SVE_FUNCTION_GS (svmin, binary_opt_single_n, h_bfloat, x24, none)
+DEF_SVE_FUNCTION_GS (svminnm, binary_opt_single_n, h_bfloat, x24, none)
+#undef REQUIRED_EXTENSIONS

gcc/config/aarch64/aarch64-sve-builtins.cc

Lines changed: 6 additions & 1 deletion
@@ -139,7 +139,7 @@ CONSTEXPR const type_suffix_info type_suffixes[NUM_TYPE_SUFFIXES + 1] = {
     BITS / BITS_PER_UNIT, \
     TYPE_##CLASS == TYPE_signed || TYPE_##CLASS == TYPE_unsigned, \
     TYPE_##CLASS == TYPE_unsigned, \
-    TYPE_##CLASS == TYPE_float, \
+    TYPE_##CLASS == TYPE_float || TYPE_##CLASS == TYPE_bfloat, \
     TYPE_##CLASS != TYPE_bool, \
     TYPE_##CLASS == TYPE_bool, \
     false, \
@@ -292,6 +292,10 @@ CONSTEXPR const group_suffix_info group_suffixes[] = {
   D (s16, s8), D (s32, s16), D (s64, s32), \
   D (u16, u8), D (u32, u16), D (u64, u32)
 
+/* _bf16. */
+#define TYPES_h_bfloat(S, D) \
+  S (bf16)
+
 /* _s16
    _u16. */
 #define TYPES_h_integer(S, D) \
@@ -739,6 +743,7 @@ DEF_SVE_TYPES_ARRAY (bhs_integer);
 DEF_SVE_TYPES_ARRAY (bhs_data);
 DEF_SVE_TYPES_ARRAY (bhs_widen);
 DEF_SVE_TYPES_ARRAY (c);
+DEF_SVE_TYPES_ARRAY (h_bfloat);
 DEF_SVE_TYPES_ARRAY (h_integer);
 DEF_SVE_TYPES_ARRAY (hs_signed);
 DEF_SVE_TYPES_ARRAY (hs_integer);
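
The first hunk in this file changes how a table entry's floating-point flag is derived from its type class, so that bfloat suffixes count as floating-point. As a rough, self-contained illustration of that X-macro technique, here is a sketch with hypothetical names throughout (`suffix_info`, `FOR_EACH_SUFFIX`, `MAKE_ENTRY`, `find_suffix`). It does not reproduce GCC's actual `type_suffix_info` layout, which has more fields.

```cpp
#include <cstring>

// Hypothetical type classes, loosely modelled on the TYPE_##CLASS idiom.
enum type_class
{
  TYPE_bool,
  TYPE_bfloat,
  TYPE_float,
  TYPE_signed,
  TYPE_unsigned
};

// Simplified stand-in for a type-suffix table entry.
struct suffix_info
{
  const char *name;
  bool integer_p;
  bool unsigned_p;
  bool float_p;
};

// Hypothetical list of suffixes: each line gives a name and a class.
#define FOR_EACH_SUFFIX(X) \
  X (b8, bool) \
  X (bf16, bfloat) \
  X (f16, float) \
  X (s16, signed) \
  X (u16, unsigned)

// Derive every flag from the class.  The last field mirrors the change
// in the hunk above: bfloat is treated as floating-point as well.
#define MAKE_ENTRY(NAME, CLASS) \
  { #NAME, \
    TYPE_##CLASS == TYPE_signed || TYPE_##CLASS == TYPE_unsigned, \
    TYPE_##CLASS == TYPE_unsigned, \
    TYPE_##CLASS == TYPE_float || TYPE_##CLASS == TYPE_bfloat },

constexpr suffix_info suffixes[] = { FOR_EACH_SUFFIX (MAKE_ENTRY) };

// Linear lookup by suffix name; returns nullptr if the name is unknown.
inline const suffix_info *find_suffix (const char *name)
{
  for (const suffix_info &info : suffixes)
    if (std::strcmp (info.name, name) == 0)
      return &info;
  return nullptr;
}
```

With this sketch, `find_suffix ("bf16")->float_p` is true, whereas it would be false if the last field tested `TYPE_float` alone. Deriving the flags from the class keeps every suffix definition to a single line.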
