* Extract AVXVNNI from SapphireRapids
* Add AMD Zen5 target
* Enhance AVX_VNNI support in x86 code generation and tests
* Enhance Zen3/4 identification
* Add AVXVNNI feature flag for Zen5 and SP target completion
* Adjust Zen5 target return value based on LLVM version
* Add Zen4/5 tuning options for python bindings
* Update LLVM version check for Zen4/5
@@ -1360,12 +1360,13 @@
 halide_target_feature_no_asserts, ///< Disable all runtime checks, for slightly tighter code.
 halide_target_feature_no_bounds_query, ///< Disable the bounds querying functionality.

-halide_target_feature_sse41, ///< Use SSE 4.1 and earlier instructions. Only relevant on x86.
-halide_target_feature_avx, ///< Use AVX 1 instructions. Only relevant on x86.
-halide_target_feature_avx2, ///< Use AVX 2 instructions. Only relevant on x86.
-halide_target_feature_fma, ///< Enable x86 FMA instruction
-halide_target_feature_fma4, ///< Enable x86 (AMD) FMA4 instruction set
-halide_target_feature_f16c, ///< Enable x86 16-bit float support
+halide_target_feature_sse41, ///< Use SSE 4.1 and earlier instructions. Only relevant on x86.
+halide_target_feature_avx, ///< Use AVX 1 instructions. Only relevant on x86.
+halide_target_feature_avx2, ///< Use AVX 2 instructions. Only relevant on x86.
+halide_target_feature_avxvnni, ///< Enable the AVX-VNNI features supported by AVX2 instructions. Supports 256-bit VNNI instructions without EVEX encoding.
+halide_target_feature_fma, ///< Enable x86 FMA instruction
+halide_target_feature_fma4, ///< Enable x86 (AMD) FMA4 instruction set
+halide_target_feature_f16c, ///< Enable x86 16-bit float support

 halide_target_feature_armv7s, ///< Generate code for ARMv7s. Only relevant for 32-bit ARM.
 halide_target_feature_no_neon, ///< Avoid using NEON instructions. Only relevant for 32-bit ARM.
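The change above places `halide_target_feature_avxvnni` in the middle of the enum rather than at the end. In C, enumerators without explicit values are numbered sequentially, so every member declared after the insertion point moves up by one. The sketch below illustrates that effect with two hypothetical stand-in enums (`old_feature`, `new_feature`) and a hypothetical helper (`feature_shift`). The values start at 0 for illustration only and are not the real values in the header, which has earlier members.

```c
#include <assert.h>

/* Hypothetical stand-in for the old ordering (the removed lines).
 * Values start at 0 here; the real enum has earlier members. */
enum old_feature {
    OLD_SSE41,
    OLD_AVX,
    OLD_AVX2,
    OLD_FMA,
    OLD_FMA4,
    OLD_F16C
};

/* Hypothetical stand-in for the new ordering (the added lines). */
enum new_feature {
    NEW_SSE41,
    NEW_AVX,
    NEW_AVX2,
    NEW_AVXVNNI, /* the inserted member */
    NEW_FMA,
    NEW_FMA4,
    NEW_F16C
};

/* How far a member moved between the two layouts. */
static int feature_shift(int old_value, int new_value) {
    return new_value - old_value;
}
```

In this sketch, `feature_shift(OLD_AVX2, NEW_AVX2)` is 0 and `feature_shift(OLD_FMA, NEW_FMA)` is 1. Code that refers to the enumerators by name picks up the new numbering on recompilation. Anything that stored or exchanged the raw integers against the old layout would presumably need rebuilding.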
@@ -1409,6 +1410,7 @@
 halide_target_feature_avx512_skylake, ///< Enable the AVX512 features supported by Skylake Xeon server processors. This adds AVX512-VL, AVX512-BW, and AVX512-DQ to the base set. The main difference from the base AVX512 set is better support for small integer ops. Note that this does not include the Knight's Landing features. Note also that these features are not available on Skylake desktop and mobile processors.
 halide_target_feature_avx512_cannonlake, ///< Enable the AVX512 features expected to be supported by future Cannonlake processors. This includes all of the Skylake features, plus AVX512-IFMA and AVX512-VBMI.
 halide_target_feature_avx512_zen4, ///< Enable the AVX512 features supported by Zen4 processors. This include all of the Cannonlake features, plus AVX512-VNNI, AVX512-BF16, and more.
+halide_target_feature_avx512_zen5, ///< Enable the AVX512 features supported by Zen5 processors. This include all of the Cannonlake features, plus AVX512-VNNI, AVX512-BF16, AVX-VNNI and more.
 halide_target_feature_avx512_sapphirerapids, ///< Enable the AVX512 features supported by Sapphire Rapids processors. This include all of the Zen4 features, plus AVX-VNNI and AMX instructions.
 halide_target_feature_trace_loads, ///< Trace all loads done by the pipeline. Equivalent to calling Func::trace_loads on every non-inlined Func.
 halide_target_feature_trace_stores, ///< Trace all stores done by the pipeline. Equivalent to calling Func::trace_stores on every non-inlined Func.
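The doc comments on the AVX512 targets describe each one as an earlier target plus a few named extensions. As a reading aid, the sketch below models only the extensions those comments name, using bitmasks. Every identifier in it (`ISA_*`, `SET_*`, `set_has`) is hypothetical and is not part of Halide. The comments' "and more" is left unmodelled, so the sets are lower bounds rather than a full feature list.

```c
#include <assert.h>

/* Hypothetical bit flags, one per extension named in the comments.
 * The Cannonlake feature group is treated as a single opaque bit. */
enum {
    ISA_AVX512_CANNONLAKE = 1 << 0,
    ISA_AVX512_VNNI       = 1 << 1,
    ISA_AVX512_BF16       = 1 << 2,
    ISA_AVX_VNNI          = 1 << 3,
    ISA_AMX               = 1 << 4
};

/* Hypothetical sets, transcribed from the doc comments only. */
enum {
    /* "all of the Cannonlake features, plus AVX512-VNNI, AVX512-BF16" */
    SET_ZEN4 = ISA_AVX512_CANNONLAKE | ISA_AVX512_VNNI | ISA_AVX512_BF16,
    /* "... plus AVX512-VNNI, AVX512-BF16, AVX-VNNI" */
    SET_ZEN5 = ISA_AVX512_CANNONLAKE | ISA_AVX512_VNNI | ISA_AVX512_BF16 |
               ISA_AVX_VNNI,
    /* "all of the Zen4 features, plus AVX-VNNI and AMX" */
    SET_SAPPHIRERAPIDS = SET_ZEN4 | ISA_AVX_VNNI | ISA_AMX
};

/* Returns 1 if every bit of `isa` is present in `set`, else 0. */
static int set_has(unsigned set, unsigned isa) {
    return (set & isa) == isa;
}
```

Under this model, `set_has(SET_ZEN5, ISA_AVX_VNNI)` and `set_has(SET_SAPPHIRERAPIDS, ISA_AVX_VNNI)` are both 1, while `set_has(SET_ZEN4, ISA_AVX_VNNI)` is 0. That matches the commit message's stated aim of extracting AVXVNNI from SapphireRapids so that Zen5 can share it.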