Commit f209132

xen0n authored and chenhuacai committed
raid6: Add LoongArch SIMD recovery implementation

Similar to the syndrome calculation, the recovery algorithms also work on 64 bytes at a time to align with the L1 cache line size of current and future LoongArch cores (that we care about). This means unrolled-by-4 LSX and unrolled-by-2 LASX code.

The assembly is originally based on the x86 SSSE3/AVX2 ports, but register allocation has been redone to take advantage of LSX/LASX's 32 vector registers, and the instruction sequence has been optimized to suit (e.g. LoongArch can perform per-byte srl and andi on vectors, but x86 cannot).

Performance numbers measured by instrumenting the raid6test code, on a 3A5000 system clocked at 2.5GHz:

> lasx 2data: 354.987 MiB/s
> lasx datap: 350.430 MiB/s
> lsx 2data: 340.026 MiB/s
> lsx datap: 337.318 MiB/s
> intx1 2data: 164.280 MiB/s
> intx1 datap: 187.966 MiB/s

Because recovery algorithms are chosen solely based on priority and availability, lasx is marked as priority 2 and lsx as priority 1. At least for the current generation of LoongArch micro-architectures, LASX should always be faster than LSX whenever supported, and have similar power consumption characteristics (because the only known LASX-capable uarch, the LA464, always computes the full 256-bit result for vector ops).

Acked-by: Song Liu <[email protected]>
Signed-off-by: WANG Xuerui <[email protected]>
Signed-off-by: Huacai Chen <[email protected]>
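
The remark about per-byte srl and andi most plausibly refers to the split-nibble table lookup commonly used to multiply bytes by a constant in GF(2^8); the commit text does not spell this out, so the following is only a scalar sketch of that general technique, not the LSX/LASX assembly from this patch. The helper names (`gf_mul`, `build_nibble_tables`, `gf_mul_nibbles`, `count_mismatches`) are hypothetical, and the field polynomial 0x11d is the one conventionally used for RAID-6.

```c
#include <stdint.h>

/* Reference multiply in GF(2^8) with the RAID-6 polynomial 0x11d. */
static uint8_t gf_mul(uint8_t a, uint8_t b)
{
	uint8_t r = 0;

	while (b) {
		if (b & 1)
			r ^= a;
		/* Multiply a by x, reducing modulo x^8 + x^4 + x^3 + x^2 + 1. */
		a = (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1d : 0));
		b >>= 1;
	}
	return r;
}

/* Hypothetical helper: build two 16-entry tables for the constant c. */
static void build_nibble_tables(uint8_t c, uint8_t lo[16], uint8_t hi[16])
{
	for (int i = 0; i < 16; i++) {
		lo[i] = gf_mul(c, (uint8_t)i);        /* c * low nibble */
		hi[i] = gf_mul(c, (uint8_t)(i << 4)); /* c * high nibble */
	}
}

/*
 * Scalar stand-in for one vector step: mask the low nibble (andi),
 * shift out the high nibble (srl), look both up, and XOR the results.
 */
static uint8_t gf_mul_nibbles(const uint8_t lo[16], const uint8_t hi[16],
			      uint8_t b)
{
	return lo[b & 0x0f] ^ hi[b >> 4];
}

/* Count input bytes where the table method disagrees with gf_mul(). */
static int count_mismatches(uint8_t c)
{
	uint8_t lo[16], hi[16];
	int bad = 0;

	build_nibble_tables(c, lo, hi);
	for (int b = 0; b < 256; b++)
		if (gf_mul_nibbles(lo, hi, (uint8_t)b) != gf_mul(c, (uint8_t)b))
			bad++;
	return bad;
}
```

The split works because multiplication in GF(2^8) distributes over XOR, so a byte can be handled as the XOR of its low and high nibble contributions. A vector unit that can shift and mask each byte lane directly needs no extra instructions to isolate the two nibbles.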
1 parent 8f3f06d commit f209132

5 files changed, 525 additions and 2 deletions
include/linux/raid/pq.h

Lines changed: 2 additions & 0 deletions
@@ -125,6 +125,8 @@ extern const struct raid6_recov_calls raid6_recov_avx2;
 extern const struct raid6_recov_calls raid6_recov_avx512;
 extern const struct raid6_recov_calls raid6_recov_s390xc;
 extern const struct raid6_recov_calls raid6_recov_neon;
+extern const struct raid6_recov_calls raid6_recov_lsx;
+extern const struct raid6_recov_calls raid6_recov_lasx;
 
 extern const struct raid6_calls raid6_neonx1;
 extern const struct raid6_calls raid6_neonx2;

lib/raid6/Makefile

Lines changed: 1 addition & 1 deletion
@@ -9,7 +9,7 @@ raid6_pq-$(CONFIG_ALTIVEC) += altivec1.o altivec2.o altivec4.o altivec8.o \
 vpermxor1.o vpermxor2.o vpermxor4.o vpermxor8.o
 raid6_pq-$(CONFIG_KERNEL_MODE_NEON) += neon.o neon1.o neon2.o neon4.o neon8.o recov_neon.o recov_neon_inner.o
 raid6_pq-$(CONFIG_S390) += s390vx8.o recov_s390xc.o
-raid6_pq-$(CONFIG_LOONGARCH) += loongarch_simd.o
+raid6_pq-$(CONFIG_LOONGARCH) += loongarch_simd.o recov_loongarch_simd.o
 
 hostprogs += mktables

lib/raid6/algos.c

Lines changed: 8 additions & 0 deletions
@@ -111,6 +111,14 @@ const struct raid6_recov_calls *const raid6_recov_algos[] = {
 #endif
 #if defined(CONFIG_KERNEL_MODE_NEON)
 	&raid6_recov_neon,
+#endif
+#ifdef CONFIG_LOONGARCH
+#ifdef CONFIG_CPU_HAS_LASX
+	&raid6_recov_lasx,
+#endif
+#ifdef CONFIG_CPU_HAS_LSX
+	&raid6_recov_lsx,
+#endif
 #endif
 	&raid6_recov_intx1,
 	NULL
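
The commit message says recovery algorithms are chosen solely by priority and availability. A simplified model of that selection over a NULL-terminated table like the one above might look as follows. This is a sketch under stated assumptions, not the kernel's code: `struct recov_algo`, `choose_recov`, and the `cpu_has_*` flags are hypothetical names, and the priority of 0 for intx1 is assumed here (only lasx = 2 and lsx = 1 come from the commit message).

```c
#include <stddef.h>

/* Hypothetical, trimmed-down stand-in for a recovery algorithm entry. */
struct recov_algo {
	const char *name;
	int priority;
	int (*valid)(void);	/* NULL means "always usable" */
};

/* Hypothetical runtime feature flags, toggled here for illustration. */
static int cpu_has_lsx = 1;
static int cpu_has_lasx = 1;

static int lsx_valid(void)  { return cpu_has_lsx; }
static int lasx_valid(void) { return cpu_has_lasx; }

static const struct recov_algo lasx  = { "lasx",  2, lasx_valid };
static const struct recov_algo lsx   = { "lsx",   1, lsx_valid };
static const struct recov_algo intx1 = { "intx1", 0, NULL }; /* priority assumed */

/* NULL-terminated table, mirroring the layout of the array in the diff. */
static const struct recov_algo *const algos[] = {
	&lasx,
	&lsx,
	&intx1,
	NULL
};

/* Pick the usable entry with the highest priority; no benchmarking. */
static const struct recov_algo *choose_recov(void)
{
	const struct recov_algo *best = NULL;

	for (const struct recov_algo *const *a = algos; *a; a++) {
		if ((*a)->valid && !(*a)->valid())
			continue;	/* not available on this CPU */
		if (!best || (*a)->priority > best->priority)
			best = *a;
	}
	return best;
}
```

Under this model, a CPU with both extensions gets lasx, a CPU with only LSX gets lsx, and everything else falls back to intx1, which matches the ordering rationale given in the commit message.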
