yinlenree

I have implemented vector-accelerated CRC modules (including CRC16, CRC32, and CRC64) using the RISC-V V, Zbc, Zvbc, and Zvbb instruction sets, with full functional verification and performance testing completed.

The implementation primarily uses the vclmul and vclmulh (carry-less multiply) instructions for data folding. For big-endian processing, it also uses the vrev8.v, vslideup.vi, and vslidedown.vi instructions for byte-order reversal. The final checksum is computed via Barrett reduction.
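
For readers less familiar with carry-less multiplication, here is a minimal portable C sketch of what a 64-bit clmul/clmulh pair computes (the low and high halves of a polynomial product over GF(2)). The names `clmul64` and `clmulh64` are illustrative and do not come from this PR, which uses the hardware instructions instead.

```c
#include <stdint.h>

/* Low 64 bits of the carry-less (XOR-based) product of a and b. */
static uint64_t clmul64(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 0; i < 64; i++)
        if ((b >> i) & 1)
            r ^= a << i;
    return r;
}

/* High 64 bits of the same 128-bit carry-less product. */
static uint64_t clmulh64(uint64_t a, uint64_t b)
{
    uint64_t r = 0;
    for (int i = 1; i < 64; i++)
        if ((b >> i) & 1)
            r ^= a >> (64 - i);
    return r;
}
```

A folding step would typically multiply each 64-bit half of the running remainder by a precomputed constant with these operations and XOR the results into the next block of input.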

@yinlenree force-pushed the add-crc-riscv-vector-support branch from cd1a52d to 3259ef2 on July 24, 2025 03:36
@pablodelara
Contributor

Hi @yinlenree. I suggest you use a commit title like "crc: add RISC-V implementation" and then include your current commit message as the body.
@sunyuechi, can you review this PR? Thanks!

@sunyuechi
Contributor

sunyuechi commented Jul 25, 2025

I'll review it carefully next week. For now, I've noticed three issues:

If the compiler doesn't support those -march options, the newly added CRC files will still be compiled, leading to build failures.

Some files have trailing blank lines at the end; please remove all of them.

Some files have mixed indentation using both spaces and tabs. Please make them consistent. (You can see the differences using git diff.)

@yinlenree
Author

Understood, thank you both for the suggestions. I'll first address the compilation-related issues. This might take some time since I haven't worked on this aspect before.

@yinlenree force-pushed the add-crc-riscv-vector-support branch 3 times, most recently from adef254 to 2379b73, on August 1, 2025 02:31
configure.ac Outdated
__asm__ volatile(
".option arch, +zbc\n"
"clmul zero, zero, zero\n"
"clmulh zero, zero, zero\n"
Contributor

Please use either tabs or spaces consistently for indentation, and check other files as well.

#include <sys/auxv.h>
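#include <asm/hwprobe.h> // 包含 RISC-V 硬件探测相关的宏定义(如 RISCV_HWPROBE_EXT_ZBB) [RISC-V hardware-probe macro definitions, e.g. RISCV_HWPROBE_EXT_ZBB]
#include <unistd.h> // 提供系统调用相关声明 [declarations for system calls]
#include <sys/syscall.h> // 定义 __NR_riscv_hwprobe 系统调用号 [defines the __NR_riscv_hwprobe syscall number]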
Contributor

Please use English comments, and check other files as well.

DEFINE_INTERFACE_DISPATCHER(crc32_gzip_refl)
{
#if HAVE_RVV && HAVE_ZBC && HAVE_ZVBC && HAVE_ZVBB
struct riscv_hwprobe _probe = INIT_PROBE_STRUCT();
Contributor

All these lines (from 131 to 135) are the same throughout this file. It would be better to move them to a separate function.

Author

I wrote a function to determine experimental extensions, while the logic for standard extensions remains unchanged. Do you think this is acceptable?
DEFINE_INTERFACE_DISPATCHER(crc16_t10dif)
{
#if HAVE_RVV && HAVE_ZBC && HAVE_ZVBC && HAVE_ZVBB
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_RV('V') && CHECK_RISCV_EXTENSIONS("ZVBC", "ZVBB", "ZBC")) {
return crc16_t10dif_vclmul;
}
#endif
return crc16_t10dif_base;
}

Contributor

Looks tidier, yes
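
The PR does not show how `CHECK_RISCV_EXTENSIONS` is defined, so the following is only a sketch of one way such a variadic wrapper could be built on a `check_riscv_extensions(names, count)` helper using C99 compound literals. The stub helper here is hypothetical: it simply pretends that ZBC and ZVBC are available so the macro can be exercised on any machine.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the real helper: reports support only for
 * the names in a fixed list, instead of asking the kernel. */
static int check_riscv_extensions(const char **extensions, size_t count)
{
    static const char *const supported[] = { "ZBC", "ZVBC" };
    for (size_t i = 0; i < count; i++) {
        int found = 0;
        for (size_t j = 0; j < sizeof(supported) / sizeof(supported[0]); j++)
            if (strcmp(extensions[i], supported[j]) == 0)
                found = 1;
        if (!found)
            return 0;
    }
    return 1;
}

/* Builds a temporary array from the arguments and passes its length. */
#define CHECK_RISCV_EXTENSIONS(...) \
    check_riscv_extensions((const char *[]){ __VA_ARGS__ }, \
        sizeof((const char *[]){ __VA_ARGS__ }) / sizeof(const char *))
```

With this shape, each dispatcher only needs a single condition, and the list of required extensions stays visible at the call site.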

@yinlenree
Author

Understood. I will revise my code and comments according to the requirements. Thank you both for the review.

@yinlenree force-pushed the add-crc-riscv-vector-support branch from 2379b73 to 740d8a4 on August 6, 2025 07:18
#define EXT_CODE(ext) ( \
strcmp(ext, "ZBC") == 0 ? 7 : \
strcmp(ext, "ZVBB") == 0 ? 17 : \
strcmp(ext, "ZVBC") == 0 ? 18 : -1)
Contributor

Please use macros instead of these numbers, for example: RISCV_HWPROBE_EXT_ZBC, RISCV_HWPROBE_EXT_ZVBB, ...
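
As a hedged illustration of why the named macros are safer: the literals 7, 17, and 18 look like bit positions, whereas the kernel macros are bit masks. The fallback definitions below reflect my understanding of the Linux 6.8 uapi header and are an assumption for this sketch; real code should take the values from `<asm/hwprobe.h>`. The helper `demo_has_ext` is hypothetical.

```c
#include <stdint.h>

/* Fallbacks for building this sketch without the kernel header.
 * Assumed values; the authoritative source is <asm/hwprobe.h>. */
#ifndef RISCV_HWPROBE_EXT_ZBC
#define RISCV_HWPROBE_EXT_ZBC  (1 << 7)
#endif
#ifndef RISCV_HWPROBE_EXT_ZVBB
#define RISCV_HWPROBE_EXT_ZVBB (1 << 17)
#endif
#ifndef RISCV_HWPROBE_EXT_ZVBC
#define RISCV_HWPROBE_EXT_ZVBC (1 << 18)
#endif

/* Nonzero when every bit of mask is set in the probed value. */
static int demo_has_ext(uint64_t value, uint64_t mask)
{
    return (value & mask) == mask;
}
```

Testing `value & 7` would instead check the three lowest bits, which is a different question from "is Zbc present".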

static inline int check_riscv_extensions(const char **extensions, size_t count)
{
struct riscv_hwprobe _probe = INIT_PROBE_STRUCT();
syscall(__NR_riscv_hwprobe, &_probe, 1, 0, NULL, 0);
Contributor

It seems that using hwprobe requires checking the kernel version?

Author

Okay, I will replace the numbers with macros. Additionally, regarding the extension check, I plan to use version 6.8 as the cutoff—versions below this will not utilize the ZVBC vector acceleration.

Author

Since the linux/version.h file may change after a kernel upgrade, the LINUX_VERSION_CODE macro cannot reliably tell whether the current kernel supports the RISC-V extension macros. I plan to check directly whether these extension macros are defined and use that to select the code path. The code is as follows.

#if defined(RISCV_HWPROBE_EXT_ZBC) && defined(RISCV_HWPROBE_EXT_ZVBB) && defined(RISCV_HWPROBE_EXT_ZVBC)
#define EXT_CODE(ext) ( \
	strcmp(ext, "ZBC")  == 0 ? RISCV_HWPROBE_EXT_ZBC  : \
	strcmp(ext, "ZVBB") == 0 ? RISCV_HWPROBE_EXT_ZVBB : \
	strcmp(ext, "ZVBC") == 0 ? RISCV_HWPROBE_EXT_ZVBC : \
	-1)
#endif
...
static inline int check_riscv_extensions(const char **extensions, size_t count)
{
#ifdef EXT_CODE
	struct riscv_hwprobe _probe = INIT_PROBE_STRUCT();
	syscall(__NR_riscv_hwprobe, &_probe, 1, 0, NULL, 0);
	for (size_t i = 0; i < count; i++) {
		if (!(_probe.value & EXT_CODE(extensions[i]))) {
			return 0;
		}
	}
	return 1;
#else
	return 0;
#endif
}

Do you think this is okay?
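
One detail worth noting in the snippet above: the `-1` fallback has every bit set, so `value & -1` is nonzero whenever any extension bit is reported, and an unrecognized name would pass the test. The sketch below shows one possible way to treat unknown names as unsupported. All `demo_` names and the placeholder mask values are assumptions for illustration, not part of the PR.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Placeholder masks for the demo; real code would use the
 * RISCV_HWPROBE_EXT_* macros from the kernel header. */
#define DEMO_EXT_ZBC  (1ULL << 7)
#define DEMO_EXT_ZVBC (1ULL << 18)

/* Map an extension name to its mask; 0 means "unknown name". */
static uint64_t demo_ext_mask(const char *name)
{
    if (strcmp(name, "ZBC") == 0)
        return DEMO_EXT_ZBC;
    if (strcmp(name, "ZVBC") == 0)
        return DEMO_EXT_ZVBC;
    return 0;
}

/* Returns 1 only if every named extension is known and present. */
static int demo_all_present(uint64_t value, const char **names, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        uint64_t mask = demo_ext_mask(names[i]);
        if (mask == 0 || (value & mask) != mask)
            return 0;
    }
    return 1;
}
```

Separating the bitmap test from the system call also makes the logic testable on a machine without hwprobe.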

Contributor

My understanding is that the detection works by checking whether asm/hwprobe.h exists in the kernel, since these files didn’t exist before.

Author

I found that in the kernel code from versions 6.4 to 6.7, although there is an hwprobe.h file in the include/asm directory, it lacks the macros for extensions such as ZVBC and ZBC. These macros were only defined starting with version 6.8.
Additionally, I noticed that the offset for the V standard extension was defined in version 6.5. Does this mean I also need to check the definition of the V extension offset?
I'm unsure whether there was an interface for detecting extensions like ZVBC in kernels from versions 6.4 to 6.7. I have little experience in this area. Do you have any suggestions?
Here's the URL for the hwprobe.h file in Linux kernel 6.7 from the official repository:
https://github.com/torvalds/linux/blob/v6.7/arch/riscv/include/asm/hwprobe.h
The macro definitions are located here.
https://github.com/torvalds/linux/blob/v6.7/arch/riscv/include/uapi/asm/hwprobe.h

Contributor

Maybe we can directly detect this macro, similar to how DPDK introduced hwprobe, with a minimum requirement of kernel 6.8?
https://inbox.dpdk.org/dev/[email protected]/

Author

Yes, the bitmasks for ZVBC and ZBC were defined starting from version 6.8. I plan to add detection macros for the hwprobe.h file during the compilation phase, as well as detection macros for the bitmasks of these three extensions: ZVBC, ZVBB, and ZBC.

@yinlenree force-pushed the add-crc-riscv-vector-support branch from fe7ba9c to 0d79c34 on August 22, 2025 02:44
@sunyuechi
Contributor

Please test what happens if the kernel does not have hwprobe support, for example, by changing asm/hwprobe.h to a non-existent file. Currently, this causes a build error.

@yinlenree force-pushed the add-crc-riscv-vector-support branch from 0d79c34 to ddd6706 on August 28, 2025 02:02
@yinlenree
Author

@sunyuechi I forgot to add the conditional check before including the asm/hwprobe.h header file. I've now fixed it and optimized the code, replacing time-consuming instructions and removing the Zvbb extension.
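
The exact guard used in the PR is not shown here, so the following is only a sketch of one way to keep the build working when asm/hwprobe.h is missing. It relies on the compiler's `__has_include` (available in recent GCC and Clang), checks the syscall's return value, and falls back to "no extensions" everywhere else. `DEMO_HAVE_HWPROBE` and `demo_query_ima_ext0` are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

/* Only pull in the kernel header when it actually exists. */
#if defined(__riscv) && defined(__linux__) && defined(__has_include)
#if __has_include(<asm/hwprobe.h>)
#include <asm/hwprobe.h>
#include <sys/syscall.h>
#include <unistd.h>
#define DEMO_HAVE_HWPROBE 1
#endif
#endif

/* Returns the IMA_EXT_0 extension bitmap, or 0 if it cannot be queried. */
static uint64_t demo_query_ima_ext0(void)
{
#if defined(DEMO_HAVE_HWPROBE) && defined(__NR_riscv_hwprobe) && \
    defined(RISCV_HWPROBE_KEY_IMA_EXT_0)
    struct riscv_hwprobe pair = { .key = RISCV_HWPROBE_KEY_IMA_EXT_0, .value = 0 };

    /* A failing syscall (for example on an older kernel) means no support. */
    if (syscall(__NR_riscv_hwprobe, &pair, 1, 0, NULL, 0) != 0)
        return 0;
    /* The kernel rewrites the key when it does not recognize it. */
    if (pair.key != RISCV_HWPROBE_KEY_IMA_EXT_0)
        return 0;
    return pair.value;
#else
    return 0;
#endif
}
```

A configure-time check that defines a `HAVE_`-style macro would serve the same purpose on compilers without `__has_include`.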

@pablodelara
Contributor

Could you review this PR again, @sunyuechi? I am thinking of first merging the "prefetch" PR and then this PR (after they are reviewed, of course).

@sunyuechi
Contributor

@pablodelara Okay, I'll check it tomorrow.

#define vec_15 v15
#define vec_16 v16
#define vec_17 v17
#define vec_18 v18
Contributor

@sunyuechi sunyuechi Sep 10, 2025

#define tmp_0         t0
#define tmp_1         t1
#define tmp_2         t2
#define tmp_3         t3
#define tmp_4         t4
#define tmp_5         t5
#define vec_0         v0
#define vec_1         v1
...
#define vec_10        v10
#define vec_11        v11
#define vec_12        v12

Please remove these #define statements, which do not improve readability; the t and v registers are already sufficiently clear on their own.

#define len a2

// return
#define crc_ret a0
Contributor

unused

Author

Okay, understood. I will remove these redundant and unused #define statements.

@yinlenree force-pushed the add-crc-riscv-vector-support branch from ddd6706 to a2ee08c on September 11, 2025 06:59
vxor.vv v0, v8, v3

addi sp, sp, -16
vse64.v v0, (sp)
Contributor

t5 and t6 are unused and can be utilized to eliminate stack operations.

vslideup.vi v8, v9, 1
vxor.vv v0, v8, v3

addi sp, sp, -16
Contributor

t5 and t6 are unused and can be utilized to eliminate stack operations.

addi t4, t4, 16
crc_fold_512b_to_128b

addi sp, sp, -16
Contributor

Please use t6 to reduce stack operations (and also check other files as well).

ret

.crc_fold:
# Initialize vector registers
Contributor

This comment seems to be describing the purpose of vset, but it feels a bit unclear. Could we adjust it slightly?

ret

.crc_fold:
# Initialize vector registers
Contributor

This comment seems to be describing the purpose of vset, but it feels a bit unclear. Could we adjust it slightly?

Author

Okay, I will replace stack pointer (sp) operations with idle registers in all files and improve the readability of comments in the code.

{
#if HAVE_RVV && HAVE_ZBC && HAVE_ZVBC
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_RV('V') && CHECK_RISCV_EXTENSIONS("ZVBC", "ZBC")) {
Contributor

According to the RISC-V manual,

The Zvknhb and Zvbc Vector Crypto Extensions — and accordingly the composite extensions Zvkn and Zvks — require a Zve64x base, or application ("V") base Vector Extension.

Since the vector instructions you are using exist in both Zve64x and V, it seems that if Zvbc is detected, there’s no need to additionally check for the V extension. Please also review whether any changes are needed in the compilation part regarding this.

Author

Got it. I will try to place the detection of Zvbc before that of the V extension, so as to omit the check for the V extension. Thank you.

@yinlenree force-pushed the add-crc-riscv-vector-support branch from a2ee08c to ade80d5 on September 12, 2025 08:42
@yinlenree
Author

I have retained the stack pointer operations when calling crc32_iscsi_refl_vclmul, which are used to save the call information.

@sunyuechi
Contributor

It seems that the check in the file crc_riscv64_dispatcher.c has not been updated.

@yinlenree
Author

It seems that the check in the file crc_riscv64_dispatcher.c has not been updated.

Sorry, I forgot about this. Now I have removed the HAVE_RVV macro from both the dispatcher and the assembly file.

@yinlenree force-pushed the add-crc-riscv-vector-support branch from ade80d5 to 56e655a on September 12, 2025 09:42
.word 0x4ba80000, 0xc01f0000, 0xd7710000, 0x5cc60000, 0xf9ad0000, 0x721a0000, 0x65740000, 0xeec30000
.word 0xa4150000, 0x2fa20000, 0x38cc0000, 0xb37b0000, 0x16100000, 0x9da70000, 0x8ac90000, 0x017e0000
.word 0x1f650000, 0x94d20000, 0x83bc0000, 0x080b0000, 0xad600000, 0x26d70000, 0x31b90000, 0xba0e0000
.word 0xf0d80000, 0x7b6f0000, 0x6c010000, 0xe7b60000, 0x42dd0000, 0xc96a0000, 0xde040000, 0x55b30000
Contributor

Can you extract a .h file, as was done for crc32/64?
(crc/riscv64/crc16_t10dif_vclmul.S and crc/riscv64/crc16_t10dif_copy_vclmul.S)
The data is completely identical, and many parts of the crc_fold_loop calculation are also the same.

Author

I did not create a .h header file for crc16_t10dif_copy because, unlike the other algorithm implementations, its code also performs memory copying. I only extracted the data used in the calculation. Do you think this is acceptable?

{
#if HAVE_ZBC && HAVE_ZVBC
unsigned long auxval = getauxval(AT_HWCAP);
if (auxval & HWCAP_RV('V') && CHECK_RISCV_EXTENSIONS("ZVBC", "ZBC")) {
Contributor

The runtime v check can also be removed.
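
As a rough sketch of what the simplified selection could look like once the separate V test is dropped: because Zvbc already requires a Zve64x or V base, detecting Zvbc (together with Zbc) is sufficient. The function and stub names below are hypothetical stand-ins, and the real dispatcher uses the project's `DEFINE_INTERFACE_DISPATCHER` macro rather than a plain function.

```c
#include <stdint.h>

typedef uint16_t (*demo_crc16_fn)(uint16_t seed, const unsigned char *buf, uint64_t len);

/* Hypothetical stand-ins for crc16_t10dif_base and crc16_t10dif_vclmul. */
static uint16_t demo_crc16_base(uint16_t seed, const unsigned char *buf, uint64_t len)
{
    (void)buf;
    (void)len;
    return seed;
}

static uint16_t demo_crc16_vclmul(uint16_t seed, const unsigned char *buf, uint64_t len)
{
    (void)buf;
    (void)len;
    return seed;
}

/* No separate V test: Zvbc already implies a vector base. */
static demo_crc16_fn demo_select_crc16(int have_zvbc, int have_zbc)
{
    if (have_zvbc && have_zbc)
        return demo_crc16_vclmul;
    return demo_crc16_base;
}
```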


.crc_table_loop:
lbu a4, 0(a1)
add a1, a1, 1
Contributor

This may cause errors in certain environments. For immediates, please use the standard addi whenever possible (and please check other files as well).

Author

Okay, I will check my code.

The CRC module of ISA-L has been accelerated using RISC-V's V, Zbc, and Zvbc instruction sets, implementing data folding and Barrett reduction optimizations.

Signed-off-by: Ji Dong <[email protected]>
@yinlenree force-pushed the add-crc-riscv-vector-support branch from 56e655a to cf87b79 on September 17, 2025 14:08