[RISCV] Modify RegMask Settings of Scalar Library Functions to Reduce Spills #163311
… Spills Co-Authored-By: buggfg <[email protected]> Co-Authored-By: ict-ql <[email protected]> Co-Authored-By: MissGou <[email protected]>
|
Thank you for submitting a Pull Request (PR) to the LLVM Project! This PR will be automatically labeled and the relevant teams will be notified. If you wish to, you can add reviewers by using the "Reviewers" section on this page. If this is not working for you, it is probably because you do not have write permissions for the repository. In which case you can instead tag reviewers by name in a comment by using @ followed by their GitHub username. If you have received no comments on your PR for a week, you can request a review by "ping"ing the PR by adding a comment “Ping”. The common courtesy "ping" rate is once a week. Please remember that you are asking for valuable time from other developers. If you have further questions, they may be answered by the LLVM GitHub User Guide. You can also ask questions in a comment on this PR, on the LLVM Discord or on the forums. |
|
There's no guarantee in the ABI that scalar functions don't use vector registers. If glibc starts allowing vector code in memcpy, memset, it will be very easy for library code to break this. Have you tried using a vector math library like sleef that contains a vectorized version of expf for RISC-V? Or have you tried modifying the cost model to not vectorize functions with scalar library calls? Is it profitable to vectorize if you have to keep extracting elements? |
|
Hi, Thanks for your reply! I’d like to clarify that our change fully follows the calling convention. First, CSR_ILP32D_LP64D_V_RegMask is based on CSR_ILP32D_LP64D_RegMask (scalar callee-saved registers) and only adds a subset of vector callee-saved registers (v1–v7, v24–v31), not all vector registers. Second, all library functions comply with the ABI design and restore callee-saved registers upon exit — even scalar functions like memset and memcpy that internally use vector instructions. Therefore, when the subtarget supports the V extension, making vector callee-saved registers available is valid. Additionally, we verified correctness on SPEC CPU2006 and encountered no problems. Happy to continue the discussion :)
|
I think the patches I've seen for vector memset and memcpy are written in assembly and don't do any save/restore.
What if the library code is compiled with gcc? Or an older version of clang/gcc that doesn't have this patch? |
|
According to the RISC-V ELF psABI documentation:
In other words, library functions must strictly follow the calling convention, saving and restoring all callee-saved registers. This behavior is defined by the ISA, and is independent of any specific compiler or version :)
|
No, it's specified by the ABI, which in the case of RISC-V is https://riscv-non-isa.github.io/riscv-elf-psabi-doc/. |
The rules you are quoting are for the vector register calling convention variant. This is for functions that have vector arguments/returns or are declared with the |
I got it. Our insight is that when callees do not violate the Calling Convention Variant, using this variant can significantly reduce register spills, enabling the kernel above to achieve a 44% performance improvement on the SpacemiT Key Stone K1. When statically linked, could BOLT safely remove the spill and reload instructions if the spilled registers are not modified in the callees’ assembly code? Alternatively, could we introduce a compiler option that lets programmers explicitly choose the Calling Convention Variant when they are confident that the callees do not violate the variant? 🙂 |
In LLVM 21.1.0, the RegMask of scalar library functions on RISC-V does not include available vector registers. This behavior may increase the likelihood of register spills, especially in functions that frequently invoke scalar library calls within vectorized loops.
For the following C code:
the corresponding RISC-V assembly for line 8 (output[i] = x / (1.0 + expf(-x));) is as follows. The reason for the v8 spill is that expf (a scalar library function) does not have vector registers available in its RegMask.
See the online code at https://godbolt.org/z/T7f7PxYor
The RISCVRegisterInfo::getCallPreservedMask function determines the RegMask used for library function calls. When we change it to return CSR_ILP32D_LP64D_V_RegMask on subtargets that support the V extension, library function calls get a V-inclusive RegMask and the spill is eliminated.
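To illustrate why the choice of mask affects spilling, here is a deliberately simplified toy model, not LLVM code. It treats a call-preserved mask as a 32-bit set over v0 to v31 and assumes a value live across the call needs a spill only when no preserved register is left to hold it. The names `vreg_mask`, `count_preserved`, and `spills_across_call` are hypothetical, and the model ignores register grouping (LMUL) and everything else the real register allocator considers. The vector mask uses the v1–v7 and v24–v31 set mentioned in the conversation.

```c
#include <stdint.h>

/* Bit i set means vector register v<i> is preserved across the call. */
typedef uint32_t vreg_mask;

/* Mask without vector registers: every vector register is treated
 * as clobbered by the call. */
static const vreg_mask SCALAR_CC_MASK = 0u;

/* V-inclusive mask: v1-v7 (bits 1..7) and v24-v31 (bits 24..31). */
static const vreg_mask VECTOR_CC_MASK = 0x000000FEu | 0xFF000000u;

/* Count the registers marked as preserved in the mask. */
int count_preserved(vreg_mask m) {
    int n = 0;
    while (m) {
        m &= m - 1; /* clear the lowest set bit */
        n++;
    }
    return n;
}

/* Toy estimate: values live across the call beyond the number of
 * preserved registers have to be spilled and reloaded. */
int spills_across_call(int live_values, vreg_mask preserved) {
    int avail = count_preserved(preserved);
    return live_values > avail ? live_values - avail : 0;
}
```

In this model a single live vector value costs one spill under `SCALAR_CC_MASK` and none under `VECTOR_CC_MASK`, which mirrors the effect described above. The model says nothing about whether the callee actually preserves those registers, which is the concern raised in review.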
We modified and tested only the RISCVABI::ABI_ILP32D and ABI_LP64D cases, but we recommend modifying the remaining cases too. We performed correctness testing on SPEC CPU2006 and encountered no problems.
The XSCC compiler team developed this implementation.