Add Tag_RISCV_mop_and_hint_encoding for instruction reinterpretation #474
@ved-rivos @aswaterman this is what we discussed in the mail before, but a little bit late ...:P |
We already have the |
Because Also, on the assembler side, this could let users use lpad/sspush/sspop (and all other ss instructions) without enabling zicfiss/zicfilp. The goal is to make the toolchain behavior consistent, so adding a new marker lets the toolchain know we are using a special convention here (rather than just introducing a special rule).
Would there be plans to incorporate some kind of framework at the ISA level? Based on the PR description, the current conflicts arise only from toolchain conventions, so I think this psABI may be the right place to solve the issue, since the problems come from toolchain implementations and decisions, which are technically independent of the ISA. If this is the right place to regulate the usage of these kinds of "always-on" extensions, would it be better to provide a more generic framework than the currently proposed one, which serves only the CFI features?
Another case is hint instructions (e.g. Zihintpause and Zihintntl). I originally thought they were different from the CFI situation, but yeah, they're actually pretty similar. I still don't like adding a separate bit for every special case, but given what you said about the potential need to reallocate opcode space, having separate bits does sound like a more future-proof approach. So let me update this PR and make it more generic.
Various RISC-V extensions reuse existing instruction encoding spaces
for specialized behaviors. CFI extensions use NOP or MOP (May-Be-Operation)
instructions to ensure correct program execution even when hardware
doesn't implement CFI. Similarly, hint extensions redefine specific
encodings for performance optimization purposes.
However, this design conflicts with current toolchain conventions,
where X extension instructions are only usable when the X extension
is enabled: assemblers accept corresponding mnemonics and disassemblers
decode instructions only when the extension is active.
The affected use cases require different behavior:
- Zicfiss instructions should be available when Zimop extension is present
- Zicfilp instructions should be usable even without explicit enabling
(since auipc is in baseline ISA)
- Hint extensions should reinterpret specific encodings regardless of
extension enablement
Following conventional toolchain behavior would force compilers to
generate generic instructions ('mop', 'fence w,0', 'auipc x0, <value>')
instead of extension-specific mnemonics ('sspush/sspop', 'pause',
'lpad <value>'), causing user confusion and reducing code clarity.
While this could be addressed at the ISA specification level, such
changes involve complex and time-consuming procedures. Therefore,
this psABI defines Tag_RISCV_mop_and_hint_encoding to provide toolchain
implementations with clear guidance.
This tag uses a bitmap format where each bit indicates whether specific
instruction encodings should be reinterpreted:
- Bit 0: auipc x0, <value> as lpad <value> (Zicfilp)
- Bit 1: Zimop encodings as Zicfiss instructions
- Bit 2: fence w,0 as pause hints (Zihintpause)
- Bit 3: specific encodings as non-temporal locality hints (Zihintntl)
The tag uses bitwise OR merge policy to allow combining multiple
reinterpretation requirements, with toolchain warnings recommended
when encoding space conflicts occur.
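
As a rough illustration of the bitmap and its bitwise-OR merge policy described above, here is a minimal Python sketch. The bit positions come from the list in this description; the names `REINTERPRET_BITS`, `merge_tag_values`, and `describe` are hypothetical and do not correspond to any existing toolchain code.

```python
# Bit positions as listed in the PR description for the proposed
# Tag_RISCV_mop_and_hint_encoding. All helper names here are illustrative.
REINTERPRET_BITS = {
    0: "auipc x0, <value> as lpad <value> (Zicfilp)",
    1: "Zimop encodings as Zicfiss instructions",
    2: "fence w,0 as pause (Zihintpause)",
    3: "specific encodings as non-temporal locality hints (Zihintntl)",
}


def merge_tag_values(values):
    """Merge per-object tag values with the bitwise-OR policy."""
    merged = 0
    for value in values:
        merged |= value
    return merged


def describe(value):
    """List the reinterpretations requested by a tag value, lowest bit first."""
    return [
        text
        for bit, text in sorted(REINTERPRET_BITS.items())
        if value & (1 << bit)
    ]


if __name__ == "__main__":
    # One object requests lpad, another requests pause; the link result has both.
    merged = merge_tag_values([0b0001, 0b0100])
    for line in describe(merged):
        print(line)
```

With OR merging, a linked output requests every reinterpretation that any input object requested; detecting conflicting uses of the same encoding space would need a separate check that this sketch does not attempt.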
632954d to 54ac0a2
Changes:
Sounds like a good idea. Suggestion1: Instead of Suggestion2: Keep the highest bit reserved and consume/set it when we run out of 127 bits :-) |
I get your point about Tag_RISCV_Insn_Reinterpret being more general.
uleb128 is a variable-length encoding, so in theory we have an unlimited number of bits to use.
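
To make the variable-length point concrete, here is a small Python sketch of unsigned LEB128 encoding and decoding. It is a generic illustration of the format, not code from any toolchain, and the function names are made up for this example.

```python
def uleb128_encode(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("uleb128 encodes non-negative integers only")
    out = bytearray()
    while True:
        byte = value & 0x7F      # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def uleb128_decode(data: bytes) -> int:
    """Decode an unsigned LEB128 value from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated uleb128 value")


if __name__ == "__main__":
    # A bitmap with bit 200 set still encodes; it just takes more bytes.
    wide = 1 << 200
    encoded = uleb128_encode(wide)
    print(len(encoded), uleb128_decode(encoded) == wide)
```

Each encoded byte carries 7 payload bits, so a bitmap simply grows by one byte for every 7 additional bits in use.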
Change:
Actually, I would also have suggested the name Tag_insn_reinterpret if @deepak0414 had not made the suggestion. In my opinion, the "mop_and_hint_encoding" name is too limiting. Take the Zicfilp LPAD insn as a counterexample: it's not encoded in MOP and is also not a hint[1], but it will be affected by "mop_and_hint_encoding". The name doesn't really suit the scenario here. Please do consider the name, even though it may leave some unintended wiggle room. [1]:
@mylai-mtk lpad is a hint according to the spec, I think? https://github.com/riscv/riscv-isa-manual/blob/main/src/rv32.adoc#hint-instructions
If LPAD is a HINT, then I'm confused. LPAD surely modifies architecturally visible state (the ELP bit in CSRs), and implementations that implement it are not allowed to ignore it, in that the implementation must check reg x7 == uimm20 or raise an exception. I would argue that LPAD does not really qualify as a HINT insn, even though it uses a HINT insn encoding point. It's a loophole in the ISA description of HINTs. I think the problem is: HINT encoding points are not necessarily used as HINT insns (HINT insns do not change any architecturally visible state, except for advancing the pc and any applicable performance counters, and could be ignored by an implementation). These HINT encoding points could be used to encode extension insns that have architecturally observable effects and thus could not be ignored if the implementation claims to implement the extension. But anyway, at least the
In the last psABI meeting, we discussed this issue and the conclusion was to NOT add this tag, but instead handle it directly in the toolchain. So the next step is to implement it in the upstream open-source toolchain and continue discussion there. This PR will stay open until the upstream side is resolved. |
Excuse me, but I need a clearer description than this. The quoted description does not reveal or hint at the proposed way to drive this PR's vision forward.
I guess the "problem" here is this PR's vision? (which I would write down as "in assembler/disassembler, to encode/display MOP/hint insns as if a specific ISA extension is enabled, even though that ISA extension is not formally enabled and not listed in Tag_RISCV_arch"). If so, by saying "avoid giving the impression that this problem has solution", I assume that in the psABI meeting, the conclusion is that this "problem" (or as I'd like to call it, "feature") would not be addressed by the standard, i.e. it has no standard "solution", and the toolchain could do whatever they want with it, which includes the options of not implementing the "feature" at all or implementing it with some non-standard (but perhaps agreed by both gnu binutils and llvm) method. Am I correct in this understanding? If my understanding is correct, then "the next step is to implement it in the upstream" means not/stop implementing this "feature" at all or implementing it in whatever way the toolchain likes it?
Does this mean that in the meeting, the conclusion is that MOP/hint encoding spaces could not be reused? |
I publish public minutes from all psABI meetings; in this case, you can read a summary of the conversation at https://github.com/riscv-admin/psabi/blob/master/MINUTES/2025/meeting-20250911.adoc#status-update-for-cfi-related-prs |
After reading the meeting minutes, I'll answer my own questions:
Reuse of the MOP/hint encoding spaces is strongly discouraged. The ecosystem (toolchains included) would not guarantee correctness if they are reused. ("Philip: Will go stronger. Changing it will not work.")
Since reuse of the encoding space is not welcome, the meeting minutes recommend implementing the "feature" as always-on, i.e. these encodings are always interpreted with the implicit (not listed in Tag_RISCV_arch) ISA extensions. ("Philip: Should just always enable recognising e.g. LPAD in the assembler and disassembler, or at least on by default. ; Sam: Agree with that. ;") (The meeting minutes are just dialogue from the meeting and are largely incomplete in terms of context and conclusion, but I think my understanding based on the reading aligns with the conclusion reached in the meeting as relayed by @kito-cheng.)
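
For illustration only, an "always-on" disassembler rule of the kind described above could look like the following Python sketch. It is not taken from binutils or LLVM; the function name and the decision to return `None` for everything else are assumptions made for this example. It relies on two encodings mentioned in this thread: `lpad <value>` shares the encoding of `auipc x0, <value>`, and `pause` shares the encoding of `fence w,0`.

```python
AUIPC_OPCODE = 0x17      # major opcode of AUIPC (bits 6:0)
PAUSE_WORD = 0x0100000F  # fence with pred=W, succ=0, rs1=x0, rd=x0


def decode_always_on(word: int):
    """Return an extension mnemonic for a 32-bit word, or None.

    Hypothetical helper: the reinterpretation is applied unconditionally,
    without consulting the set of enabled extensions.
    """
    if word == PAUSE_WORD:
        return "pause"
    opcode = word & 0x7F
    rd = (word >> 7) & 0x1F
    if opcode == AUIPC_OPCODE and rd == 0:
        imm20 = (word >> 12) & 0xFFFFF   # U-type immediate field
        return "lpad " + hex(imm20)
    return None  # fall back to the ordinary decoder


if __name__ == "__main__":
    print(decode_always_on(0x00001017))  # auipc x0, 1
    print(decode_always_on(0x0100000F))  # fence w,0
    print(decode_always_on(0x00001097))  # auipc x1, 1 is left alone
```

A real implementation would sit inside the existing opcode tables and would likely offer a way to switch the behavior off, in line with the "at least on by default" remark in the minutes.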