@lenary
Contributor

@lenary lenary commented Jan 10, 2025

These options allow users better control of when the assembler should turn specific instructions into their smaller equivalents, without having to change the enabled architectures.

A prototype LLVM implementation is available here: llvm/llvm-project#122483
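
To make "turn specific instructions into their smaller equivalents" concrete, here is a minimal Python sketch of the kind of decision these options would control. It models only the `addi` to `c.addi` case. The function name and the `has_c` and `autocompress` parameters are hypothetical and are not taken from the LLVM prototype.

```python
def select_addi_encoding(rd, rs1, imm, has_c=True, autocompress=True):
    """Return (mnemonic, size_in_bytes) for `addi rd, rs1, imm`.

    Hypothetical model: only the c.addi form is considered here.
    c.addi needs rd == rs1, rd != x0, and a non-zero 6-bit signed immediate.
    """
    fits_c_addi = rd == rs1 and rd != 0 and imm != 0 and -32 <= imm <= 31
    if has_c and autocompress and fits_c_addi:
        return ("c.addi", 2)
    # Otherwise the instruction is emitted exactly as written.
    return ("addi", 4)
```

With a0 (x10) as both source and destination, `select_addi_encoding(10, 10, 1)` picks the 2-byte form, while passing `autocompress=False` keeps the 4-byte `addi` without changing the enabled extensions.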


Signed-off-by: Sam Elliott <[email protected]>
@psherman42

Do we really need another option, when there are already existing options and best practices?

.option push
.option norvc
...
.option pop

@lenary
Contributor Author

lenary commented Jan 11, 2025

> Do we really need another option, when there are already existing options and best practices?

I put more motivation in the LLVM message, for which I apologise; I was writing both messages at the same time. From that motivation:


This will become more useful as the following things happen:

  • .option norvc is deprecated/removed (which is sometimes used for this purpose).
  • Extensions are added where the destination instruction cannot be disabled separately to the source instruction, either because the destination is in the base architecture, or because it is in the same extension as the source.
  • Extensions with instructions wider than 32 bits are added, which make CompressPats [the feature used to implement this in LLVM] more complex to use intuitively, especially if the destination is a 32-bit instruction.

This document says the following about .option norvc, which suggests it is not a best practice:

> This option will be deprecated soon after .option arch has been widely implemented on main stream open source toolchains.

Another problem with .option (no)rvc is that it is stated to only disable/enable the C extension, and how it interacts with e.g. Zca/Zcf/Zcd is underspecified, not very obvious, and complex.

In short, yes I do think we need an option for controlling this feature.

I'll note that the proposed LLVM implementation correctly works with .option push and .option pop, which is maybe not obvious from the diff.

@jrtc27
Contributor

jrtc27 commented Jan 15, 2025

How does this interact with linker relaxation, which can also convert certain uncompressed instructions into compressed equivalents? That is, is this intended to be purely for the assembler, or is it also intended to constrain the linker's relaxations, and if so, how?

@lenary
Contributor Author

lenary commented Jan 16, 2025

> How does this interact with linker relaxation, which can also convert certain uncompressed instructions into compressed equivalents? That is, is this intended to be purely for the assembler, or is it also intended to constrain the linker's relaxations, and if so, how?

Thanks for bringing this up. I see two main choices:

  • This does nothing for relaxation, it's an assembler-only option.
  • This disables relaxation.

I am tending towards thinking the second is clearer. This option can be explained as "ensure exactly the instructions I wrote are used", which I think means it has to disable relaxation, or else you might still end up with not exactly what you wrote in the final binary.

By this explanation, the option might need to disable the Branch Pseudos as well (which replace a short conditional branch with a short (inverted) conditional branch and a longer jump).

I've covered neither of these options in my prototype so far, but I can update it if we think this is a reasonable direction.

@aswaterman
Collaborator

> This option can be explained as "ensure exactly the instructions I wrote are used"

Don't forget that it cuts the other way, too. The assembler will expand beqz t0, far into bnez t0, 1f; j far; 1:; suppressing that behavior will result in an assembler or linker error. Of course, this behavior isn't compression, it's expansion, and so one could argue it shouldn't be governed by an option named "autocompress".
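
The expansion described above can be sketched in Python. This is a simplified model, not code from any assembler: `lower_branch`, `INVERTED`, and `allow_expansion` are hypothetical names, and the model assumes a 4-byte conditional branch followed by a 4-byte `j`.

```python
# B-type offsets are signed 13-bit multiples of 2; J-type offsets are signed 21-bit.
B_MIN, B_MAX = -(1 << 12), (1 << 12) - 2
J_MIN, J_MAX = -(1 << 20), (1 << 20) - 2

# Each conditional branch and the branch that tests the opposite condition.
INVERTED = {
    "beqz": "bnez", "bnez": "beqz",
    "beq": "bne", "bne": "beq",
    "blt": "bge", "bge": "blt",
    "bltu": "bgeu", "bgeu": "bltu",
}

def lower_branch(mnemonic, operands, target, offset, allow_expansion=True):
    """Lower a conditional branch whose target is `offset` bytes away."""
    if offset % 2 != 0:
        raise ValueError("branch offset must be a multiple of 2")
    if B_MIN <= offset <= B_MAX:
        # In range: emit exactly the branch that was written.
        return [f"{mnemonic} {operands}, {target}"]
    if not allow_expansion:
        raise ValueError(f"{mnemonic} target out of range and expansion is disabled")
    # The jump sits 4 bytes after the inverted branch, so its offset is 4 less.
    if not J_MIN <= offset - 4 <= J_MAX:
        raise ValueError("target is out of range even for j")
    return [f"{INVERTED[mnemonic]} {operands}, 1f", f"j {target}", "1:"]
```

Calling `lower_branch("beqz", "t0", "far", 0x2000)` yields the three-line sequence from the comment above, whereas the same call with `allow_expansion=False` raises an error, which mirrors the assembler or linker error mentioned there.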

@lenary
Contributor Author

lenary commented Jan 17, 2025

> This option can be explained as "ensure exactly the instructions I wrote are used"
>
> Don't forget that it cuts the other way, too. The assembler will expand beqz t0, far into bnez t0, 1f; j far; 1:; suppressing that behavior will result in an assembler or linker error. Of course, this behavior isn't compression, it's expansion, and so one could argue it shouldn't be governed by an option named "autocompress".

That's what I meant by "Branch Pseudos as well (which replace a short conditional branch with a short (inverted) conditional branch and a longer jump)."

In discussion yesterday, it was noted there are other complex pseudos that can emit more than one instruction (call and li for example). I don't see as much of an issue with these other pseudos as they do not share mnemonics with real instructions.

I intend to update the proposal wording in this direction next week.

I didn't hear much dissent about the proposed direction (including the disabling of relaxation and long branch pseudo instructions) in yesterday's RISC-V toolchain SIG and LLVM meetings, but I realise this is still a new proposal.

I wonder if a better name might be (no)replace? It might be too close to (no)relax, but I'm not sure.

I welcome more feedback about this proposal.

@aswaterman
Collaborator

Off the top of my head, I don’t have a great naming suggestion, but I do think the name should encompass the fact that we’re disabling these branch relaxations, too. “Resize” is the first verb that comes to mind that subsumes “compress” and “expand”, but it’s not especially descriptive.

@lenary
Contributor Author

lenary commented Jan 24, 2025

I spent too long thinking about the name since last week.

Right now, we have noautocompress/autocompress, but I'm working on extending the prototype to disable relaxation, and disable the branch pseudos. I think that noresize/resize, and noreplace/replace are too similar to norelax/relax.

Here are some suggestions I'm happier with though, which flip the negative/positive versions:

  • exact/noexact (where noexact is the default), to mean "exactly what is written"
  • fixed/nofixed (where nofixed is the default), to mean "fix the size of what was written"

I'm leaning towards exact/noexact. I think both of these are far enough away from existing options that they won't be confused either.

@aswaterman
Collaborator

What about noautoresize or similar? If the main goal is to make it look visibly dissimilar to norelax, then the additional "auto" suffices.

@psherman42

The problem with exact, fixed, resize, and replace is that they are all relative to something. The use of no on top of auto on top of re makes a triple negative that is excruciatingly hard to interpret.
We should be a bit more absolute and concrete. This proposal is about allowing or preventing transformation of instructions from one shape (full 32-bit) style to another shape (lesser 16-bit, half) style.
Instead of an asymmetric pair of options, one prefixed with no and the other with no prefix, can we use two self-complementary terms? Something like fullsize and halfsize? We can even be brutally verbose with allow16bitinst and deny16bitinst.
Something like this is much more clear, simple, and as elegant as Mona Lisa herself.

@lenary
Contributor Author

lenary commented Jan 24, 2025

> This proposal is about allowing or preventing transformation of instructions from one shape (full 32-bit) style to another shape (lesser 16-bit, half) style.

This is not correct. As I said, I'm looking further ahead, to longer-than-32-bit instructions, which some custom extensions are already defining, as well as ensuring that we get exactly one 32-bit conditional branch rather than a conditional branch and a jump.

I do sort-of agree with the layering of "no" and "auto" being fairly confusing, so want to get away from that.

I'm going to proceed with exact/noexact, on the basis that "assemble exactly what was written" is a reasonable name, and isn't, in my opinion, "relative". I think the no prefix matches what we've done for all the other options, even if notexact might be more grammatical.

I don't expect noexact to be used very much; instead, people will mostly use .option push; .option exact; ...; .option pop, since noexact (the same behaviour as today) will be the default.
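
As a rough illustration of that push/exact/pop pattern, the following Python sketch models how the option state could be scoped. It is only a model under stated assumptions: the `Options` and `OptionState` classes are hypothetical, only the `exact` flag is tracked, and a real assembler saves and restores more state than this.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Options:
    exact: bool = False  # noexact (today's behaviour) is the assumed default

class OptionState:
    """Hypothetical model of `.option` handling with a push/pop stack."""

    def __init__(self):
        self.current = Options()
        self._stack = []

    def directive(self, text):
        words = text.split()
        if len(words) != 2 or words[0] != ".option":
            raise ValueError(f"not an .option directive: {text!r}")
        arg = words[1]
        if arg == "push":
            self._stack.append(self.current)   # save the current settings
        elif arg == "pop":
            if not self._stack:
                raise ValueError(".option pop without a matching push")
            self.current = self._stack.pop()   # restore the saved settings
        elif arg == "exact":
            self.current = replace(self.current, exact=True)
        elif arg == "noexact":
            self.current = replace(self.current, exact=False)
        else:
            raise ValueError(f"unsupported option in this sketch: {arg}")
```

Feeding it `.option push`, `.option exact`, and later `.option pop` leaves `current.exact` back at its default, so only the instructions between the push and the pop are assembled in exact mode.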

@psherman42

exact is, in fact, relative. It is relative to what was written.

I still feel much better choices are
fullsize/halfsize or allow16bitinst/deny16bitinst
... or something similar.

I kind of like that unconfusing conjugal pair of verbs allow and deny.

@jrtc27
Contributor

jrtc27 commented Jan 24, 2025

allow/deny is out of place with all the existing .option boolean toggles.

fullsize/halfsize/allow16bitinst/deny16bitinst does not encompass the more-than-two instruction sizes that exist once you include 48-bit instructions, as this proposal is trying to do.

(no)exact is the best name I've heard so far, in my opinion.

@asb
Contributor

asb commented Jan 30, 2025

> I'm going to proceed with exact/noexact, on the basis that "assemble exactly what was written" is a reasonable name, and isn't, in my opinion, "relative". I think the no prefix matches what we've done for all the other options, even if notexact might be more grammatical.

Would this also disable the more trivial single-instruction pseudos, e.g. add a0, a0, 1 -> addi a0, a0, 1? I know it's irrelevant to your compression concerns, but it feels like something that "exact" might imply.

@lenary
Contributor Author

lenary commented Feb 13, 2025

> I'm going to proceed with exact/noexact, on the basis that "assemble exactly what was written" is a reasonable name, and isn't, in my opinion, "relative". I think the no prefix matches what we've done for all the other options, even if notexact might be more grammatical.
>
> Would this also disable the more trivial single instruction pseudos? e.g. add a0, a0, 1 -> addi a0, a0, 1. I know it's irrelevant to your compression concerns, but feels like it's something that "exact" might imply.

Sorry, I forgot to get back to you on this. I don't mind either way on these pseudos. I don't think they're really used much by people writing assembly (except inline assembly), so I don't think it's a problem if we disable them. I'm not expecting to provide .option exact to C/C++ anyway.

@lenary lenary changed the title from "Proposal: Add Autocompress Control" to "Proposal: Exact Mode (Compression and Relaxation Control)" on Feb 13, 2025
@a4lg

a4lg commented Feb 13, 2025

Seems good in general. If no one has posted a patch set for GNU Binutils, I'll make a prototype implementation of this proposal.

@lenary
Contributor Author

lenary commented Feb 13, 2025

> Seems good in general. If no one has posted a patch set for GNU Binutils, I'll make a prototype implementation of this proposal.

That sounds fantastic to me! Thanks

- Branch Relaxation turns short branches of too long or unknown range into code
sequences with a longer range. For example `beq a0, a1, sym` will be turned
into `bne a0, a1, 4; j sym` because `j` has a longer range than `beq`.
- The assembler may accept the wrong mnemonic for an instruction because the
Contributor

I'm not sure I agree with disabling the use of add in place of addi. If a user is looking for a strict mode on mnemonics like this, they would need to apply it to the whole program for it to be truly useful. In that use case they probably don't want to disable compression and branch relaxation.

Contributor Author

Noted. As I said, I don't mind either way on these mnemonics, so I can remove this point.

Contributor

(Recent discussion reminded me I hadn't looped back here)

On reflection, I don't mind too much either. I agree that users wanting a "strict mode" probably do still want branch relaxation and compression. I think logically "exact" meaning "exactly what is written" is an intuitive definition even if allowing/disallowing pseudos like add with immediate arguments isn't that important either way. But I don't think it's going to impact people in any meaningful way if the pseudos are still allowed.

Contributor Author

> I agree that users wanting a "strict mode" probably do still want branch relaxation and compression.

Can you clarify this statement? This reads to me as "a strict mode should keep branch relaxation and compression enabled", which is not my intention.

> I think logically "exact" meaning "exactly what is written" is an intuitive definition even if allowing/disallowing pseudos like add with immediate arguments isn't that important either way. But I don't think it's going to impact people in any meaningful way if the pseudos are still allowed.

I think the add for addi is not going to impact users, and I will shortly push a patch to remove (only) this bullet.

I think that disabling branch relaxation could impact users relying on the ranges of the far branches, but if they opt into an exact mode then they should know this might happen.

Contributor

> Can you clarify this statement? This reads to me as "a strict mode should keep branch relaxation and compression enabled", which is not my intention.

Sorry to be confusing, I was referring to Craig's comment up above that points to the idea that to the extent there is a need/demand for a "strict mode", it's probably different to what is implemented in this PR. i.e. it keeps branch relaxation and compression. Which of course is not the intention of the mode introduced in this PR.

Contributor Author

Thank you for clarifying. I had sort of aliased "strict mode" and "exact mode" in my head but I can see how a hypothetical strict mode is not the same as this proposal.

@MaskRay

MaskRay commented Feb 16, 2025

exact/noexact, meaning "exactly what is written", looks good to me.

@fpetrogalli

This is a useful option, thank you for putting up the proposal.

@lenary
Contributor Author

lenary commented Mar 18, 2025

I'll proceed with implementing the wording as of commit cb73122 in LLVM, more feedback is of course welcome.

@kito-cheng
Collaborator

@Nelson1225

@a4lg

a4lg commented May 12, 2025

@kito-cheng @Nelson1225
Okay, I think I managed to fix the bug.
There is logic (added later) that removes a relocation when relaxation is disabled and the address is relative to a local symbol, but it had the side effect of skipping the offset checking that would otherwise be performed on the linker side. So I just added offset checking where the relocation removal occurs.

I will submit an RFC patch set in a couple of days (the delay is because I have forgotten a lot about my in-house tools for finishing and formatting patch sets).

@Nelson1225

Nelson1225 commented May 12, 2025

The .option relax/norelax is only for linker relaxations. The assembler branch conversion/relaxation doesn't belong to linker relaxation; they are different. The .option exact/noexact seems to only affect the assembler branch conversion/relaxation, controlling whether the assembler keeps the original branch or not; this makes sense.

@Nelson1225

Nelson1225 commented May 12, 2025

Sorry for keeping you all waiting. I have almost finished the GNU Binutils implementation (rejecting alias forms that would change the length of the instruction works fine), but there is a severe bug in the handling of branches that must preserve their length/encoding (even without implementing the exact mode): out-of-range offsets are accepted if the symbol is resolved inside the assembler (i.e. if no relocations are emitted for the linker), and the upper bits are silently ignored (no error is generated).

For instance, vanilla GNU Binutils generates an invalid jal encoding without warnings under the existing .option norelax:

.option norelax
j   .+0x200000

... which turns into:

0000000000000000 <.text>:
   0:   0000006f                jal     zero,0 <.text>
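
The truncation shown above can be reproduced with a small Python model of the J-type immediate encoding. This is an illustrative sketch, not binutils code: the function names are hypothetical, and `jtype_offset_in_range` only plays the role that an offset check would play before the immediate is written back.

```python
def encode_jal_unchecked(rd, offset):
    """Encode `jal rd, offset`, silently dropping bits that do not fit."""
    imm = offset & 0x1FFFFF  # keep only the 21-bit immediate field
    return (
        (((imm >> 20) & 0x1) << 31)     # imm[20]
        | (((imm >> 1) & 0x3FF) << 21)  # imm[10:1]
        | (((imm >> 11) & 0x1) << 20)   # imm[11]
        | (((imm >> 12) & 0xFF) << 12)  # imm[19:12]
        | ((rd & 0x1F) << 7)
        | 0x6F                          # JAL opcode
    )

def jtype_offset_in_range(offset):
    """True if offset is a 2-byte-aligned value in the signed 21-bit range."""
    return offset % 2 == 0 and -(1 << 20) <= offset < (1 << 20)

def encode_jal(rd, offset):
    """Encode `jal rd, offset`, rejecting offsets that would be truncated."""
    if not jtype_offset_in_range(offset):
        raise ValueError(f"jal offset {offset:#x} is out of range")
    return encode_jal_unchecked(rd, offset)
```

Here `encode_jal_unchecked(0, 0x200000)` returns `0x0000006f`, the same word as in the disassembly above, because bit 21 of the offset falls outside the immediate field. The checked variant raises an error for that offset instead.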

> There is logic (added later) that removes a relocation when relaxation is disabled and the address is relative to a local symbol, but it had the side effect of skipping the offset checking that would otherwise be performed on the linker side. So I just added offset checking where the relocation removal occurs.

Yeah, it seems that we need VALID_JTYPE_IMM before we write it back to the code, and relative branches...
https://github.com/bminor/binutils-gdb/blob/master/gas/config/tc-riscv.c#L4849

Thanks, cool ;)

.extern some_symbol
.option norelax
j   some_symbol+0x200000

This makes sense, since some_symbol is unknown, so we don't know whether the offset will overflow. If user wants got, then they should use call rather than j or jal.

@a4lg

a4lg commented May 12, 2025

@Nelson1225

> Yeah, it seems that we need VALID_JTYPE_IMM before we write it back to the code, and relative branches... https://github.com/bminor/binutils-gdb/blob/master/gas/config/tc-riscv.c#L4849

Yup, that's exactly what I've fixed in my branch and thanks for letting me know about VALID_JTYPE_IMM that would simplify the fix.

@lenary
Contributor Author

lenary commented May 13, 2025

> The .option relax/norelax is only for linker relaxations. The assembler branch conversion/relaxation doesn't belong to linker relaxation; they are different. The .option exact/noexact seems to only affect the assembler branch conversion/relaxation, controlling whether the assembler keeps the original branch or not; this makes sense.

They are specifically not separate, as written. .option exact is supposed to be end-to-end - so should disable linker relaxation the conventional way (not emitting R_RISCV_RELAX markers) as well as disabling branch conversion/relaxation and compression.

The idea is that the linked output (executable/library) should get exactly what's written (but that instruction might have relocated fields).

a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 13, 2025
This commit adds two assembler directives: ".option exact" and
".option noexact" (enable/disable the exact mode) as discussed in
<riscv-non-isa/riscv-asm-manual#122>.

When the exact mode is enabled,

1.  Linker relaxations are turned off,
2.  Instruction aliases that will change the encoding from the
    (likely non-alias) instruction with the same name are disabled
    (e.g. "addi" will never turn into "c.addi" even if optimizable) and
3.  Assembler relaxation of branch instructions are disabled
    (e.g. "blt" with a long offset will not turn into "bge + j").

The main purpose of this mode is to emit desired machine code as the
user writes, assuming the user knows constraints of their code.  So,
macros like "li" (known to be expanded into possibly complex sequences)
without single instruction encoding are not fully aware of this mode.

Currently, interactions between ".option relax/norelax" and
".option exact/noexact" are designed to be LLVM-compatible (i.e.
".option exact/noexact" imply ".option norelax/relax", respectively).

cf. <llvm/llvm-project#122483>

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add exact option.
	(RELAX_BRANCH_ENCODE): Encode exact option.
	(RELAX_BRANCH_EXACT): New predicate macro.
	(relaxed_branch_length): Handle exact mode cases.
	(append_insn): Pass exact option to RELAX_BRANCH_ENCODE.
	(riscv_ip): Skip instructions that would change the encoding
	when the exact mode is enabled.
	(s_riscv_option): Parse ".option exact" and ".option noexact"
	assembler directives.
	* doc/c-riscv.texi: Document new assembler directives.
	* testsuite/gas/riscv/exact.s: Test exact mode basics.
	* testsuite/gas/riscv/exact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local.s: Test conditional
	branches and unconditional jumps relative to a local symbol.
	* testsuite/gas/riscv/exact-branch-local-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-ok.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.l: Ditto.
	* testsuite/gas/riscv/exact-branch-extern.s: Test conditional
	branches and unconditional jumps relative to an external symbol.
	* testsuite/gas/riscv/exact-branch-extern-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-extern-exact.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.s: Use exact
	mode to test various configurations and instructions.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.l: Ditto.

include/ChangeLog:

	* opcode/riscv.h (INSN_NON_EXACT): New flag to represent aliases
	to reject on the exact mode.

opcodes/ChangeLog:

	* riscv-opc.c (riscv_opcodes): Add INSN_NON_EXACT flag to all
	instructions that should be rejected on the exact mode.
@a4lg

a4lg commented May 13, 2025

Initial submission of the exact mode implementation for GNU Binutils is out!

Edit (2025-05-17): now linked to PATCH v7.

  1. PATCH v7 0/1 (cover letter)
  2. PATCH v7 1/1 (exact mode implementation and corresponding assembler directives)

a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 14, 2025
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 14, 2025
This commit adds two assembler directives: ".option exact" and
".option noexact" (enable/disable the exact mode) as discussed in
<riscv-non-isa/riscv-asm-manual#122> and
already implemented in LLVM.

When the exact mode is enabled,

1.  Linker relaxations are turned off,
2.  Instruction aliases that will change the encoding from the
    (likely non-alias) instruction with the same name are disabled
    (e.g. "addi" will never turn into "c.addi" even if optimizable) and
3.  Assembler relaxation of branch instructions are disabled
    (e.g. "blt" with a long offset will not turn into "bge + j").

The main purpose of this mode is to emit desired machine code as the
user writes, assuming the user knows constraints of their code.  So,
macros like "li" (known to be expanded into possibly complex sequences)
without single instruction encoding are not fully aware of this mode.

Currently, interactions between ".option relax/norelax" and
".option exact/noexact" are designed to be LLVM-compatible (i.e.
".option exact/noexact" imply ".option norelax/relax", respectively).

cf. <llvm/llvm-project#122483>

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add exact option.
	(RELAX_BRANCH_ENCODE): Encode exact option.
	(RELAX_BRANCH_EXACT): New predicate macro.
	(relaxed_branch_length): Handle exact mode cases.
	(append_insn): Pass exact option to RELAX_BRANCH_ENCODE.
	(riscv_ip): Skip instructions that would change the encoding
	when the exact mode is enabled.
	(s_riscv_option): Parse ".option exact" and ".option noexact"
	assembler directives.
	* doc/c-riscv.texi: Document new assembler directives.
	* testsuite/gas/riscv/exact.s: Test exact mode basics.
	* testsuite/gas/riscv/exact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local.s: Test conditional
	branches and unconditional jumps relative to a local symbol.
	* testsuite/gas/riscv/exact-branch-local-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-ok.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.l: Ditto.
	* testsuite/gas/riscv/exact-branch-extern.s: Test conditional
	branches and unconditional jumps relative to an external symbol.
	* testsuite/gas/riscv/exact-branch-extern-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-extern-exact.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.s: Use exact
	mode to test various configurations and instructions.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.l: Ditto.

include/ChangeLog:

	* opcode/riscv.h (INSN_NON_EXACT): New flag to represent aliases
	to reject on the exact mode.

opcodes/ChangeLog:

	* riscv-opc.c (riscv_opcodes): Add INSN_NON_EXACT flag to all
	instructions that should be rejected on the exact mode.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 14, 2025
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 14, 2025
This commit adds two assembler directives: ".option exact" and
".option noexact" (enable/disable the exact mode) as discussed in
<riscv-non-isa/riscv-asm-manual#122> and
already implemented in LLVM.

When the exact mode is enabled,

1.  Linker relaxations are turned off,
2.  Instruction aliases that will change the encoding from the
    (likely non-alias) instruction with the same name are disabled
    (e.g. "addi" will never turn into "c.addi" even if optimizable) and
3.  Assembler relaxation of branch instructions are disabled
    (e.g. "blt" with a long offset will not turn into "bge + j").

The main purpose of this mode is to emit desired machine code as the
user writes, assuming the user knows constraints of their code.  So,
macros like "li" (known to be expanded into possibly complex sequences)
without single instruction encoding are not fully aware of this mode.

Currently, interactions between ".option relax/norelax" and
".option exact/noexact" are designed to be LLVM-compatible (i.e.
".option exact/noexact" imply ".option norelax/relax", respectively)
but considered flaky and strongly discouraged from using both.

cf. <llvm/llvm-project#122483>

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add exact option.
	(RELAX_BRANCH_ENCODE): Encode exact option.
	(RELAX_BRANCH_EXACT): New predicate macro.
	(relaxed_branch_length): Handle exact mode cases.
	(append_insn): Pass exact option to RELAX_BRANCH_ENCODE.
	(riscv_ip): Skip instructions that would change the encoding
	when the exact mode is enabled.
	(s_riscv_option): Parse ".option exact" and ".option noexact"
	assembler directives.
	* doc/c-riscv.texi: Document new assembler directives.
	* testsuite/gas/riscv/exact.s: Test exact mode basics.
	* testsuite/gas/riscv/exact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local.s: Test conditional
	branches and unconditional jumps relative to a local symbol.
	* testsuite/gas/riscv/exact-branch-local-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-ok.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.l: Ditto.
	* testsuite/gas/riscv/exact-branch-extern.s: Test conditional
	branches and unconditional jumps relative to an external symbol.
	* testsuite/gas/riscv/exact-branch-extern-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-extern-exact.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.s: Use exact
	mode to test various configurations and instructions.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.l: Ditto.

include/ChangeLog:

	* opcode/riscv.h (INSN_NON_EXACT): New flag to represent aliases
	to reject on the exact mode.

opcodes/ChangeLog:

	* riscv-opc.c (riscv_opcodes): Add INSN_NON_EXACT flag to all
	instructions that should be rejected on the exact mode.
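
Item 3 of the commit message above (assembler relaxation of branches) can be illustrated with a minimal Python sketch. This is not the GAS or LLVM implementation: the function name, the fixed `skip` and `target` labels, and the choice to raise an error in exact mode (suggested only by the `-exact-fail` test names) are assumptions made for illustration. The B-type branch range of -4096 to +4094 bytes is taken from the base ISA encoding.

```python
# Hypothetical model of branch relaxation; names are illustrative only.
B_TYPE_MIN = -4096  # lowest byte offset a conditional branch can encode
B_TYPE_MAX = 4094   # highest byte offset (offsets are assumed to be even)

# Each conditional branch and the branch with the opposite condition.
INVERTED = {
    "beq": "bne", "bne": "beq",
    "blt": "bge", "bge": "blt",
    "bltu": "bgeu", "bgeu": "bltu",
}


def assemble_branch(mnemonic, operands, offset, exact):
    """Return the instruction sequence emitted for one conditional branch."""
    if B_TYPE_MIN <= offset <= B_TYPE_MAX:
        # The branch reaches its target directly; nothing to relax.
        return [f"{mnemonic} {operands}, target"]
    if exact:
        # Assumption: exact mode reports an error instead of rewriting.
        raise ValueError(f"{mnemonic}: offset {offset} out of range in exact mode")
    # Relaxed form: inverted branch over an unconditional jump.
    return [f"{INVERTED[mnemonic]} {operands}, skip", "j target"]


print(assemble_branch("blt", "a0, a1", 8192, exact=False))
```

With `exact=False` the long `blt` becomes a `bge` followed by `j`, matching the "bge + j" example in the commit message; with `exact=True` the same call raises instead of changing what the user wrote.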
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 16, 2025
This commit adds two assembler directives: ".option exact" and
".option noexact" (enable/disable the exact mode) as discussed in
<riscv-non-isa/riscv-asm-manual#122> and
already implemented in LLVM.

When the exact mode is enabled,

1.  Linker relaxations are turned off,
2.  Instruction aliases that will change the encoding from the
    (likely non-alias) instruction with the same name are disabled
    (e.g. "addi" will never turn into "c.addi" even if optimizable) and
3.  Assembler relaxation of branch instructions are disabled
    (e.g. "blt" with a long offset will not turn into "bge + j").

The main purpose of this mode is to emit desired machine code as the
user writes, assuming the user knows constraints of their code.  So,
macros like "li" (known to be expanded into possibly complex sequences)
are not guaranteed to be fully aware of this mode.

Currently, interactions between ".option relax/norelax" and
".option exact/noexact" are designed to be LLVM-compatible (i.e.
".option exact/noexact" imply ".option norelax/relax", respectively)
but considered flaky and strongly discouraged from using both.

cf. <llvm/llvm-project#122483>

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add exact option.
	(RELAX_BRANCH_ENCODE): Encode exact option.
	(RELAX_BRANCH_EXACT): New predicate macro.
	(relaxed_branch_length): Handle exact mode cases.
	(append_insn): Pass exact option to RELAX_BRANCH_ENCODE.
	(riscv_ip): Skip instructions that would change the encoding
	when the exact mode is enabled.
	(s_riscv_option): Parse ".option exact" and ".option noexact"
	assembler directives.
	* doc/c-riscv.texi: Document new assembler directives.
	* testsuite/gas/riscv/exact.s: Test exact mode basics.
	* testsuite/gas/riscv/exact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local.s: Test conditional
	branches and unconditional jumps relative to a local symbol.
	* testsuite/gas/riscv/exact-branch-local-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-ok.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.l: Ditto.
	* testsuite/gas/riscv/exact-branch-extern.s: Test conditional
	branches and unconditional jumps relative to an external symbol.
	* testsuite/gas/riscv/exact-branch-extern-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-extern-exact.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.s: Use exact
	mode to test various configurations and instructions.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.l: Ditto.

include/ChangeLog:

	* opcode/riscv.h (INSN_NON_EXACT): New flag to represent aliases
	to reject on the exact mode.

opcodes/ChangeLog:

	* riscv-opc.c (riscv_opcodes): Add INSN_NON_EXACT flag to all
	instructions that should be rejected on the exact mode.

@MaskRay MaskRay left a comment

LGTM! (I can't approve. I'll leave approval to maintainers)


=== `exact`/`noexact`

In RISC-V, the assembler and linker can do several things to change the code

I feel that the text is overly verbose. Grok suggests a concise one:

In RISC-V, the assembler and linker may modify user code to optimize the final executable:

  • Compression: Converts longer instructions to shorter equivalents, e.g., `lw a0, 16(a1)` to `c.lw a0, 16(a1)` with C or Zca extensions.
  • Linker Relaxation: Replaces long symbol references with shorter sequences (see psABI document).
  • Branch Relaxation: Converts short branches with insufficient range, e.g., `beq a0, a1, sym` to `bne a0, a1, 4; j sym` for longer range.

Programmers may want unmodified assembly, but only linker relaxation can be disabled with `.option norelax`. Compression requires disabling extensions like `.option norvc`, which restricts smaller instructions and affects larger ones.

The `.option exact` prevents assembler or linker modifications, controlling both without altering enabled extensions, allowing flexible instruction lengths.
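
The compression bullet above can be sketched as a small Python model of how an assembler might choose between `lw` and `c.lw`. It is a simplified illustration, not the behaviour of any particular assembler: the function names and the `compressed_enabled` and `exact` parameters are hypothetical, and the `c.lwsp` form for stack-pointer-relative loads is ignored. The `c.lw` constraints (both registers in x8 to x15, offset a multiple of 4 between 0 and 124) come from the compressed instruction encoding.

```python
# Hypothetical model of lw -> c.lw selection; names are illustrative only.
# c.lw can only name registers x8..x15, i.e. s0, s1 and a0..a5.
RVC_REGS = {"s0", "s1", "a0", "a1", "a2", "a3", "a4", "a5"}


def can_use_c_lw(rd, rs1, offset):
    """True if the operands fit the c.lw encoding."""
    return (
        rd in RVC_REGS
        and rs1 in RVC_REGS
        and 0 <= offset <= 124
        and offset % 4 == 0
    )


def select_lw(rd, rs1, offset, compressed_enabled, exact):
    """Return the instruction emitted for `lw rd, offset(rs1)`."""
    if compressed_enabled and not exact and can_use_c_lw(rd, rs1, offset):
        return f"c.lw {rd}, {offset}({rs1})"
    # Exact mode, no C/Zca, or operands that do not fit: keep the 32-bit form.
    return f"lw {rd}, {offset}({rs1})"


print(select_lw("a0", "a1", 16, compressed_enabled=True, exact=False))
print(select_lw("a0", "a1", 16, compressed_enabled=True, exact=True))
```

The point of the `exact` parameter is that the compressed extension stays enabled: a hand-written `c.lw` would still be accepted, while an `lw` written by the user is left alone.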

Contributor

I don't know if RISC-V International takes a view at the moment on using LLM-derived content in its specifications. Probably best not to get into that.

Collaborator

I think the bullet points from Sam's version are clear enough. Maybe Sam can rework and shorten the four paragraphs that follow them into bullet points covering the limitations of the `norelax` and `norvc` options and the flexibility introduced by the `exact` option, to address MaskRay's concern.

a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 17, 2025
This commit adds two assembler directives: ".option exact" and
".option noexact" (enable/disable the exact mode) as discussed in
<riscv-non-isa/riscv-asm-manual#122> and
already implemented in LLVM.

When the exact mode is enabled,

1.  Linker relaxations are turned off,
2.  Instruction aliases that will change the encoding from the
    (likely non-alias) instruction with the same name are disabled
    (e.g. "addi" will never turn into "c.addi" even if optimizable) and
3.  Assembler relaxation of branch instructions are disabled
    (e.g. "blt" with a long offset will not turn into "bge + j").

Macros like "li" (known to be expanded into possibly complex sequences)
may still expand to complex instruction sequences but at least each
instruction emitted by macros is still subject to the behavior above.

Currently, interactions between ".option relax/norelax" and
".option exact/noexact" are designed to be LLVM-compatible (i.e.
".option exact/noexact" imply ".option norelax/relax", respectively)
but considered flaky and strongly discouraged from using both.

cf. <llvm/llvm-project#122483>

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add exact option.
	(RELAX_BRANCH_ENCODE): Encode exact option.
	(RELAX_BRANCH_EXACT): New predicate macro.
	(relaxed_branch_length): Handle exact mode cases.
	(append_insn): Pass exact option to RELAX_BRANCH_ENCODE.
	(riscv_ip): Skip instructions that would change the encoding
	when the exact mode is enabled.
	(s_riscv_option): Parse ".option exact" and ".option noexact"
	assembler directives.
	* doc/c-riscv.texi: Document new assembler directives.
	* testsuite/gas/riscv/exact.s: Test exact mode basics.
	* testsuite/gas/riscv/exact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local.s: Test conditional
	branches and unconditional jumps relative to a local symbol.
	* testsuite/gas/riscv/exact-branch-local-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-ok.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.l: Ditto.
	* testsuite/gas/riscv/exact-branch-extern.s: Test conditional
	branches and unconditional jumps relative to an external symbol.
	* testsuite/gas/riscv/exact-branch-extern-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-extern-exact.d: Ditto.
	* testsuite/gas/riscv/li32.s: Enable exact mode by external option.
	* testsuite/gas/riscv/li64.s: Likewise.
	* testsuite/gas/riscv/exact-li32.d: li32.d but enable exact mode
    to make sure that no automatic instruction compression occurs.
	* testsuite/gas/riscv/exact-li64.d: Likewise.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.s: Use exact
	mode to test various configurations and instructions.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.l: Ditto.

include/ChangeLog:

	* opcode/riscv.h (INSN_NON_EXACT): New flag to represent aliases
	to reject on the exact mode.

opcodes/ChangeLog:

	* riscv-opc.c (riscv_opcodes): Add INSN_NON_EXACT flag to all
	instructions that should be rejected on the exact mode.
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 26, 2025
a4lg added a commit to a4lg/binutils-gdb that referenced this pull request May 29, 2025
@a4lg

a4lg commented May 29, 2025

@Nelson1225 @kito-cheng
Ping!

The patch set implementing these assembler directives in GNU Binutils (PATCH v8, which contains only a grammar fix compared to PATCH v7) is out. Since I received no response for nearly two weeks, I thought I'd better send a ping.

  1. PATCH v8+PING 0/1
  2. PATCH v8+PING 1/1

directly writing the smaller instructions, and will have further issues with
larger-than-32-bit instructions that compress to instructions in the base
architecture.
`.option exact` can be seen as a version of `.option relax` which also affects

I prefer .option norelax rather than .option relax. Otherwise looks fine.

Contributor Author

I agree this would be better. Done.

a4lg added a commit to a4lg/binutils-gdb that referenced this pull request Jul 12, 2025
This commit adds two assembler directives: ".option exact" and
".option noexact" (enable/disable the exact mode) as discussed in
<riscv-non-isa/riscv-asm-manual#122> and
already implemented in LLVM.

When the exact mode is enabled,

1.  Linker relaxations are turned off,
2.  Instruction aliases that will change the encoding from the
    (likely non-alias) instruction with the same name are disabled
    (e.g. "addi" will never turn into "c.addi" even if optimizable) and
3.  Assembler relaxation of branch instructions are disabled
    (e.g. "blt" with a long offset will not turn into "bge + j").

Macros like "li" (known to be expanded into possibly complex sequences)
may still expand to complex instruction sequences but at least each
instruction emitted by macros is still subject to the behavior above.

Currently, interactions between ".option relax/norelax" and
".option exact/noexact" are designed to be LLVM-compatible (i.e.
".option exact/noexact" imply ".option norelax/relax", respectively)
but considered flaky and strongly discouraged from using both.

cf. <llvm/llvm-project#122483>

gas/ChangeLog:

	* config/tc-riscv.c (struct riscv_set_options): Add exact option.
	(RELAX_BRANCH_ENCODE): Encode exact option.
	(RELAX_BRANCH_LENGTH): Reflect RELAX_BRANCH_ENCODE changes.
	(RELAX_BRANCH_EXACT): New predicate macro.
	(relaxed_branch_length): Handle exact mode cases.
	(append_insn): Pass exact option to RELAX_BRANCH_ENCODE.
	(riscv_ip): Skip instructions that would change the encoding
	when the exact mode is enabled.
	(s_riscv_option): Parse ".option exact" and ".option noexact"
	assembler directives.
	* doc/c-riscv.texi: Document new assembler directives.
	* testsuite/gas/riscv/exact.s: Test exact mode basics.
	* testsuite/gas/riscv/exact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local.s: Test conditional
	branches and unconditional jumps relative to a local symbol.
	* testsuite/gas/riscv/exact-branch-local-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-ok.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.d: Ditto.
	* testsuite/gas/riscv/exact-branch-local-exact-fail.l: Ditto.
	* testsuite/gas/riscv/exact-branch-extern.s: Test conditional
	branches and unconditional jumps relative to an external symbol.
	* testsuite/gas/riscv/exact-branch-extern-noexact.d: Ditto.
	* testsuite/gas/riscv/exact-branch-extern-exact.d: Ditto.
	* testsuite/gas/riscv/li32.s: Enable exact mode by external option.
	* testsuite/gas/riscv/li64.s: Likewise.
	* testsuite/gas/riscv/exact-li32.d: li32.d but enable exact mode
    to make sure that no automatic instruction compression occurs.
	* testsuite/gas/riscv/exact-li64.d: Likewise.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.s: Use exact
	mode to test various configurations and instructions.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.d: Ditto.
	* testsuite/gas/riscv/no-relax-branch-offset-fail.l: Ditto.

include/ChangeLog:

	* opcode/riscv.h (INSN_NON_EXACT): New flag to represent aliases
	to reject on the exact mode.

opcodes/ChangeLog:

	* riscv-opc.c (riscv_opcodes): Add INSN_NON_EXACT flag to all
	instructions that should be rejected on the exact mode.
@apazos
Collaborator

apazos commented Aug 27, 2025

The LLVM and GAS patches have been merged and the text has been updated. I think this is ready to be merged; @kito-cheng and @lenary, please confirm.

@a4lg

a4lg commented Aug 28, 2025

@apazos No, GAS approval has stalled somehow. Besides, I have been busy with non-RISC-V things; I think I need to talk to the people involved from my side.

@apazos
Collaborator

apazos commented Aug 28, 2025

Thanks for the update, @a4lg.

@a4lg

a4lg commented Aug 28, 2025

Current status on GAS

There's tentative approval from Nelson Chu (@Nelson1225), provided there's no regression. Since this area is also maintained by Jan Beulich, I'm asking whether he's okay with the changes. However, Jan has not responded yet, for some reason.

My Planned Actions for This Week and Next Week

  1. I will email Jan directly to make sure there are no technical e-mail problems and to ask for his opinion.
  2. I will talk to Nelson (again) to ask whether his tentative approval alone is sufficient to make progress (merge things) once no regression is confirmed on riscv-gnu-toolchain.
