Skip to content

Conversation

@theotherjimmy
Copy link
Contributor

Previously, cranelift did not emit memory operations using 20 bit immedates for reg + reg memory modes on s390x. This patch enables that in a type safe manner by exporting a few extractors for 20 bit immediates and changing the argument to the MemArg constructor for reg + reg addressing mode to accept that 20 bit offset instead of the u8 offset.

@theotherjimmy theotherjimmy requested a review from a team as a code owner December 24, 2025 15:30
@theotherjimmy theotherjimmy requested review from abrown and removed request for a team December 24, 2025 15:30
@theotherjimmy theotherjimmy force-pushed the s390x-bxd20 branch 2 times, most recently from 3a04e06 to 72377e6 Compare December 24, 2025 15:52
@github-actions github-actions bot added cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language labels Dec 24, 2025
@github-actions
Copy link

Subscribe to Label Action

cc @cfallin, @fitzgen

Details This issue or pull request has been labeled: "cranelift", "isle"

Thus the following users have been cc'd because of the following labels:

  • cfallin: isle
  • fitzgen: isle

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.

Learn more.

@cfallin
Copy link
Member

cfallin commented Dec 27, 2025

cc @uweigand -- would you mind taking a look?

@uweigand
Copy link
Member

uweigand commented Jan 7, 2026

Hi @theotherjimmy this makes sense in principle, but I have a couple of comments:

  • The bias naming was intended to reflect small offset generated by the back-end to access part of a value (e.g. when accessing a 16-byte value as two 8-byte values, the second value needs to have the address biased by 8); it was not intended to act as a generic displacement. If we want that, we should probably rename the function to reg_plus_reg_plus_off or something.
  • With your code, we're now greedily generating reg+reg+simm20 always, and retroactively fix things if the instruction doesn't have that addressing mode. That is probably the right thing to do in general, but in many cases we already know that the addressing mode isn't available (e.g. vector instructions always only have uimm12 displacements). We'll likely get better code if we leave the addition separately from the beginning in those cases.

@theotherjimmy
Copy link
Contributor Author

Thanks for the review @uweigand

I think I'll rework this to write a new extractor with your suggested name, and migrate to that new extractor where we know the instructions support the extended offset. That should resolve your concerns.

@theotherjimmy theotherjimmy force-pushed the s390x-bxd20 branch 2 times, most recently from e1832b6 to 96ad60a Compare January 12, 2026 19:57
@theotherjimmy
Copy link
Contributor Author

@uweigand Is this more in line with what you expected?

Copy link
Member

@uweigand uweigand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks generally good to me. As a small enhancement it might be interesting to also extend the lower_address_bias version with a memarg_reg_plus_reg_plus_off clause (which would add the bias to the offset); this probably doesn't have much effect on performance but would preserve the symmetry.

A much bigger (future) enhancement could be to be more specific in generating proper addressing modes for the target instruction in the first place:

  • Avoid using the generic RegOffset mode but rather generate explicit code to begin with (e.g. along the lines of the AArch64 amode rules).
  • Use more specific flavors of lower_address (e.g. lower_address_bdaddr12, lower_address_bdaddr20, lower_address_bdxaddr12, lower_address_bdxaddr20 to follow LLWM nomenclature) and update the various rules to use the specific flavors instead of the generic, as appropriate for the target instruction of each rule.

Previously, cranelift did not emit memory operations using
20 bit immedates for reg + reg memory modes. This patch
enables that in a type safe manner by exporting a few
extractors for 20 bit immediates and changing the argument
to the MemArg constructor for reg + reg addressing mode to
accept that 20 bit offset instead of the u8 offset.

Note that the '_bias' suffixed rule is not modified, as it's
primarily used by vector instructions that don't have BXD20
addressing, only BXD12.
Copy link
Member

@uweigand uweigand left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM now, thanks!

Copy link
Member

@cfallin cfallin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Ulrich for the review on this!

@cfallin cfallin added this pull request to the merge queue Jan 13, 2026
Merged via the queue into bytecodealliance:main with commit e82a7ab Jan 13, 2026
45 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cranelift Issues related to the Cranelift code generator isle Related to the ISLE domain-specific language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants