Skip to content

Conversation

pmur
Copy link
Contributor

@pmur pmur commented Sep 23, 2025

This should address the last(?) missing pieces of inline asm for ppc:

  • Explicit VSX register support. ISA 2.06 (POWER7) added a 64x128b register overlay extending the fpr's to 128b, and unifies them with the vmx (altivec) registers. Implementations details within gcc/llvm percolate up, and require using the x template modifier. I have updated the inline asm to implicitly include this for vsx arguments which do not specify it. Support for the gcc codegen backend is still a todo.

  • Implement the preserves_flags option. All ABI's, and all ISAs store their flags in cr, and the carry bit lives inside xer. The other status registers hold sticky bits or control bits which do not affect branch instructions.

There is some interest in the e500 (powerpcspe) port. Architecturally, it has a very different FP ISA, and includes a simd extension called SPR (which is not IBM's cell SPE). Notably, it does not have altivec/fpr/vsx registers. It also has an SPE accumulator register which its ABI marks as volatile, but I am not sure if the compiler uses it.

@rustbot rustbot added A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. labels Sep 23, 2025
@rustbot
Copy link
Collaborator

rustbot commented Sep 23, 2025

r? @jackh726

rustbot has assigned @jackh726.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

@pmur
Copy link
Contributor Author

pmur commented Sep 23, 2025

r? @Amanieu
cc @taiki-e

@rustbot rustbot assigned Amanieu and unassigned jackh726 Sep 23, 2025
@rust-log-analyzer

This comment has been minimized.

@pmur pmur force-pushed the murp/improve-ppc-inline-asm branch from a258f60 to b63ecb2 Compare September 24, 2025 16:41
@rustbot
Copy link
Collaborator

rustbot commented Sep 24, 2025

Some changes occurred in compiler/rustc_codegen_gcc

cc @antoyo, @GuillaumeGomez

@rustbot

This comment has been minimized.

@pmur
Copy link
Contributor Author

pmur commented Sep 24, 2025

For the gcc backend, what is the preferred way to test these changes?

@antoyo
Copy link
Contributor

antoyo commented Sep 24, 2025

For now, we don't have gcc assembly or codegen tests, so I would leave this as is.

@pmur
Copy link
Contributor Author

pmur commented Sep 30, 2025

Ping.

extended_asm.add_clobber("cc");
match asm_arch {
InlineAsmArch::PowerPC | InlineAsmArch::PowerPC64 => {
let clobbers = ["cr0", "cr1", "cr2", "cr3", "cr4", "cr5", "cr6", "cr7", "xer"];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not clear whether this is the approach we want to go with for PPC assembly. Currently we have the ability to explicitly mark cr* and xer as clobbers with out("cr0") _. Is this something that we want to preserve? Do we want these to always be marked as clobbered by default?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The choice here is very conservative. I think ISA is fairly explicit about which instructions clobber crs and/or xer. It's mostly cr0 (gpr . ops), cr6 (vsx/vmx . ops), and xer (the something and carry ops, and some shift ops).

For ergonomics, limiting this to cr0 and xer would capture the most common cases. Likewise if preserve_flags is not used when it should, it's not clobbering callee-save crs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

limiting this to cr0 and xer would capture the most common cases

Agreed. Most of cr* registers are not modified as a side effect of another instruction without explicitly specifying a register name. (Btw, in C/C++, "cc" is mapped to "cr0" (Clang, GCC).)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

most of the FP ops clobber cr1 if you use the . variant.

| PowerPC | `reg_nonzero` | `r[3-12]`, `r[14-28]` | `b` |
| PowerPC | `freg` | `f[0-31]` | `f` |
| PowerPC | `vreg` | `v[0-31]` | `v` |
| PowerPC | `vsreg | `vs[0-63]` | `vs` |
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| PowerPC | `vsreg | `vs[0-63]` | `vs` |
| PowerPC | `vsreg | `vs[0-63]` | `wa` |

@pmur pmur force-pushed the murp/improve-ppc-inline-asm branch from b63ecb2 to 57b0d79 Compare October 7, 2025 15:31
@rustbot
Copy link
Collaborator

rustbot commented Oct 7, 2025

Some changes occurred in src/doc/unstable-book/src/compiler-flags/branch-protection.md

cc @rust-lang/project-exploit-mitigations, @rcvalle

@rustbot rustbot added A-CI Area: Our Github Actions CI A-run-make Area: port run-make Makefiles to rmake.rs A-testsuite Area: The testsuite used to check the correctness of rustc PG-exploit-mitigations Project group: Exploit mitigations T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue. labels Oct 7, 2025
@rustbot

This comment has been minimized.

@cuviper
Copy link
Member

cuviper commented Oct 7, 2025

It seems your rebase dragged a few other commits with it...

@pmur pmur force-pushed the murp/improve-ppc-inline-asm branch from 57b0d79 to 47d8404 Compare October 7, 2025 16:36
@rustbot
Copy link
Collaborator

rustbot commented Oct 7, 2025

This PR was rebased onto a different master commit. Here's a range-diff highlighting what actually changed.

Rebasing is a normal part of keeping PRs up to date, so no action is needed—this note is just to help reviewers.

@taiki-e
Copy link
Member

taiki-e commented Oct 8, 2025

cc @programmerjake

extended_asm.add_clobber("cc");
match asm_arch {
InlineAsmArch::PowerPC | InlineAsmArch::PowerPC64 => {
let clobbers = ["cr0", "xer"];
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should probably include cr1 for the FP . instructions and cr6 for the vector instructions. The rest of the cr fields should need explicit instruction arguments so are much less likely to be forgotten.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using a dotted fp or vector op is almost always by explicit choice. I think in all cases, a non-dotted form available. The dotted fp usage is only useful when checking for fp exceptions.

xer and cr0 are used in common instructions with only dotted forms available, or always clobber xer[ca], subfic andi. srad come to mind.

},
// VSX is a superset of altivec.
Self::vsreg => types! {
vsx: F32, F64, VecI8(16), VecI16(8), VecI32(4), VecI64(2), VecF32(4), VecF64(2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

technically this should also include 128-bit integer/fp types and f16 since there are instructions for operating on those, but we can leave those out for now.

vs63 : v31;
}
// f0-f31 (vsr0-vsr31) and v0-v31 (vsr32-vsr63) do not conflict.
// For more detail, see ISA 3.1, Book I, Section 7.2.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
// For more detail, see ISA 3.1, Book I, Section 7.2.
// For more detail, see Power ISA v3.1C, Book I, Section 7.2.1.1 and 7.2.1.2 on page 486 (522):
// https://files.openpower.foundation/s/9izgC5Rogi5Ywmm

Where supported, VSX is a 64x128b register set which encompasses
both the floating point and vector registers.

In the type tests, xvsqrtdp is used as it is the only two-argument
vsx opcode supported by all targets on llvm. If you need to copy
a vsx register, the preferred way is "xxlor xt, xa, xa".
@pmur pmur force-pushed the murp/improve-ppc-inline-asm branch from 47d8404 to 9d51dd0 Compare October 8, 2025 20:24
@Amanieu
Copy link
Member

Amanieu commented Oct 9, 2025

I'm inclined to only have this cover cr0 for consistency with cc in GCC/Clang. Inline assembly users will already be used to marking xer and other cr* registers as explicit clobbers.

Also we may want to disallow specifying cr0 as an explicit clobber and have the error message point to preserves_flags instead. I would very much prefer not having 2 ways of doing the same thing.

@pmur
Copy link
Contributor Author

pmur commented Oct 9, 2025

Also we may want to disallow specifying cr0 as an explicit clobber and have the error message point to preserves_flags instead. I would very much prefer not having 2 ways of doing the same thing.

I think we want to allow both. As far as I can tell, the cc == cr0 behavior isn't documented, I wasn't aware it did anything on ppc until pointed out earlier.

As for xer, I think it should be included with the flags, but consistency with other toolchains is more important. I'll drop it.

Implemented preserves_flags similar to the `cc` pseudo-register
on gcc and clang.

The gcc inline documentation doesn't actually state what this
does on powerpc* targets, but inspection of the source shows
it is equivalent to condition register field `cr0`.
@pmur pmur force-pushed the murp/improve-ppc-inline-asm branch from 9d51dd0 to 05ad94b Compare October 9, 2025 20:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

A-CI Area: Our Github Actions CI A-LLVM Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues. A-run-make Area: port run-make Makefiles to rmake.rs A-testsuite Area: The testsuite used to check the correctness of rustc PG-exploit-mitigations Project group: Exploit mitigations S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-compiler Relevant to the compiler team, which will review and decide on the PR/issue. T-infra Relevant to the infrastructure team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

9 participants