-
Notifications
You must be signed in to change notification settings - Fork 3k
PPCAnalyst: Split "in use" analysis into reads and writes #14278
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
| gpr.Flush(~(js.op->gprWillBeRead | js.op->gprWillBeWritten) & | ||
| (js.op->regsIn | js.op->regsOut), | ||
| RegCache::FlushMode::Full); | ||
| gpr.Flush(~js.op->gprWillBeWritten & js.op->regsOut, RegCache::FlushMode::Undirty); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I meant on Discord is the following:
if (CheckMergedBranch(0))
{
// If an operand to the cmp/rc op we're merging with the branch isn't used anymore, it'd be
// better to flush it here so that we don't have to flush it on both sides of the branch.
gpr.Flush(~js.op->gprWillBeWritten & (js.op->regsIn | js.op->regsOut), RegCache::FlushMode::Undirty);
if (needs_test)
TEST(32, arg, arg);
arg.Unlock();
// Unset from the register cache here. By emitting above, no instructions is emitted between
// the potential TEST and the conditional branch, allowing macro-op fusion.
gpr.Flush(~(js.op->gprWillBeRead | js.op->gprWillBeWritten) &
(js.op->regsIn | js.op->regsOut),
RegCache::FlushMode::Full);
DoMergedBranchCondition();
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right. But do you agree that my commit makes doing that unnecessary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see it.
For instance:
nand. r1, r2, r3
Assuming that none of the registers are used again: r2 and r3 would have been flushed (with Undirty) already, but r1 wouldn't.
(But mea culpa for the cmpXX comment, which I edited out.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true, for instructions that both store to a regular register and to a CR, your change is useful even if this PR is merged.
I think your change would make sense as a separate PR.
|
Aside from that, it looks fine. |
|
AArch64 has an instruction called STP that stores two registers at once to adjacent memory locations. The JitArm64 register cache makes use of this if you tell it to flush two adjacent dirty registers at once. But with the new PPCAnalyst logic, it's impossible for there to be any dirty registers at an stmw instruction unless the register is going to be written to later, so optimizing the flushing instructions in stmw doesn't matter. The last write to the register can't be the stmw instruction itself, since stmw doesn't write to any registers. And if the last write to the register had been before the stmw instruction, it would have gotten flushed before the stmw instruction. And if the last write to the register is after the stmw instruction, then during the stmw instruction, it's not considered a register we should flush. |
If the last write to a register comes before the last read of it, we can write the register to ppcState after the last write instead of after the last read. This will hopefully help spread out m_ppc_state writes across a code block, improving pipelining. Also, if there's a conditional branch that's after the last write but before the last read, instead of needing to emit one m_ppc_state write on each side of the branch, we now only need to emit one m_ppc_state write. A note about the changes made to stmw and mfcr: These instructions don't write to any GPRs or CRs respectively – they only read from them. With this commit, there are no longer any cases where registers get written back to m_ppc_state after an instruction that just reads from them, so we can get rid of all STP logic from these two instructions. lmw still needs its STP logic, since that one does write to registers.
2e7875c to
40326ee
Compare
If the last write to a register comes before the last read of it, we can write the register to m_ppc_state after the last write instead of after the last read. This will hopefully help spread out m_ppc_state writes across a code block, improving pipelining. Also, if there's a conditional branch that's after the last write but before the last read, instead of needing to emit one m_ppc_state write on each side of the branch, we now only need to emit one m_ppc_state write.
A note about the changes made to stmw and mfcr: These instructions don't write to any GPRs or CRs respectively – they only read from them. With this commit, there are no longer any cases where registers get written back to m_ppc_state after an instruction that just reads from them, so we can get rid of all STP logic from these two instructions. lmw still needs its STP logic, since that one does write to registers.
I've split this out from the JitArm64-centric PR #14189 because this commit is also useful for Jit64.