-
Notifications
You must be signed in to change notification settings - Fork 15.3k
[lldb][RISCV][test] make atomic region stepping test more robust #156506
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[lldb][RISCV][test] make atomic region stepping test more robust #156506
Conversation
|
✅ With the latest revision this PR passed the Python code formatter. |
Currently, the tests that check stepping through atomic sequences use a hardcoded step distance, which is unreliable because this distance depends on LLVM's codegeneration. This patch rewrites the test to avoid this disadvantage. The test now checks the opcode of the instruction after the step instead of the step distance.
46ea86d to
d9ede80
Compare
|
@DavidSpickett, could you take a look, please? There are minor changes related to #127505. |
DavidSpickett
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How exactly is it vulnerable?
I think the mnemonic you land on would also be vulnerable to codegen changes as well. Unless you arrange for the step to never leave your inline asm blocks, then you can guarantee what it will be.
Maybe that is what you have done, it's hard to tell just by the diff.
The relocations that clang emits can change the distance of the step. For example, in the review of the main patch I mentioned that the same test has different offsets for dwo and dwarf modes (#127505 (comment)), and without this patch, the test fails in dwo mode. Additionally, I believe it was a poor idea to compute and check the step distance because that is not what we should actually be verifying. In the tests we already know where execution should stop after the step - for example, at a branch instruction - therefore, it is better to check the opcode of the instruction rather than the step distance. The step distance itself is not important and can sometimes be misleading, as in the dwo case where we end up at the correct instruction, but the step distance differs due to different relocations.
Yes, in these tests, execution during the step is inside the inline assembly block, so I hope the mnemonics will be stable. |
|
@llvm/pr-subscribers-lldb @llvm/pr-subscribers-backend-risc-v Author: None (dlav-sc) ChangesCurrently, the tests that check stepping through atomic sequences use a hardcoded step distance, which is unreliable because this distance depends on LLVM's codegeneration. This patch rewrites the test to avoid this disadvantage. The test now checks the opcode of the instruction after the step instead of the step distance. Full diff: https://github.com/llvm/llvm-project/pull/156506.diff 3 Files Affected:
diff --git a/lldb/test/API/riscv/step/TestSoftwareStep.py b/lldb/test/API/riscv/step/TestSoftwareStep.py
index 279c4c1b797e0..0544d2102d0f9 100644
--- a/lldb/test/API/riscv/step/TestSoftwareStep.py
+++ b/lldb/test/API/riscv/step/TestSoftwareStep.py
@@ -26,19 +26,26 @@ def do_sequence_test(self, filename, bkpt_name):
substrs=["stopped", "stop reason = instruction step into"],
)
- pc = cur_thread.GetFrameAtIndex(0).GetPC()
+ # Get the instruction we stopped at
+ pc = cur_thread.GetFrameAtIndex(0).GetPCAddress()
+ inst = target.ReadInstructions(pc, 1).GetInstructionAtIndex(0)
- return pc - entry_pc
+ inst_mnemonic = inst.GetMnemonic(target)
+ inst_operands = inst.GetOperands(target)
+ if not inst_operands:
+ return inst_mnemonic
- @skipIf(archs=no_match("^rv.*"))
+ return f"{inst_mnemonic} {inst_operands}"
+
+ @skipIf(archs=no_match("^riscv.*"))
def test_cas(self):
"""
This test verifies LLDB instruction step handling of a proper lr/sc pair.
"""
- difference = self.do_sequence_test("main", "cas")
- self.assertEqual(difference, 0x1A)
+ instruction = self.do_sequence_test("main", "cas")
+ self.assertEqual(instruction, "nop")
- @skipIf(archs=no_match("^rv.*"))
+ @skipIf(archs=no_match("^riscv.*"))
def test_branch_cas(self):
"""
LLDB cannot predict the actual state of registers within a critical section (i.e., inside an atomic
@@ -51,29 +58,29 @@ def test_branch_cas(self):
test is nearly identical to the previous one, except for the branch condition, which is inverted and
will result in a taken jump.
"""
- difference = self.do_sequence_test("branch", "branch_cas")
- self.assertEqual(difference, 0x1A)
+ instruction = self.do_sequence_test("branch", "branch_cas")
+ self.assertEqual(instruction, "ret")
- @skipIf(archs=no_match("^rv.*"))
+ @skipIf(archs=no_match("^riscv.*"))
def test_incomplete_sequence_without_lr(self):
"""
This test verifies the behavior of a standalone sc instruction without a preceding lr. Since the sc
lacks the required lr pairing, LLDB should treat it as a non-atomic store rather than part of an
atomic sequence.
"""
- difference = self.do_sequence_test(
+ instruction = self.do_sequence_test(
"incomplete_sequence_without_lr", "incomplete_cas"
)
- self.assertEqual(difference, 0x4)
+ self.assertEqual(instruction, "and a5, a2, a4")
- @skipIf(archs=no_match("^rv.*"))
+ @skipIf(archs=no_match("^riscv.*"))
def test_incomplete_sequence_without_sc(self):
"""
This test checks the behavior of a standalone lr instruction without a subsequent sc. Since the lr
lacks its required sc counterpart, LLDB should treat it as a non-atomic load rather than part of an
atomic sequence.
"""
- difference = self.do_sequence_test(
+ instruction = self.do_sequence_test(
"incomplete_sequence_without_sc", "incomplete_cas"
)
- self.assertEqual(difference, 0x4)
+ self.assertEqual(instruction, "and a5, a2, a4")
diff --git a/lldb/test/API/riscv/step/branch.c b/lldb/test/API/riscv/step/branch.c
index 79bf0ca005db1..93d6c51ec75e0 100644
--- a/lldb/test/API/riscv/step/branch.c
+++ b/lldb/test/API/riscv/step/branch.c
@@ -11,6 +11,7 @@ void __attribute__((naked)) branch_cas(int *a, int *b) {
"xor a5, a2, a5\n\t"
"sc.w a5, a1, (a3)\n\t"
"beqz a5, 1b\n\t"
+ "nop\n\t"
"2:\n\t"
"ret\n\t");
}
diff --git a/lldb/test/API/riscv/step/main.c b/lldb/test/API/riscv/step/main.c
index 35e8aee2cae4b..6207954b7e1cb 100644
--- a/lldb/test/API/riscv/step/main.c
+++ b/lldb/test/API/riscv/step/main.c
@@ -1,5 +1,5 @@
void __attribute__((naked)) cas(int *a, int *b) {
- // This atomic sequence implements a copy-and-swap function. This test should
+ // This atomic sequence implements a copy-and-swap function. This test should stop
// at the first instruction, and after step instruction, we should stop at the
// end of the sequence (on the ret instruction).
asm volatile("1:\n\t"
@@ -11,6 +11,7 @@ void __attribute__((naked)) cas(int *a, int *b) {
"xor a5, a2, a5\n\t"
"sc.w a5, a1, (a3)\n\t"
"beqz a5, 1b\n\t"
+ "nop\n\t"
"2:\n\t"
"ret\n\t");
}
|
|
✅ With the latest revision this PR passed the C/C++ code formatter. |
4ece9bb to
aed5258
Compare
Understood.
So please extend the sentence "This patch rewrites the test to avoid this disadvantage." to state that you do this by avoiding checking for any compiler generated locations. |
|
In There is a comment that refers to this but it almost looks like a typo. Please change that to include the operands too and/or add an inline comment on the instruction we expect to stop on. So then if the test ever does fail, we have more chances to realise that I was going to say make it some unique thing like a nop, but this test wants it to be almost an atomic sequence but not quite. So we shouldn't be modifying the rest of the sequence I think. |
| instruction = self.do_sequence_test("main", "cas") | ||
| self.assertEqual(instruction, "nop") | ||
|
|
||
| @skipIf(archs=no_match("^rv.*")) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume this regex was never matching and so the test was always skipped, probably also skipped on RISC-V until you fixed it here?
Istr updating this riscv regex in another test in the same way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I didn't notice that we actually changed the regex for RISCV in #130034. Maybe ^rv.* works somehow too, because I was able to run the tests. Or maybe I didn't have the patch on my branch.
Anyway, after #130034, we can use riscv32 and riscv64. For any RISCV target, I use ^riscv.* - for example, in the software watchpoints tests.
I noticed that lldb/test/API/riscv/break-undecoded/TestBreakpointIllegal.py has the old regex, we should change it at some point.
DavidSpickett
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming the comments are updated, this LGTM. Thanks for explaining the rationale, it does make sense to me now.
One of these days I will take inspiration from this and finally fix these sequences for AArch64.
|
And btw does this mean you're running RISC-V tests semi-regularly? If so, glad to know someone is :) |
Yeah :) We have a close-source fork with ci, where RISCV lldb tests run after each commit. |
I've updated the message
addressed |
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/163/builds/25820 Here is the relevant piece of the build log for the reference |
Currently, the tests that check stepping through atomic sequences use a hardcoded step distance, which is unreliable because this distance depends on LLVM's codegeneration. The relocations that clang emits can change the distance of the step.
Additionally, it was a poor idea to compute and check the step distance because that is not what we should actually be verifying. In the tests we already know where execution should stop after the step - for example, at a branch instruction - therefore, it is better to check the opcode of the instruction rather than the step distance. The step distance itself is not important and can sometimes be misleading.
This patch rewrites the tests, so now they checks the opcode of the instruction after the step instead of the step distance.