[lldb] Support expectedFlakey decorator in dotest #129817
Conversation
The dotest framework had an existing decorator to mark flakey tests. It was not implemented and would simply run the underlying test. This commit modifies the decorator to run the test multiple times in the case of failure and to raise the test failure only when all attempts fail.
@llvm/pr-subscribers-lldb

Author: David Peixotto (dmpots)

Changes

The dotest framework had an existing decorator to mark flakey tests. It was not implemented and would simply run the underlying test. This commit modifies the decorator to run the test multiple times in the case of failure and to raise the test failure only when all attempts fail.

Full diff: https://github.com/llvm/llvm-project/pull/129817.diff

1 file affected:
diff --git a/lldb/packages/Python/lldbsuite/test/decorators.py b/lldb/packages/Python/lldbsuite/test/decorators.py
index c48c0b2f77944..c4de3f8f10751 100644
--- a/lldb/packages/Python/lldbsuite/test/decorators.py
+++ b/lldb/packages/Python/lldbsuite/test/decorators.py
@@ -3,6 +3,7 @@
from packaging import version
import ctypes
import locale
+import logging
import os
import platform
import re
@@ -525,12 +526,24 @@ def expectedFailureWindows(bugnumber=None):
return expectedFailureOS(["windows"], bugnumber)
-# TODO: This decorator does not do anything. Remove it.
-def expectedFlakey(expected_fn, bugnumber=None):
+# This decorator can be used to mark a test that can fail non-deterministically.
+# The decorator causes the underlying test to be re-run if it encounters a
+# failure. After `num_retries` attempts the test will be marked as a failure
+# if it has not yet passed.
+def expectedFlakey(expected_fn, bugnumber=None, num_retries=3):
def expectedFailure_impl(func):
@wraps(func)
def wrapper(*args, **kwargs):
- func(*args, **kwargs)
+ for i in range(1, num_retries+1):
+ try:
+ return func(*args, **kwargs)
+ except Exception:
+ logging.warning(
+ f"expectedFlakey: test {func} failed attempt ({i}/{num_retries})"
+ )
+ # If the last attempt fails then re-raise the original failure.
+ if i == num_retries:
+ raise
return wrapper
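
For illustration, here is a minimal sketch of how a dotest-style test might opt into the updated decorator. The test class, test name, and retry count below are hypothetical and only show the intended usage; note that the `expected_fn` argument is not consulted by the retry logic in this patch.

```python
# Hypothetical usage sketch (not part of this patch or the LLDB test suite).
from lldbsuite.test import decorators
from lldbsuite.test.lldbtest import TestBase


class FlakeyExampleTestCase(TestBase):
    # Re-run the test on failure; after num_retries attempts the original
    # exception is re-raised and the test is reported as failed.
    @decorators.expectedFlakey(lambda: True, num_retries=5)
    def test_sometimes_races(self):
        # A check that occasionally fails for timing reasons would go here.
        self.assertTrue(True)
```

With the patch applied, each failed attempt logs a warning via `logging.warning`, and the original exception is re-raised only after the final attempt.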
✅ With the latest revision this PR passed the Python code formatter.
I personally don't think we should have such a decorator: tests shouldn't be flakey. In my experience, when tests are flakey, they are almost always either poor tests or actual bugs. I'm worried that a decorator like this makes it easy to sweep these issues under the rug. That's not to say that things cannot be flakey sometimes: because of how we test the debugger, we depend on a lot of things, many of which are out of our control and can cause a test to fail. But that's different from a specific test being flakey, which is what this decorator would be used for.
Thanks, I appreciate your thoughts. For some context here, I am looking to replace some internal scripts that handle failures by re-running tests. I thought we might be able to leverage the built-in features of dotest to handle some of this. Let me collect some more data to see how much/what kind of flakiness we have. Do you have any suggestions on how we should handle the "expected" flakiness that comes from how we test the debugger? Do you think this is something we should try to solve as part of the lldb testing framework?
libcxx also has something that re-runs tests, I think; worth having a look there. I don't think it's a built-in llvm-lit feature though. If the current decorator does not change anything, I wonder if we should be converting the existing uses back to normal. Unless someone out there is monkey-patching it, these tests would have been flakey for them. ...or they have done what you did and built their own thing, not realising that the decorator was supposed to do that in the first place.
Super basic tip, in case you didn't notice already: Linaro's experience is that running LLDB testing on a shared machine is a bad idea. For AArch64 we run on a dedicated machine and that has worked well. It's not very large, it's just free of distractions. Edit: the same machine with dedicated core allocations also works. For Arm we are on a shared server with much more resource than we need for LLDB, but we can't always get it because of other workers. So we set a parallelism limit way below the actual core count so that lit doesn't overestimate what it can run. This leaves us with the actually flakey tests that ideally we'd find proper fixes for, as Jonas said.
+1 on everything @DavidSpickett said. We're doing pretty much the same thing for our CI.
Thanks for all the suggestions. I'll close the PR and try to make some progress internally on this. |