-
Notifications
You must be signed in to change notification settings - Fork 164
Fix catastrophic regex backtracking in shell framework #4658
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
c305a8a
17c2851
45bd1a8
493d9a7
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,120 @@ | ||
| import time | ||
| from unittest.mock import MagicMock | ||
|
|
||
| import pytest | ||
|
|
||
| import tmt.utils | ||
| from tmt.frameworks.shell import _extract_failures | ||
| from tmt.utils import Path | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def make_invocation(): | ||
| """Factory fixture creating a mock TestInvocation returning given log content.""" | ||
|
|
||
| def _factory(log_content: str) -> MagicMock: | ||
| mock = MagicMock() | ||
| mock.phase.step.plan.execute.read.return_value = log_content | ||
| return mock | ||
|
|
||
| return _factory | ||
|
|
||
|
|
||
| _FAILURE_MATCH_CASES: list[tuple[str, str, list[str]]] = [ | ||
| ('no failures', 'all good\nnothing wrong here\n', []), | ||
| ('error keyword', 'line1\nsome error occurred\nline3\n', ['some error occurred']), | ||
| ('fail keyword', 'test passed\ntest fail here\ndone\n', ['test fail here']), | ||
| ('case insensitive', 'ERROR: something\nFAIL: test\n', ['ERROR: something', 'FAIL: test']), | ||
| ( | ||
| 'multiple matches', | ||
| 'ok\nerror one\npass\nfail two\nerror three\n', | ||
| ['error one', 'fail two', 'error three'], | ||
| ), | ||
| ('word boundary - no false positives', 'errorless operation\nfailover complete\n', []), | ||
| ] | ||
|
|
||
|
|
||
| @pytest.mark.parametrize( | ||
| ('log_content', 'expected'), | ||
| [(log, expected) for _, log, expected in _FAILURE_MATCH_CASES], | ||
| ids=[name for name, _, _ in _FAILURE_MATCH_CASES], | ||
| ) | ||
| def test_extract_failures( | ||
| make_invocation, | ||
| log_content: str, | ||
| expected: list[str], | ||
| ) -> None: | ||
| """Verify _extract_failures matches the correct lines.""" | ||
| invocation = make_invocation(log_content) | ||
| assert _extract_failures(invocation, Path('dummy.log')) == expected | ||
|
|
||
|
|
||
| def test_extract_failures_file_error() -> None: | ||
| """Verify _extract_failures returns empty list when the log file cannot be read.""" | ||
| invocation = MagicMock() | ||
| invocation.phase.step.plan.execute.read.side_effect = tmt.utils.FileError('not found') | ||
| assert _extract_failures(invocation, Path('dummy.log')) == [] | ||
|
|
||
|
|
||
| _LONG_LINE_CASES: list[tuple[str, str, list[str]]] = [ | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Please no list of tuple of list of ... Make it structured in a dataclass |
||
| ( | ||
| 'long line without match followed by error line', | ||
| 'start\n{long}\nsome error here\nend\n', | ||
| ['some error here'], | ||
| ), | ||
| ( | ||
| 'error embedded in long line without newline separator', | ||
| 'start\n{long} error in the middle {long}\nend\n', | ||
| ['{long} error in the middle {long}'], | ||
| ), | ||
| ( | ||
| 'long line with word boundary - no false positive', | ||
| 'start\n{long}errorless{long}\nend\n', | ||
| [], | ||
| ), | ||
|
Comment on lines
+60
to
+74
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. This just shows that what we are doing in Why are we catching any word error and concluding that if it is present it must be relevant. It is a very poorman's guess and the question is if anyone is actually relying on these. Would consider adding a deprecation note and see who actually complains and/or cares about them. A 0 second evaluation is faster still
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @LecrisUT sure, but that is not for this issue, I will file a new one |
||
| ] | ||
|
|
||
|
|
||
| @pytest.mark.parametrize( | ||
| ('log_template', 'expected_template'), | ||
| [(log, expected) for _, log, expected in _LONG_LINE_CASES], | ||
| ids=[name for name, _, _ in _LONG_LINE_CASES], | ||
|
Comment on lines
+79
to
+81
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Use |
||
| ) | ||
| def test_extract_failures_long_lines( | ||
| make_invocation, | ||
| log_template: str, | ||
| expected_template: list[str], | ||
| ) -> None: | ||
| """ | ||
| Verify correct handling of very long lines. | ||
|
|
||
| The original implementation used ``re.findall(r'.*\\b(?:error|fail)\\b.*', ...)`` | ||
| which caused catastrophic backtracking on long lines (O(n^2) or worse), | ||
| hanging tmt processes for hours on 1M+ character lines (e.g. base64-encoded | ||
| in-toto attestation payloads in container build logs). | ||
|
|
||
| The current implementation uses ``str.splitlines()`` and per-line | ||
| ``re.search()`` which processes each line in linear time. | ||
| """ | ||
| long_segment = 'A' * 1_000_000 | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Move this construction outside of the test, and make it more lorem-ipsum-like with more word separators and variations that it could randomly hit. |
||
| log_content = log_template.replace('{long}', long_segment) | ||
| expected = [line.replace('{long}', long_segment) for line in expected_template] | ||
|
|
||
| invocation = make_invocation(log_content) | ||
|
|
||
| # time.monotonic() is the correct choice for elapsed time measurement | ||
| # as it is not affected by system clock adjustments. | ||
| start = time.monotonic() | ||
| result = _extract_failures(invocation, Path('dummy.log')) | ||
| elapsed = time.monotonic() - start | ||
LecrisUT marked this conversation as resolved.
Show resolved
Hide resolved
|
||
|
|
||
| assert result == expected | ||
|
|
||
| # The old regex would never complete on lines this long — it took 5+ | ||
| # seconds on just 10k characters. The splitlines approach finishes in | ||
| # well under 1 second. Use 30 seconds as a generous ceiling to avoid | ||
| # flakiness on slow CI while still catching catastrophic backtracking. | ||
|
Comment on lines
+113
to
+116
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't like this. The difference is well within a single order of magnitude. It seems that the test is not catching anything significant.
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Well, here are the measurements that I got using the 50000 length:
What I would propose is documenting this difference explicitly. Choose whatever size you want and run them fully and document explicitly how much they (can) differ. E.g. in this case all of them are expected to be instantaneous in the new case so cutting in the middle, having a threshold of
Running these does highlight one weird quirk between Case 1 and Case 2. I would not have expected that the latter to be that much faster in the old method given the documentation in the test
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I ran the test locally without the code fix (to make sure it fails) and it hung until I did Ctrl+C. Gemini suggested I add |
||
| assert elapsed < 30.0, ( | ||
| f'_extract_failures took {elapsed:.1f}s on a log with a 1M-char line; ' | ||
| f'likely catastrophic regex backtracking' | ||
| ) | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the whole factory? You do not need it, let it provide just the mock object and mock the relevant part in the tests. That would make it more consistent with all other instances