Skip to content

Avocado instrumented timeout handling in thread#6060

Merged
clebergnu merged 2 commits intoavocado-framework:masterfrom
richtja:timeout_handling
Jan 27, 2025
Merged

Avocado instrumented timeout handling in thread#6060
clebergnu merged 2 commits intoavocado-framework:masterfrom
richtja:timeout_handling

Conversation

@richtja
Copy link
Contributor

@richtja richtja commented Oct 31, 2024

This commit changes the way how avocado instrumented runner handles test timeouts. For timeout handling, we used to use signals and raising TestInterrupt error. Such solution has an issue that we can't control where in the code the Error will be raised, and it can be handled before it reaches the runner layer. More info about this issue in #6046.

This change removes the signal handling and uses threading instead. Now each test method will be run in a separated thread and this thread will be terminated if timeout is reached. This solution selves the raising error issue and keeps the current test lifecycle untouched.

Reference: #6046

@richtja richtja added the bug label Oct 31, 2024
@richtja richtja added this to the 109 - Codename TBD milestone Oct 31, 2024
@richtja richtja self-assigned this Oct 31, 2024
@richtja richtja linked an issue Oct 31, 2024 that may be closed by this pull request
@codecov
Copy link

codecov bot commented Oct 31, 2024

Codecov Report

Attention: Patch coverage is 95.23810% with 1 line in your changes missing coverage. Please review.

Project coverage is 67.14%. Comparing base (670b97d) to head (1672f97).
Report is 124 commits behind head on master.

Files with missing lines Patch % Lines
avocado/core/test.py 93.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #6060      +/-   ##
==========================================
- Coverage   68.66%   67.14%   -1.53%     
==========================================
  Files         203      203              
  Lines       21970    21973       +3     
==========================================
- Hits        15085    14753     -332     
- Misses       6885     7220     +335     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@clebergnu
Copy link
Contributor

It would be interesting to run this with external test suites, such as the tests on QEMU or sosreport. I can attempt that as part of my review.

@richtja
Copy link
Contributor Author

richtja commented Jan 8, 2025

@clebergnu ping

Copy link
Contributor

@clebergnu clebergnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @richtja,

The changes look to be very effective. I've tested locally, and couldn't find any issues. Given that we are at the start of a new sprint, let's get this merged!

Please rebase it so that all CI checks pass (FYI, they all pass on my system).

@clebergnu
Copy link
Contributor

Just for future reference, I've tested this with QEMU tests with both a job timeout:

$ avocado run --job-timeout=10 -- tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg
JOB ID     : b3aeee9c50825ed06f84ab4e02c0815b985f3c0c
JOB LOG    : /home/cleber/avocado/job-results/job-2025-01-20T10.16-b3aeee9/job.log
 (1/1) tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg: STARTED
 (1/1) tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg: INTERRUPTED: Test interrupted: Timeout reached (9.61 s)

RESULTS    : PASS 0 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 1 | CANCEL 0
JOB HTML   : /home/cleber/avocado/job-results/job-2025-01-20T10.16-b3aeee9/results.html
JOB TIME   : 11.21 s

Test summary:
1-tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg: INTERRUPTED

And a test timeout

$ avocado run -- tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg
JOB ID     : 2ede91852056c1fa81258621df9e5325f1dc7e3b
JOB LOG    : /home/cleber/avocado/job-results/job-2025-01-20T10.17-2ede918/job.log
 (1/1) tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg: STARTED
 (1/1) tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg: INTERRUPTED: Test interrupted: Timeout reached (10.06 s)
RESULTS    : PASS 0 | ERROR 0 | FAIL 0 | SKIP 0 | WARN 0 | INTERRUPT 1 | CANCEL 0
JOB HTML   : /home/cleber/avocado/job-results/job-2025-01-20T10.17-2ede918/results.html
JOB TIME   : 11.59 s

Test summary:
1-tests/avocado/boot_linux.py:BootLinuxX8664.test_pc_i440fx_tcg: INTERRUPTED

@richtja richtja force-pushed the timeout_handling branch 2 times, most recently from d7c0b96 to 26503d5 Compare January 22, 2025 09:19
@richtja richtja marked this pull request as draft January 22, 2025 09:19
This commit changes the way how avocado instrumented runner handles test
timeouts.  For timeout handling, we used to use signals and raising
TestInterrupt error. Such solution has an issue that we can't control
where in the code the Error will be raised, and it can be handled before
it reaches the runner layer. More info about this issue in avocado-framework#6046.

This change removes the signal handling and uses threading instead. Now
each test method will be run in a separated thread and this thread will
be terminated if timeout is reached. This solution selves the raising
error issue and keeps the current test lifecycle untouched.

Reference: avocado-framework#6046
Signed-off-by: Jan Richter <jarichte@redhat.com>
When running the matplotlib test with default configuration, we get:

"UserWarning: Starting a Matplotlib GUI outside of the main thread will
likely fail."

This issue can cause freeze of this test on macOS.

The GUI is not necessary for purpose of this test, therefore we can use
non-interactive backend which will suite well for logging example.

Signed-off-by: Jan Richter <jarichte@redhat.com>
@richtja richtja marked this pull request as ready for review January 23, 2025 10:17
@richtja richtja requested a review from clebergnu January 23, 2025 10:17
@richtja
Copy link
Contributor Author

richtja commented Jan 23, 2025

Hi @clebergnu, thank you for your review. All the CI checks should be passing now. I had to update matplotlib example because it had issues with the new solution on macOS. Please take a look.

Copy link
Contributor

@clebergnu clebergnu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@clebergnu clebergnu merged commit 2729553 into avocado-framework:master Jan 27, 2025
62 checks passed
@richtja richtja deleted the timeout_handling branch February 13, 2025 10:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

Archived in project

Development

Successfully merging this pull request may close these issues.

Exception handling causes test timeouts issues

2 participants