-
Notifications
You must be signed in to change notification settings - Fork 60
tests: Fix NVMe device timing issues with intelligent polling mechanism #1124
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
tests: Fix NVMe device timing issues with intelligent polling mechanism #1124
Conversation
tests/utils.py
Outdated
| decoded = decoder.decode(out) | ||
| if not decoded or 'Devices' not in decoded: | ||
| return [] | ||
| # Poll for devices with exponential backoff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, but this so unnecessarily overengineered, especially for tests. Simple while with few re-tries and time.sleep(1) is enough.
tests/utils.py
Outdated
| if ret != 0: | ||
| return [] | ||
|
|
||
| try: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this is out of scope for this change, but if you are already changing these function, it would be worth removing the code duplication here.
|
The best approach is to wait for a |
b19969e to
951643e
Compare
I have made the changes according to your suggestions. Please help review it. Thank you. |
951643e to
2cc8e63
Compare
tbzatek
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Apart from @vojtechtrefny's suggestions the logic looks good to me. There are multiple possible points of failure, but let's see if it helps.
tests/utils.py
Outdated
| pass | ||
|
|
||
| time.sleep(wait_time) | ||
| wait_time = min(wait_time * 1.2, max_wait) # Slower exponential backoff |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My original comment about this being unnecessarily complicated is still valid here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I will update and submit a new version of the code based on your comments.
tests/utils.py
Outdated
| wait_time = min(wait_time * 1.2, max_wait) # Slower exponential backoff | ||
|
|
||
| os.system("udevadm settle") | ||
| return False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why this function returns boolean? It's not checked anywhere where it is called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, there is no need to return a boolean type.
Fixes storaged-project#1080 - Remove complex exponential backoff in favor of simple 1-second sleep - Remove unnecessary boolean return value - Keep subsystem NQN filtering for precise controller monitoring - Use reasonable 3-second timeout for test efficiency
2cc8e63 to
0b42cbd
Compare
|
Jenkins, ok to test. |
Fixes #1080
This PR addresses the timing issue where NVMe namespace devices may not be immediately available after controller creation, causing intermittent test failures.
Problem
The current implementation performs immediate device lookup without considering that NVMe devices may need time to appear in the system after target creation and connection.
Solution
Instead of the simple
sleep(1)approach proposed in #1081, this implements an intelligent polling mechanism with exponential backoff:Key Features
udevadm settleto ensure proper device initialization