hjelmn (Member) commented Feb 27, 2017

Under heavy load the locking code could fail if the underlying btl
module started to return OPAL_ERR_OUT_OF_RESOURCE on atomic
operations. This commit updates the code to gracefully handle btl
errors.

Signed-off-by: Nathan Hjelm <[email protected]>
(cherry picked from commit 4707c7c)
Signed-off-by: Nathan Hjelm <[email protected]>
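
For readers unfamiliar with the failure mode, here is a minimal sketch of one way to handle a transient out-of-resource error from an atomic operation: retry the operation and drive progress between attempts so the lower layer can free resources. This is not the code from this commit. Every name below (`demo_btl_t`, `demo_atomic_fadd`, `demo_progress`, `DEMO_ERR_OUT_OF_RESOURCE`) is a hypothetical stand-in, and the real btl interface and the actual fix may differ.

```c
/* Hypothetical stand-ins for illustration only; these are NOT the real
 * Open MPI btl types, functions, or error codes. */
enum { DEMO_SUCCESS = 0, DEMO_ERR_OUT_OF_RESOURCE = -2 };

typedef struct {
    long value;      /* the "remote" counter targeted by the atomic */
    int  busy_calls; /* how many upcoming calls will report exhaustion */
} demo_btl_t;

/* Counts how often the retry loop had to drive progress. */
static int demo_progress_calls = 0;

/* Stand-in for a progress function that lets pending operations complete. */
static void demo_progress(void)
{
    ++demo_progress_calls;
}

/* Mock fetch-and-add: fails with an out-of-resource error while the mock
 * module is "busy", then performs the operation. */
static int demo_atomic_fadd(demo_btl_t *btl, long operand, long *result)
{
    if (btl->busy_calls > 0) {
        --btl->busy_calls;
        return DEMO_ERR_OUT_OF_RESOURCE;
    }
    *result = btl->value;
    btl->value += operand;
    return DEMO_SUCCESS;
}

/* Retry wrapper: treat out-of-resource as transient, make progress, and try
 * again. Any other return code is handed back to the caller unchanged. */
static int demo_atomic_fadd_retry(demo_btl_t *btl, long operand, long *result)
{
    int rc;

    do {
        rc = demo_atomic_fadd(btl, operand, result);
        if (DEMO_ERR_OUT_OF_RESOURCE == rc) {
            demo_progress();
        }
    } while (DEMO_ERR_OUT_OF_RESOURCE == rc);

    return rc;
}
```

In this sketch a caller would use `demo_atomic_fadd_retry` wherever it previously assumed the atomic could not fail for lack of resources. A production version would likely also bound the number of retries or propagate the error after a timeout.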

@hjelmn hjelmn added the bug label Feb 27, 2017
@hjelmn hjelmn added this to the v2.1.0 milestone Feb 27, 2017
@hjelmn hjelmn requested a review from regrant February 27, 2017 15:53
hjelmn (Member, Author) commented Feb 27, 2017

@jsquyres Found this bug during stress testing.

@jsquyres jsquyres modified the milestones: v2.1.0, v2.1.1 Feb 27, 2017
regrant (Contributor) left a comment

This code looks fine, and solves the stated problem.

Signed-off-by: Nathan Hjelm <[email protected]>
(cherry picked from commit 032bcf9)
Signed-off-by: Nathan Hjelm <[email protected]>
hjelmn (Member, Author) commented Mar 1, 2017

@jsquyres Added the commit to clean up the warnings. Good to go now.

@hppritcha hppritcha merged commit 88e139f into open-mpi:v2.x Mar 31, 2017
@artpol84 artpol84 mentioned this pull request Apr 1, 2017