@dzzz2001
Collaborator

@dzzz2001 dzzz2001 commented Oct 9, 2024

Background

I found that in some examples, when the device is set to gpu and the number of MPI processes is set too high, the program hangs indefinitely. Investigation showed that the deadlock is caused by an implicit barrier in MPI.
Because of the condition max_atom > 0, some processes do not execute the gint_vl_gpu function. However, gint_vl_gpu calls set_device_by_rank, which contains an implicit MPI barrier. The processes that do enter gint_vl_gpu then wait at that barrier for processes that never reach it, which leads to a deadlock. This PR fixes this bug.
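
To illustrate the hang, here is a minimal toy model in plain C++ with no real MPI calls. It assumes only that a collective operation such as a barrier completes when every rank reaches it. The function names and the max_atom_per_rank vector are hypothetical and do not come from the ABACUS code, and the sketch does not show how this PR actually changes gint_vl_gpu or set_device_by_rank.

```cpp
#include <algorithm>
#include <vector>

// Toy model, no real MPI: a collective call (such as a barrier) completes
// only if every rank in the communicator reaches it.
// max_atom_per_rank[i] is a hypothetical value of max_atom on rank i.

// Number of ranks that pass the guard `max_atom > 0`.
inline int ranks_passing_guard(const std::vector<int>& max_atom_per_rank)
{
    return static_cast<int>(
        std::count_if(max_atom_per_rank.begin(), max_atom_per_rank.end(),
                      [](int m) { return m > 0; }));
}

// Models a collective placed inside the guarded branch.
// The program hangs when some ranks enter the branch and others skip it:
// the ranks inside wait for ranks that never arrive.
inline bool hangs_with_guarded_collective(const std::vector<int>& max_atom_per_rank)
{
    const int total = static_cast<int>(max_atom_per_rank.size());
    const int entered = ranks_passing_guard(max_atom_per_rank);
    return entered > 0 && entered < total;
}
```

For example, with four ranks and max_atom values {3, 0, 2, 0}, two ranks enter the guarded branch and two skip it, so the model reports a hang. With {3, 1} or {0, 0} every rank takes the same path and no hang occurs. In general, a collective call has to be reached by all ranks or by none of them.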

@mohanchen mohanchen added the Bugs label (Bugs that only solvable with sufficient knowledge of DFT) Oct 10, 2024
@mohanchen mohanchen merged commit 39df3b9 into deepmodeling:develop Oct 10, 2024
14 checks passed