[OpenMP] Fix race condition in taskgraph with task dependencies #150880
+16
−2
This commit resolves a data race that could occur when taskgraph with dependencies was enabled alongside task stealing. The problem occurred because, during task stealing, the stealing thread could incorrectly add an extra node to the task dependency graph, while the original thread might fail to create the correct node, leading to a corrupted graph and potential execution issues.
EDIT 07/10/2025:
The above message is incorrect. The issue comes from a single thread re-setting a node as the successor of a node that has already been added to the graph, while in the meantime the predecessor task has been executed and its dependencies released. This leaves the successor node waiting on a task that has already executed:
Thread 0: sets Task1 as successor of Task0
Thread 1: executes Task0
Thread 1: releases successors of Task0
Thread 0: re-sets Task1 as successor of Task0
....
// Hangs because Task1 is never executed: Task0 was re-set as its predecessor after Task0 had already executed and released its successors.
Without taskgraph, the nodes are removed once executed; in contrast, when taskgraph is used, the nodes are maintained during and after the execution of the entire region.
kmp_tasking_flags_t::onced is used to mark a task as executed, but it was set incorrectly. This commit solves the issue by setting onced before releasing the dependencies and by checking the same flag when linking (__kmp_depnode_link_successor).
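
The ordering described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not the code from this patch: `DepNode`, `link_successor`, and `finish_and_release` are hypothetical names, a `std::mutex` stands in for whatever locking the runtime actually uses, and the `onced` member merely mimics the role of kmp_tasking_flags_t::onced. The point it shows is that the flag is set before the successors are released, and that linking refuses to add an edge once the flag is set.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <vector>

// Hypothetical, simplified stand-in for a dependency-graph node.
// With taskgraph the node outlives the task, so it must remember
// that its task has already run.
struct DepNode {
  std::mutex lock;                       // assumption: one lock per node
  bool onced = false;                    // models "task already executed"
  std::vector<DepNode *> successors;     // edges kept for the whole region
  std::atomic<int> npredecessors{0};     // unfinished predecessors
};

// Link succ behind pred. Returns true if succ now has to wait for pred.
// Checking onced under the lock prevents re-arming a dependency on a
// predecessor that has already released its successors.
bool link_successor(DepNode &pred, DepNode &succ) {
  std::lock_guard<std::mutex> guard(pred.lock);
  if (pred.onced)
    return false;                        // pred already ran: nothing to wait for
  pred.successors.push_back(&succ);
  succ.npredecessors.fetch_add(1);
  return true;
}

// Called when pred's task has finished executing.
// onced is set BEFORE the successors are released, under the same lock
// that link_successor takes, so a late link attempt sees the flag.
void finish_and_release(DepNode &pred) {
  std::vector<DepNode *> to_release;
  {
    std::lock_guard<std::mutex> guard(pred.lock);
    pred.onced = true;
    to_release = pred.successors;        // the node keeps its edges
  }
  for (DepNode *succ : to_release) {
    int previous = succ->npredecessors.fetch_sub(1);
    assert(previous > 0);                // sanity check: never release twice
    (void)previous;
  }
}
```

In this model, replaying the trace above (link Task1 behind Task0, finish Task0, link Task1 behind Task0 again) leaves Task1 with zero pending predecessors, because the second `link_successor` call observes `onced` and adds no edge. Without the `onced` check, the second call would increment the counter again and Task1 would wait indefinitely.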