Skip to content

Commit 6c4552c

Browse files
Venkata krishnan Sowrirajancloud-fan
authored andcommitted
[SPARK-27338][CORE] Fix deadlock in UnsafeExternalSorter.SpillableIterator when locking both UnsafeExternalSorter.SpillableIterator and TaskMemoryManager
## What changes were proposed in this pull request? In `UnsafeExternalSorter.SpillableIterator#loadNext()` takes lock on the `UnsafeExternalSorter` and calls `freePage` once the `lastPage` is consumed which needs to take a lock on `TaskMemoryManager`. At the same time, there can be another MemoryConsumer using `UnsafeExternalSorter` as part of sorting can try to `allocatePage` needs to get lock on `TaskMemoryManager` which can cause spill to happen which requires lock on `UnsafeExternalSorter` again causing deadlock. This is a classic deadlock situation happening similar to the SPARK-26265. To fix this, we can move the `freePage` call in `loadNext` outside of `Synchronized` block similar to the fix in SPARK-26265 ## How was this patch tested? Manual tests were being done and will also try to add a test. Closes apache#24265 from venkata91/deadlock-sorter. Authored-by: Venkata krishnan Sowrirajan <[email protected]> Signed-off-by: Wenchen Fan <[email protected]>
1 parent 69dd44a commit 6c4552c

File tree

1 file changed

+23
-11
lines changed

1 file changed

+23
-11
lines changed

core/src/main/java/org/apache/spark/util/collection/unsafe/sort/UnsafeExternalSorter.java

Lines changed: 23 additions & 11 deletions
Original file line numberDiff line numberDiff line change
@@ -573,19 +573,31 @@ public boolean hasNext() {
573573

574574
@Override
575575
public void loadNext() throws IOException {
576-
synchronized (this) {
577-
loaded = true;
578-
if (nextUpstream != null) {
579-
// Just consumed the last record from in memory iterator
580-
if (lastPage != null) {
581-
freePage(lastPage);
582-
lastPage = null;
576+
MemoryBlock pageToFree = null;
577+
try {
578+
synchronized (this) {
579+
loaded = true;
580+
if (nextUpstream != null) {
581+
// Just consumed the last record from in memory iterator
582+
if(lastPage != null) {
583+
// Do not free the page here, while we are locking `SpillableIterator`. The `freePage`
584+
// method locks the `TaskMemoryManager`, and it's a bad idea to lock 2 objects in
585+
// sequence. We may hit dead lock if another thread locks `TaskMemoryManager` and
586+
// `SpillableIterator` in sequence, which may happen in
587+
// `TaskMemoryManager.acquireExecutionMemory`.
588+
pageToFree = lastPage;
589+
lastPage = null;
590+
}
591+
upstream = nextUpstream;
592+
nextUpstream = null;
583593
}
584-
upstream = nextUpstream;
585-
nextUpstream = null;
594+
numRecords--;
595+
upstream.loadNext();
596+
}
597+
} finally {
598+
if (pageToFree != null) {
599+
freePage(pageToFree);
586600
}
587-
numRecords--;
588-
upstream.loadNext();
589601
}
590602
}
591603

0 commit comments

Comments
 (0)