Commit 98dc763

LuciferYang authored and dongjoon-hyun committed
[SPARK-50716][CORE] Fix the cleanup logic for symbolic links in JavaUtils.deleteRecursivelyUsingJavaIO method
### What changes were proposed in this pull request?

To fix the cleanup logic for symbolic links in the `JavaUtils.deleteRecursivelyUsingJavaIO` method, this PR makes the following changes:

1. Use `Files.readAttributes(file.toPath(), BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS)` to read the `BasicFileAttributes` of the file. With `LinkOption.NOFOLLOW_LINKS`, the attributes of the symbolic link itself are read, rather than those of the file it points to. This allows us to use `fileAttributes.isSymbolicLink()` to check whether a file is a symbolic link.
2. After the above change, `fileAttributes.isDirectory()` and `fileAttributes.isSymbolicLink()` can no longer be true simultaneously. Therefore, when `fileAttributes.isDirectory()` is true, there is no need to check `!fileAttributes.isSymbolicLink()`.
3. When `fileAttributes.isSymbolicLink()` is true, the symbolic link is now deleted.
4. When `!file.exists()` is true, an additional check for `!fileAttributes.isSymbolicLink()` has been added. This is because `file.exists()` also returns false for a broken symbolic link, but in that case we should still proceed with the cleanup.
5. The previously handwritten `isSymlink` method in `JavaUtils` has been removed, as it is no longer needed after the above changes.

### Why are the changes needed?

Fix the cleanup logic for symbolic links in the `JavaUtils.deleteRecursivelyUsingJavaIO` method.

### Does this PR introduce _any_ user-facing change?

No

### How was this patch tested?

- Pass GitHub Actions
- New test cases have been added
- Check with the existing test suite `PipedRDDSuite`:

Run `build/sbt "core/testOnly org.apache.spark.rdd.PipedRDDSuite"`

Before

```
git status
On branch upmaster
Your branch is up to date with 'upstream/master'.

Untracked files:
  (use "git add <file>..." to include in what will be committed)
	core/tasks/

ls -l core/tasks
total 0
drwxr-xr-x  5 yangjie01  staff  160  1  3 18:15 099f2492-acef-4556-8a34-1318dccf7ad2
drwxr-xr-x  5 yangjie01  staff  160  1  3 18:15 47d46196-2f7b-4c7b-acf3-7e1d26584c12
drwxr-xr-x  5 yangjie01  staff  160  1  3 18:15 5e23fe20-1e3f-49b8-8404-5cd3b1033e37
drwxr-xr-x  5 yangjie01  staff  160  1  3 18:15 a2cbf5a9-3ebf-4332-be87-c9501830750e
drwxr-xr-x  5 yangjie01  staff  160  1  3 18:15 ddf45bf5-d0fa-4970-9094-930f382b675c
drwxr-xr-x  5 yangjie01  staff  160  1  3 18:15 e25fe5ad-a0be-48d0-81f6-605542f447b5

ls -l core/tasks/099f2492-acef-4556-8a34-1318dccf7ad2
total 0
lrwxr-xr-x  1 yangjie01  staff  59  1  3 18:15 benchmarks -> /Users/yangjie01/SourceCode/git/spark-sbt/core/./benchmarks
lrwxr-xr-x  1 yangjie01  staff  52  1  3 18:15 src -> /Users/yangjie01/SourceCode/git/spark-sbt/core/./src
lrwxr-xr-x  1 yangjie01  staff  55  1  3 18:15 target -> /Users/yangjie01/SourceCode/git/spark-sbt/core/./target
```

Symbolic links are left behind after the tests, even though manual cleanup is invoked in the test code: https://github.com/apache/spark/blob/b210f422b0078d535eddc696ebba8d92f67b81fb/core/src/test/scala/org/apache/spark/rdd/PipedRDDSuite.scala#L214-L232

After

```
git status
On branch deleteRecursivelyUsingJavaIO-SymbolicLink
Your branch is up to date with 'origin/deleteRecursivelyUsingJavaIO-SymbolicLink'.

nothing to commit, working tree clean
```

No residual symbolic links are left after the tests.

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #49347 from LuciferYang/deleteRecursivelyUsingJavaIO-SymbolicLink.

Lead-authored-by: yangjie01 <yangjie01@baidu.com>
Co-authored-by: YangJie <yangjie01@baidu.com>
Signed-off-by: Dongjoon Hyun <dongjoon@apache.org>
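
The attribute behavior described in points 1, 2 and 4 of the commit message can be checked outside Spark. The following is a minimal sketch, not code from this commit: the class `SymlinkAttributesDemo` and its `probe` method are hypothetical names, and it assumes a file system on which the process is allowed to create symbolic links.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.Arrays;

// Hypothetical demo class; not part of Spark.
final class SymlinkAttributesDemo {

  /**
   * Returns, in order:
   * [0] isDirectory() of a link to a directory when the link is followed,
   * [1] isDirectory() of the same link with NOFOLLOW_LINKS,
   * [2] isSymbolicLink() of the same link with NOFOLLOW_LINKS,
   * [3] File.exists() of the link after its target was removed,
   * [4] isSymbolicLink() of the now broken link with NOFOLLOW_LINKS.
   */
  static boolean[] probe() {
    try {
      Path tmp = Files.createTempDirectory("symlink-demo");
      Path targetDir = Files.createDirectory(tmp.resolve("targetDir"));
      Path link = Files.createSymbolicLink(tmp.resolve("link"), targetDir);

      // Default behavior: the link is followed, so the attributes describe targetDir.
      BasicFileAttributes followed =
          Files.readAttributes(link, BasicFileAttributes.class);
      // NOFOLLOW_LINKS: the attributes describe the link itself.
      BasicFileAttributes notFollowed =
          Files.readAttributes(link, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);

      // Break the link by deleting its (empty) target directory.
      Files.delete(targetDir);
      boolean brokenExists = link.toFile().exists();
      boolean brokenIsLink =
          Files.readAttributes(link, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS)
              .isSymbolicLink();

      // Clean up: Files.delete removes the link itself, then the temp directory.
      Files.delete(link);
      Files.delete(tmp);

      return new boolean[] {
        followed.isDirectory(),
        notFollowed.isDirectory(),
        notFollowed.isSymbolicLink(),
        brokenExists,
        brokenIsLink
      };
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(probe()));
  }
}
```

With `NOFOLLOW_LINKS`, a link to a directory reports itself as a symbolic link rather than a directory, which is why the `!isSymlink(file)` guard becomes redundant. The last two values show why `file.exists()` alone cannot be used as the early-return condition: it follows the link and reports a broken link as missing.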
1 parent dccb129 commit 98dc763

2 files changed: +45 -16 lines changed


common/utils/src/main/java/org/apache/spark/network/util/JavaUtils.java

Lines changed: 7 additions & 16 deletions
@@ -22,6 +22,7 @@
 import java.nio.channels.ReadableByteChannel;
 import java.nio.charset.StandardCharsets;
 import java.nio.file.Files;
+import java.nio.file.LinkOption;
 import java.nio.file.attribute.BasicFileAttributes;
 import java.util.*;
 import java.util.concurrent.TimeUnit;
@@ -125,10 +126,11 @@ public static void deleteRecursively(File file, FilenameFilter filter) throws IO
   private static void deleteRecursivelyUsingJavaIO(
       File file,
       FilenameFilter filter) throws IOException {
-    if (!file.exists()) return;
     BasicFileAttributes fileAttributes =
-      Files.readAttributes(file.toPath(), BasicFileAttributes.class);
-    if (fileAttributes.isDirectory() && !isSymlink(file)) {
+      Files.readAttributes(file.toPath(), BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
+    // SPARK-50716: If the file does not exist and not a broken symbolic link, return directly.
+    if (!file.exists() && !fileAttributes.isSymbolicLink()) return;
+    if (fileAttributes.isDirectory()) {
       IOException savedIOException = null;
       for (File child : listFilesSafely(file, filter)) {
         try {
@@ -143,8 +145,8 @@ private static void deleteRecursivelyUsingJavaIO(
       }
     }
 
-    // Delete file only when it's a normal file or an empty directory.
-    if (fileAttributes.isRegularFile() ||
+    // Delete file only when it's a normal file, a symbolic link, or an empty directory.
+    if (fileAttributes.isRegularFile() || fileAttributes.isSymbolicLink() ||
       (fileAttributes.isDirectory() && listFilesSafely(file, null).length == 0)) {
       boolean deleted = file.delete();
       // Delete can also fail if the file simply did not exist.
@@ -192,17 +194,6 @@ private static File[] listFilesSafely(File file, FilenameFilter filter) throws I
     }
   }
 
-  private static boolean isSymlink(File file) throws IOException {
-    Objects.requireNonNull(file);
-    File fileInCanonicalDir = null;
-    if (file.getParent() == null) {
-      fileInCanonicalDir = file;
-    } else {
-      fileInCanonicalDir = new File(file.getParentFile().getCanonicalFile(), file.getName());
-    }
-    return !fileInCanonicalDir.getCanonicalFile().equals(fileInCanonicalDir.getAbsoluteFile());
-  }
-
   private static final Map<String, TimeUnit> timeSuffixes;
 
   private static final Map<String, ByteUnit> byteSuffixes;
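
The overall effect of the change above, deleting a link as a leaf entry instead of descending into its target, can be illustrated with a standalone sketch. This is not Spark's implementation: `NoFollowDeleteSketch`, `deleteRecursively` and `demo` are hypothetical names, the sketch works on `java.nio.file.Path` rather than `java.io.File`, it has no filename filter, and it treats a missing path by catching `NoSuchFileException` (which `Files.readAttributes` throws for a path that does not exist), a choice made for this example only.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.LinkOption;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;
import java.nio.file.attribute.BasicFileAttributes;

// Hypothetical sketch; not the code in JavaUtils.
final class NoFollowDeleteSketch {

  /** Deletes path and everything below it without following symbolic links. */
  static void deleteRecursively(Path path) throws IOException {
    BasicFileAttributes attrs;
    try {
      // Attributes of the entry itself; a link to a directory is NOT a directory here.
      attrs = Files.readAttributes(path, BasicFileAttributes.class, LinkOption.NOFOLLOW_LINKS);
    } catch (NoSuchFileException e) {
      return; // Nothing to delete (assumption of this sketch).
    }
    if (attrs.isDirectory()) {
      // Only real directories are descended into.
      try (DirectoryStream<Path> children = Files.newDirectoryStream(path)) {
        for (Path child : children) {
          deleteRecursively(child);
        }
      }
    }
    // Regular files, symbolic links (live or broken) and now-empty directories.
    Files.deleteIfExists(path);
  }

  /** Returns {root was removed, file behind the link was left untouched}. */
  static boolean[] demo() {
    try {
      Path tmp = Files.createTempDirectory("nofollow-delete");
      Path outside = Files.createDirectory(tmp.resolve("outside"));
      Path kept = Files.write(
          outside.resolve("keep.txt"), "data".getBytes(StandardCharsets.UTF_8));

      Path root = Files.createDirectory(tmp.resolve("root"));
      Files.createSymbolicLink(root.resolve("toDir"), outside);
      // Link whose target never existed, i.e. a broken link.
      Files.createSymbolicLink(root.resolve("broken"), tmp.resolve("missing"));

      deleteRecursively(root);
      boolean rootGone = Files.notExists(root, LinkOption.NOFOLLOW_LINKS);
      boolean targetKept = Files.exists(kept);

      deleteRecursively(tmp); // Clean up the rest of the temp tree.
      return new boolean[] {rootGone, targetKept};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    boolean[] r = demo();
    System.out.println("root removed: " + r[0] + ", link target kept: " + r[1]);
  }
}
```

Because the link is never followed, deleting `root` removes both links but leaves `outside/keep.txt` in place, which matches the intent of the `PipedRDDSuite` cleanup described in the commit message.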

core/src/test/scala/org/apache/spark/util/UtilsSuite.scala

Lines changed: 38 additions & 0 deletions
@@ -22,6 +22,7 @@ import java.lang.reflect.Field
 import java.net.{BindException, ServerSocket, URI}
 import java.nio.{ByteBuffer, ByteOrder}
 import java.nio.charset.StandardCharsets.UTF_8
+import java.nio.file.{Files => JFiles}
 import java.text.DecimalFormatSymbols
 import java.util.Locale
 import java.util.concurrent.TimeUnit
@@ -731,6 +732,43 @@ class UtilsSuite extends SparkFunSuite with ResetSystemProperties {
     assert(!sourceFile2.exists())
   }
 
+  test("SPARK-50716: deleteRecursively - SymbolicLink To File") {
+    val tempDir = Utils.createTempDir()
+    val sourceFile = new File(tempDir, "foo.txt")
+    JFiles.write(sourceFile.toPath, "Some content".getBytes)
+    assert(sourceFile.exists())
+
+    val symlinkFile = new File(tempDir, "bar.txt")
+    JFiles.createSymbolicLink(symlinkFile.toPath, sourceFile.toPath)
+
+    // Check that the symlink was created successfully
+    assert(JFiles.isSymbolicLink(symlinkFile.toPath))
+    Utils.deleteRecursively(tempDir)
+
+    // Verify that everything is deleted
+    assert(!tempDir.exists)
+  }
+
+  test("SPARK-50716: deleteRecursively - SymbolicLink To Dir") {
+    val tempDir = Utils.createTempDir()
+    val sourceDir = new File(tempDir, "sourceDir")
+    assert(sourceDir.mkdir())
+    val sourceFile = new File(sourceDir, "file.txt")
+    JFiles.write(sourceFile.toPath, "Some content".getBytes)
+
+    val symlinkDir = new File(tempDir, "targetDir")
+    JFiles.createSymbolicLink(symlinkDir.toPath, sourceDir.toPath)
+
+    // Check that the symlink was created successfully
+    assert(JFiles.isSymbolicLink(symlinkDir.toPath))
+
+    // Now delete recursively
+    Utils.deleteRecursively(tempDir)
+
+    // Verify that everything is deleted
+    assert(!tempDir.exists)
+  }
+
   test("loading properties from file") {
     withTempDir { tmpDir =>
       val outFile = File.createTempFile("test-load-spark-properties", "test", tmpDir)
