
Commit c3f285c

wangyum authored and Marcelo Vanzin committed
[SPARK-24149][YARN][FOLLOW-UP] Only get the delegation tokens of the filesystem explicitly specified by the user
## What changes were proposed in this pull request?

Our HDFS cluster is configured with 5 nameservices: `nameservices1`, `nameservices2`, `nameservices3`, `nameservices-dev1` and `nameservices4`, but `nameservices-dev1` is unstable. So an error sometimes occurs and causes the entire job to fail since [SPARK-24149](https://issues.apache.org/jira/browse/SPARK-24149):

![image](https://user-images.githubusercontent.com/5399861/42434779-f10c48fc-8386-11e8-98b0-4d9786014744.png)

I think it's best to add a switch here.

## How was this patch tested?

Manual tests.

Closes apache#21734 from wangyum/SPARK-24149.

Authored-by: Yuming Wang <[email protected]>
Signed-off-by: Marcelo Vanzin <[email protected]>
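The selection logic this commit introduces can be summarized in a small, self-contained sketch. Note this is a simplification for illustration only: the real code operates on Hadoop `FileSystem`/`Path` objects and reads `FILESYSTEMS_TO_ACCESS`, `STAGING_DIR` and `dfs.nameservices`; here those inputs are modeled as plain strings, and `TokenFilesystemSelection`/`filesystemsForTokens` are hypothetical names.

```scala
// Simplified sketch of the new delegation-token filesystem selection:
// if the user explicitly lists filesystems, only those (plus the staging
// filesystem) get tokens; federation nameservices are enumerated only
// when the list is empty and the staging filesystem is not ViewFS.
object TokenFilesystemSelection {
  def filesystemsForTokens(
      explicitlyRequested: Seq[String],      // models FILESYSTEMS_TO_ACCESS
      stagingFs: String,                     // models the staging-dir filesystem
      stagingIsViewFs: Boolean,              // models stagingFS.getScheme == "viewfs"
      federationNameservices: Seq[String]): Set[String] = {
    // New in this commit: an explicit user list disables the "fetch
    // tokens for every nameservice" behavior from SPARK-24149.
    val requestAllDelegationTokens = explicitlyRequested.isEmpty
    val hadoopFilesystems =
      if (!requestAllDelegationTokens || stagingIsViewFs) explicitlyRequested.toSet
      else federationNameservices.toSet
    // The staging filesystem always needs a token.
    hadoopFilesystems + stagingFs
  }
}
```

With this, a cluster whose `nameservices-dev1` is flaky can simply list the stable nameservices explicitly and the unstable one is never contacted.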
1 parent 810d59c commit c3f285c

File tree

1 file changed (+5, -9 lines)


resource-managers/yarn/src/main/scala/org/apache/spark/deploy/yarn/YarnSparkHadoopUtil.scala

Lines changed: 5 additions & 9 deletions
```diff
@@ -27,11 +27,8 @@ import org.apache.hadoop.yarn.api.ApplicationConstants
 import org.apache.hadoop.yarn.api.records.{ApplicationAccessType, ContainerId, Priority}
 import org.apache.hadoop.yarn.util.ConverterUtils
 
-import org.apache.spark.{SecurityManager, SparkConf, SparkException}
-import org.apache.spark.deploy.SparkHadoopUtil
+import org.apache.spark.{SecurityManager, SparkConf}
 import org.apache.spark.deploy.yarn.config._
-import org.apache.spark.deploy.yarn.security.YARNHadoopDelegationTokenManager
-import org.apache.spark.internal.config._
 import org.apache.spark.launcher.YarnCommandBuilderUtils
 import org.apache.spark.util.Utils
 
@@ -193,8 +190,7 @@ object YarnSparkHadoopUtil {
       sparkConf: SparkConf,
       hadoopConf: Configuration): Set[FileSystem] = {
     val filesystemsToAccess = sparkConf.get(FILESYSTEMS_TO_ACCESS)
-      .map(new Path(_).getFileSystem(hadoopConf))
-      .toSet
+    val requestAllDelegationTokens = filesystemsToAccess.isEmpty
 
     val stagingFS = sparkConf.get(STAGING_DIR)
       .map(new Path(_).getFileSystem(hadoopConf))
@@ -203,8 +199,8 @@ object YarnSparkHadoopUtil {
     // Add the list of available namenodes for all namespaces in HDFS federation.
     // If ViewFS is enabled, this is skipped as ViewFS already handles delegation tokens for its
     // namespaces.
-    val hadoopFilesystems = if (stagingFS.getScheme == "viewfs") {
-      Set.empty
+    val hadoopFilesystems = if (!requestAllDelegationTokens || stagingFS.getScheme == "viewfs") {
+      filesystemsToAccess.map(new Path(_).getFileSystem(hadoopConf)).toSet
     } else {
       val nameservices = hadoopConf.getTrimmedStrings("dfs.nameservices")
       // Retrieving the filesystem for the nameservices where HA is not enabled
@@ -222,7 +218,7 @@ object YarnSparkHadoopUtil {
       (filesystemsWithoutHA ++ filesystemsWithHA).toSet
     }
 
-    filesystemsToAccess ++ hadoopFilesystems + stagingFS
+    hadoopFilesystems + stagingFS
   }
 }
```
