@@ -42,148 +42,10 @@ import org.apache.spark.util.Utils
* should access it from that. There are some cases where the SparkEnv hasn't been
* initialized yet and this class must be instantiated directly.
*
- * Spark currently supports authentication via a shared secret.
- * Authentication can be configured to be on via the 'spark.authenticate' configuration
- * parameter. This parameter controls whether the Spark communication protocols do
- * authentication using the shared secret. This authentication is a basic handshake to
- * make sure both sides have the same shared secret and are allowed to communicate.
- * If the shared secret is not identical they will not be allowed to communicate.
- *
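As a minimal sketch of the shared-secret setup described in the comment above (not part of this patch; the application name and secret value are placeholders), a non-YARN application might enable authentication like this:

```scala
import org.apache.spark.SparkConf

// Minimal sketch: enable shared-secret authentication for a non-YARN deployment.
// On YARN the secret is generated automatically, so only spark.authenticate is needed.
val conf = new SparkConf()
  .setAppName("secured-app") // placeholder application name
  .set("spark.authenticate", "true")
  // Placeholder value; all nodes and the application must share the same secret.
  .set("spark.authenticate.secret", "change-me")
```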
- * The Spark UI can also be secured by using javax servlet filters. A user may want to
- * secure the UI if it has data that other users should not be allowed to see. The javax
- * servlet filter specified by the user can authenticate the user and then once the user
- * is logged in, Spark can compare that user versus the view acls to make sure they are
- * authorized to view the UI. The configs 'spark.acls.enable', 'spark.ui.view.acls' and
- * 'spark.ui.view.acls.groups' control the behavior of the acls. Note that the person who
- * started the application always has view access to the UI.
- *
- * Spark has a set of individual and group modify acls (`spark.modify.acls`) and
- * (`spark.modify.acls.groups`) that control which users and groups have permission to
- * modify a single application. This would include things like killing the application.
- * By default the person who started the application has modify access. For modify access
- * through the UI, you must have a filter that does authentication in place for the modify
- * acls to work properly.
- *
- * Spark also has a set of individual and group admin acls (`spark.admin.acls`) and
- * (`spark.admin.acls.groups`) which is a set of users/administrators and admin groups
- * who always have permission to view or modify the Spark application.
- *
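A minimal sketch of the view, modify and admin ACL settings described above (the user and group names are placeholders, not taken from this patch):

```scala
import org.apache.spark.SparkConf

// Minimal sketch of the ACL-related settings; all names are placeholders.
val conf = new SparkConf()
  .set("spark.acls.enable", "true")
  .set("spark.ui.view.acls", "alice,bob")       // users who may view the UI
  .set("spark.ui.view.acls.groups", "analysts") // groups who may view the UI
  .set("spark.modify.acls", "alice")            // users who may e.g. kill the application
  .set("spark.modify.acls.groups", "operators")
  .set("spark.admin.acls", "admin")             // always have view and modify access
  .set("spark.admin.acls.groups", "admins")
```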
- * Starting from version 1.3, Spark has partial support for encrypted connections with SSL.
- *
- * At this point Spark has multiple communication protocols that need to be secured and
- * different underlying mechanisms are used depending on the protocol:
- *
- * - HTTP for broadcast and file server (via HttpServer) -> Spark currently uses Jetty
- * for the HttpServer. Jetty supports multiple authentication mechanisms -
- * Basic, Digest, Form, Spnego, etc. It also supports multiple different login
- * services - Hash, JAAS, Spnego, JDBC, etc. Spark currently uses the HashLoginService
- * to authenticate using DIGEST-MD5 via a single user and the shared secret.
- * Since we are using DIGEST-MD5, the shared secret is not passed on the wire
- * in plaintext.
- *
- * We currently support SSL (https) for this communication protocol (see the details
- * below).
- *
- * The Spark HttpServer installs the HashLoginService and configures it to DIGEST-MD5.
- * Any clients must specify the user and password. There is a default
- * Authenticator installed in the SecurityManager that controls how it does the
- * authentication and in this case gets the user name and password from the request.
- *
- * - BlockTransferService -> The Spark BlockTransferService uses Java NIO to asynchronously
- * exchange messages. For this we use the Java SASL
- * (Simple Authentication and Security Layer) API and again use DIGEST-MD5
- * as the authentication mechanism. This means the shared secret is not passed
- * over the wire in plaintext.
- * Note that SASL is pluggable as to what mechanism it uses. We currently use
- * DIGEST-MD5 but this could be changed to use Kerberos or other in the future.
- * Spark currently supports "auth" for the quality of protection, which means
- * the connection does not support integrity or privacy protection (encryption)
- * after authentication. SASL also supports "auth-int" and "auth-conf" which
- * Spark could support in the future to allow the user to specify the quality
- * of protection they want. If we support those, the messages will also have to
- * be wrapped and unwrapped via the SaslServer/SaslClient.wrap/unwrap APIs.
- *
- * Since the NioBlockTransferService does asynchronous message passing, the SASL
- * authentication is a bit more complex. A ConnectionManager can be both a client
- * and a server, so for a particular connection it has to determine what to do.
- * A ConnectionId was added to be able to track connections and is used to
- * match up incoming messages with connections waiting for authentication.
- * The ConnectionManager tracks all the sendingConnections using the ConnectionId,
- * waits for the response from the server, and does the handshake before sending
- * the real message.
- *
- * The NettyBlockTransferService ensures that SASL authentication is performed
- * synchronously prior to any other communication on a connection. This is done in
- * SaslClientBootstrap on the client side and SaslRpcHandler on the server side.
- *
- * - HTTP for the Spark UI -> the UI was changed to use servlets so that javax servlet filters
- * can be used. YARN requires a specific AmIpFilter be installed for security to work
- * properly. For non-YARN deployments, users can write a filter to go through their
- * organization's normal login service. If an authentication filter is in place then the
- * SparkUI can be configured to check the logged-in user against the list of users who
- * have view acls to see if that user is authorized.
- * The filters can also be used for many different purposes. For instance filters
- * could be used for logging, encryption, or compression.
- *
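The UI filter mechanism mentioned above is driven by ordinary configuration. A minimal sketch, assuming a hypothetical filter class `com.example.auth.MyAuthFilter` that implements `javax.servlet.Filter` and is on the driver classpath:

```scala
import org.apache.spark.SparkConf

// Minimal sketch: install an authentication filter on the Spark UI.
// com.example.auth.MyAuthFilter is a hypothetical javax.servlet.Filter implementation.
val conf = new SparkConf()
  .set("spark.ui.filters", "com.example.auth.MyAuthFilter")
  // With an authentication filter in place, view ACLs can be enforced against the
  // logged-in user.
  .set("spark.acls.enable", "true")
  .set("spark.ui.view.acls", "alice")
```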
- * The exact mechanisms used to generate/distribute the shared secret are deployment-specific.
- *
- * For YARN deployments, the secret is automatically generated. The secret is placed in the Hadoop
- * UGI which gets passed around via the Hadoop RPC mechanism. Hadoop RPC can be configured to
- * support different levels of protection. See the Hadoop documentation for more details. Each
- * Spark application on YARN gets a different shared secret.
- *
- * On YARN, the Spark UI gets configured to use the Hadoop YARN AmIpFilter which requires the user
- * to go through the ResourceManager Proxy. That proxy is there to reduce the possibility of web
- * based attacks through YARN. Hadoop can be configured to use filters to do authentication. That
- * authentication then happens via the ResourceManager Proxy and Spark will use that to do
- * authorization against the view acls.
- *
- * For other Spark deployments, the shared secret must be specified via the
- * spark.authenticate.secret config.
- * All the nodes (Master and Workers) and the applications need to have the same shared secret.
- * This again is not ideal as one user could potentially affect another user's application.
- * This should be enhanced in the future to provide better protection.
- * If the UI needs to be secure, the user needs to install a javax servlet filter to do the
- * authentication. Spark will then use that user to compare against the view acls to do
- * authorization. If no filter is in place the user is generally null and no authorization
- * can take place.
- *
- * When authentication is being used, encryption can also be enabled by setting the option
- * spark.authenticate.enableSaslEncryption to true. This is only supported by communication
- * channels that use the network-common library, and can be used as an alternative to SSL in those
- * cases.
- *
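A minimal sketch of the SASL-encryption option mentioned above; it only takes effect when authentication itself is enabled:

```scala
import org.apache.spark.SparkConf

// Minimal sketch: SASL-based encryption on channels backed by the network-common
// library. Authentication must be on for this setting to matter.
val conf = new SparkConf()
  .set("spark.authenticate", "true")
  .set("spark.authenticate.enableSaslEncryption", "true")
```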
- * SSL can be used for encryption for certain communication channels. The user can configure the
- * default SSL settings which will be used for all the supported communication protocols unless
- * they are overwritten by protocol-specific settings. This way the user can easily provide the
- * common settings for all the protocols without disabling the ability to configure each one
- * individually.
- *
- * All the SSL settings like `spark.ssl.xxx` where `xxx` is a particular configuration property,
- * denote the global configuration for all the supported protocols. In order to override the global
- * configuration for the particular protocol, the properties must be overwritten in the
- * protocol-specific namespace. Use `spark.ssl.yyy.xxx` settings to overwrite the global
- * configuration for the particular protocol denoted by `yyy`. Currently `yyy` can only be `fs` for
- * broadcast and file server.
- *
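A minimal sketch of the global-versus-protocol-specific SSL layout described above; all paths and passwords are placeholders:

```scala
import org.apache.spark.SparkConf

// Minimal sketch: global SSL defaults plus one override for the broadcast/file-server
// protocol ("fs"). Paths and passwords are placeholders.
val conf = new SparkConf()
  .set("spark.ssl.enabled", "true")
  .set("spark.ssl.keyStore", "/path/to/keystore.jks")
  .set("spark.ssl.keyStorePassword", "change-me")
  .set("spark.ssl.trustStore", "/path/to/truststore.jks")
  .set("spark.ssl.trustStorePassword", "change-me")
  // Override a single property for the fs protocol only; everything else falls back
  // to the global spark.ssl.* values.
  .set("spark.ssl.fs.keyStore", "/path/to/fs-keystore.jks")
```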
- * Refer to [[org.apache.spark.SSLOptions]] documentation for the list of
- * options that can be specified.
- *
- * SecurityManager initializes SSLOptions objects for different protocols separately. The
- * SSLOptions object parses Spark configuration at a given namespace and builds the common
- * representation of SSL settings. SSLOptions is then used to provide protocol-specific
- * SSLContextFactory for Jetty.
- *
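The SSLOptions class itself is internal and not shown in this diff, but the namespace-with-global-fallback lookup it describes can be pictured roughly as follows (a sketch under assumed semantics, not the actual implementation):

```scala
import org.apache.spark.SparkConf

// Rough sketch of namespace resolution only: prefer spark.ssl.<protocol>.<key> and
// fall back to the global spark.ssl.<key>. This is not the real SSLOptions code.
def sslSetting(conf: SparkConf, protocol: String, key: String): Option[String] =
  conf.getOption(s"spark.ssl.$protocol.$key")
    .orElse(conf.getOption(s"spark.ssl.$key"))

val conf = new SparkConf()
  .set("spark.ssl.keyStore", "/path/to/global.jks")
  .set("spark.ssl.fs.keyStore", "/path/to/fs.jks")

sslSetting(conf, "fs", "keyStore")   // Some("/path/to/fs.jks"), protocol-specific wins
sslSetting(conf, "fs", "trustStore") // None, since neither level sets it
```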
- * SSL must be configured on each node and configured for each component involved in
- * communication using the particular protocol. In YARN clusters, the key-store can be prepared on
- * the client side then distributed and used by the executors as part of the application
- * (YARN allows the user to deploy files before the application is started).
- * In standalone deployment, the user needs to provide key-stores and configuration
- * options for master and workers. In this mode, the user may allow the executors to use the SSL
- * settings inherited from the worker which spawned that executor. It can be accomplished by
- * setting `spark.ssl.useNodeLocalConf` to `true`.
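A minimal sketch of the standalone-mode option mentioned above, letting executors inherit the SSL settings of the worker that spawned them:

```scala
import org.apache.spark.SparkConf

// Minimal sketch for standalone mode: executors reuse the SSL configuration of the
// worker that launched them instead of carrying their own key-store settings.
val conf = new SparkConf()
  .set("spark.ssl.enabled", "true")
  .set("spark.ssl.useNodeLocalConf", "true")
```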
+ * This class implements all of the configuration related to security features described
+ * in the "Security" document. Please refer to that document for specific features implemented
+ * here.
*/
-
private[spark] class SecurityManager(
    sparkConf: SparkConf,
    val ioEncryptionKey: Option[Array[Byte]] = None)