@@ -74,57 +74,56 @@ identifier
| Delegation Token | A token which can be passed to another process. |


- ### Authentication Tokens
+ ### Authentication Tokens

- Authentication Tokens are explicitly issued by services to allow the caller to
- interact with the service without having to re-request tickets from the TGT.
-
- When an Authentication Tokens expires, the caller must request a new one off the service.
- If the Kerberos ticket to interact with the service has expired, this may include
- re-requesting a ticket off the TGS, or even re-logging in to kerberos to obtain a new TGT.
-
- As such, they are almost equivalent to Kerberos Tickets -except that it is the
- distributed services themselves issuing the Authentication Token, not the TGS.
-
- ### Delegation Tokens
-
- A delegation token is requested by a client of a service; they can be passed to
- other processes.
-
- When the token expires, the original client must request a new delegation token
- and pass it on to the other process, again.
-
- What is more important is: *
-
- 1. delegation tokens can be renewed before they expire.
+ Authentication Tokens are explicitly issued by services to allow the caller to
+ interact with the service without having to re-request tickets from the TGS.
+
+ When an Authentication Token expires, the caller must request a new one from the service.
+ If the Kerberos ticket to interact with the service has expired, this may include
+ re-requesting a ticket from the TGS, or even logging in to Kerberos again to obtain a new TGT.
+
+ As such, they are almost equivalent to Kerberos Tickets, except that it is the
+ distributed services themselves issuing the Authentication Token, not the TGS.
+
+ ### Delegation Tokens
+
+ A delegation token is requested by a client of a service; it can be passed to
+ other processes.
+
+ When the token expires, the original client must request a new delegation token
+ and pass it on to the other process again.
+
+ What is more important is:
+
+ *delegation tokens can be renewed before they expire.*
+
+ This is a fundamental difference between Kerberos Tickets and Hadoop Delegation Tokens.
+
+ Holders of delegation tokens may renew them with a token-specific `TokenRenewer` service,
+ and so refresh them without needing the Kerberos credentials to log in to Kerberos.
+
+ More subtly:
+
+ 1. The tokens must be renewed before they expire: once expired, a token is worthless.
+ 1. Token renewers can be implemented as a Hadoop RPC service, or by other means, *including HTTP*.
+ 1. Token renewal may simply be the updating of an expiry time in the server, without pushing
+ out new tokens to the clients. This scales well when there are many processes across
+ the cluster associated with a single application.
+
+ For the HDFS Client protocol, the client protocol itself is the token renewer. A client may
+ talk to the NameNode using its current token, and request a new one, so refreshing it.
+
+ In contrast, the YARN timeline service is a pure REST API, which implements its token renewal over
+ HTTP/HTTPS. To refresh the token, the client must issue an HTTP request (a PUT operation, interestingly
+ enough), receiving a new token as a response.
+
+ Other delegation token renewal mechanisms alongside Hadoop RPC and HTTP could be implemented,
+ but that is a detail which client applications do not need to care about. All that matters is that
+ they have the code to refresh tokens, usually code which lives alongside the RPC/REST client,
+ *and keep renewing the tokens on a regular basis*. Generally this is done by starting
+ a thread in the background.
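
The background renewal thread described in this section might be sketched roughly as follows. This is a minimal illustration using only the JDK, not Hadoop's implementation: `RenewableToken`, `BackgroundTokenRenewer` and the "renew at half the lifetime" policy are hypothetical names and assumptions introduced here, and a real client would call whatever renewal operation its RPC or REST client offers.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for a renewable token; this is not a Hadoop class. */
interface RenewableToken {
    /** Asks the issuing service to renew the token; returns the new expiry time in ms. */
    long renew() throws Exception;
}

/** Sketch of a renewer that refreshes one token from a background daemon thread. */
final class BackgroundTokenRenewer implements AutoCloseable {
    private final RenewableToken token;
    private volatile long expiryMillis;
    private final ScheduledExecutorService executor =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "token-renewer");
            t.setDaemon(true); // do not keep the JVM alive just for renewals
            return t;
        });

    BackgroundTokenRenewer(RenewableToken token, long initialExpiryMillis) {
        this.token = token;
        this.expiryMillis = initialExpiryMillis;
    }

    /** Assumed policy: renew after half the token lifetime, well before expiry. */
    static long renewalIntervalMillis(long lifetimeMillis) {
        return Math.max(1L, lifetimeMillis / 2);
    }

    /** One renewal attempt; a failure keeps the previously known expiry time. */
    void renewOnce() {
        try {
            expiryMillis = token.renew();
        } catch (Exception e) {
            // A real client would log this and decide whether to retry or give up.
            System.err.println("Token renewal failed: " + e);
        }
    }

    /** Starts periodic renewal on the background thread. */
    void start(long lifetimeMillis) {
        long interval = renewalIntervalMillis(lifetimeMillis);
        executor.scheduleAtFixedRate(this::renewOnce, interval, interval, TimeUnit.MILLISECONDS);
    }

    long expiryMillis() {
        return expiryMillis;
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }
}
```

A caller would construct the renewer with the token it was given, call `start(lifetime)` once, and `close()` it when the application finishes. The exception is caught inside `renewOnce()` because a task that throws would otherwise stop a `scheduleAtFixedRate` schedule.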

-
- This is a fundamental difference between Kerberos Tickets and Hadoop Delegation Tokens.
-
- Holders of delegation tokens may renew them with a token-specific `TokenRenewer` service,
- so refresh them without needing the Kerberos credentials to log in to kerberos.
-
- More subtly
-
- 1. The tokens must be renewed before they expire: once expired, a token is worthless.
- 1. Token renewers can be implemented as a Hadoop RPC service, or by other means, *including HTTP*.
- 1. Token renewal may simply be the updating of an expiry time in the server, without pushing
- out new tokens to the clients. This scales well when there are many processes across
- the cluster associated with a single application..
-
- For the HDFS Client protocol, the client protocol itself is the token renewer. A client may
- talk to the Namenode using its current token, and request a new one, so refreshing it.
-
- In contrast, the YARN timeline service is a pure REST API, which implements its token renewal over
- HTTP/HTTPS. To refresh the token, the client must issue an HTTP request (a PUT operation, interestingly
- enough), receiving a new token as a response.
-
- Other delegation token renewal mechanisms alongside Hadoop RPC and HTTP could be implemented,
- that is a detail which client applications do not need to care about. All the matters is that
- they have the code to refresh tokens, usually code which lives alongside the RPC/REST client,
- *and keep renewing the tokens on a regularl basis*. Generally this is done by starting
- a thread in the background.
-

# Delegation Token revocation

@@ -145,41 +144,41 @@ can be granted to applications running in the cluster. This explicitly avoid the
offer that feature?).

## Example
-
- Imagine a user deploying a YARN application in a cluster, one which needs
- access to the user's data stored in HDFS. The user would be required to be authenticated with
- the KDC, and have been granted a *Ticket Granting Ticket*; the ticket needed to work with
- the TGS.
-
- The client-side launcher of the YARN application would be able to talk to HDFS and the YARN
- resource manager, because the user was logged in to Kerberos. This would be managed in the Hadoop
- RPC layer, requesting tickets to talk to the HDFS NameNode and YARN ResourceManager, if needed.
-
- To give the YARN application the same rights to HDFS, the client-side application must
- request a Delegation Token to talk to HDFS, a key which is then passed to the YARN application in
- the `ContainerLaunchContext` within the `ApplicationSubmissionContext` used to define the
- application to launch: its required container resources, artifacts to download, "localize",
- environment to set up and command to run.
-
- The YARN resource manager finds a location for the Application Master, and requests that
- hosts' Node Manager start the container/application.
-
- The Node Manager uses the "delegated HDFS token" to download the launch-time resources into
- a local directory space, then executes the application.
-
- *Somehow*, the HDFS token (and any other supplied tokens) are passed to the application that
- has been launched.
-
- The launched application master can use this token to interact with HDFS *as the original user*.
-
- The AM can also pass token(s) on to launched containers, so that they too have access to HDFS.
-
-
- The Hadoop Name Node does not need to care whether the caller is the user themselves, the Node Manager
- localizing the container, the launched application or any launched containers. All it does is verify
- that when a caller requests access to the HDFS filesystem metadata or the contents of a file,
- it must have a ticket/token which declares that they are the specific user, and that the token
- is currently considered valid (based on the expiry time and the clock value of the Name Node)
+
+ Imagine a user deploying a YARN application in a cluster, one which needs
+ access to the user's data stored in HDFS. The user would be required to be authenticated with
+ the KDC, and have been granted a *Ticket Granting Ticket*: the ticket needed to work with
+ the TGS.
+
+ The client-side launcher of the YARN application would be able to talk to HDFS and the YARN
+ resource manager, because the user was logged in to Kerberos. This would be managed in the Hadoop
+ RPC layer, requesting tickets to talk to the HDFS NameNode and YARN ResourceManager, if needed.
+
+ To give the YARN application the same rights to HDFS, the client-side application must
+ request a Delegation Token to talk to HDFS, a key which is then passed to the YARN application in
+ the `ContainerLaunchContext` within the `ApplicationSubmissionContext` used to define the
+ application to launch: its required container resources, artifacts to download ("localize"),
+ environment to set up and command to run.
+
+ The YARN resource manager finds a location for the Application Master, and requests that
+ host's Node Manager start the container/application.
+
+ The Node Manager uses the "delegated HDFS token" to download the launch-time resources into
+ a local directory space, then executes the application.
+
+ *Somehow*, the HDFS token (and any other supplied tokens) are passed to the application that
+ has been launched.
+
+ The launched application master can use this token to interact with HDFS *as the original user*.
+
+ The AM can also pass token(s) on to launched containers, so that they too have access to HDFS.
+
+
+ The Hadoop NameNode does not need to care whether the caller is the user themselves, the Node Manager
+ localizing the container, the launched application or any launched containers. All it does is verify
+ that when a caller requests access to the HDFS filesystem metadata or the contents of a file,
+ the caller has a ticket/token which declares that it is the specific user, and that the token
+ is currently considered valid (based on the expiry time and the clock value of the NameNode).
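
The NameNode-side check in the last paragraph can be pictured with a deliberately simplified sketch. `SketchToken` and `TokenCheckSketch` are hypothetical names invented for this illustration and are not Hadoop classes; the sketch covers only the two conditions named in the text (the token declares the user, and it has not expired by the server's own clock), and leaves out how a real service establishes that a token is genuine.

```java
import java.util.Objects;

/** Hypothetical, simplified token: an owner and an expiry time. Not a Hadoop class. */
final class SketchToken {
    final String owner;
    final long expiryMillis;

    SketchToken(String owner, long expiryMillis) {
        this.owner = Objects.requireNonNull(owner, "owner");
        this.expiryMillis = expiryMillis;
    }
}

/** Sketch of the server-side decision; the caller's identity plays no part in it. */
final class TokenCheckSketch {
    /**
     * Accepts a request if the token declares the given user and has not yet
     * expired according to the server's clock (serverNowMillis).
     */
    static boolean accepts(SketchToken token, String user, long serverNowMillis) {
        return token.owner.equals(user) && serverNowMillis < token.expiryMillis;
    }
}
```

The same check applies whether the token is presented by the user's own client, the Node Manager, the Application Master or a launched container, which is why the token can be handed from one to the next.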
@@ -330,4 +329,4 @@ work to Oozie to have a keytab and to pass it to Oozie.

## Weaknesses

- 1. Any compromised DN can create block tokens.
+ 1. Any compromised DN can create block tokens.