Commit fd66f1d

Configure namenode HA (#33)
* Add namenode HA design doc
* Switch images to png
* Refer to png images
* Add journalnode
* Hack HA
* Both NNs start up
* Use FQDNs for services
* Clean up
* Let datanode know HA
* Enable automatic fail-over
* Avoid interactive format
* Avoid multi homed config
* Add FIXME
* Add http port
* Simplify start script writing
* Use PV
* Increase version #, make HA the only option
* Fix bugs
* Address FIXMEs
* Support journal node quorum size
* Updated READMEs
* Reintroduce hdfs-simple-namenode-k8s as non-HA, non-Kerberos setup
* Fix a typo in README
* Add a TODO for kerberizing journal nodes
* Turn off datanode hostname check
* Support non-root uid, gid
* Make journalnode volume size optional
* Add a client pod chart
* Specify disruption budget
* Use tolerate-unready-endpoints for stable DNS entries
* Address review comments
* Use a config map to support different namenode start scripts
1 parent fa377c6 commit fd66f1d

19 files changed: +751 -63 lines changed

charts/README.md

Lines changed: 10 additions & 6 deletions
@@ -4,13 +4,17 @@ Requires Kubernetes 1.6 as the `namenode` and `datanodes` are using `ClusterFirs

 ### Usage

-Helm charts for launching HDFS in a K8s cluster. They should be launched in
-the following order.
+Helm charts for launching HDFS daemons in a K8s cluster.
+The daemons should be launched in the following order.

-1. `hdfs-namenode-k8s`: Launches the hdfs namenode. See
-`hdfs-namenode-k8s/README.md` for how to launch.
-2. `hdfs-datanode-k8s`: Launches the hdfs datanode daemons. See
-`hdfs-datanode-k8s/README.md` for how to launch.
+1. hdfs namenode daemons. For the High Availability (HA)
+setup, follow instructions in `hdfs-namenode-k8s/README.md`. Or if you do
+not want the HA setup, follow `hdfs-simple-namenode-k8s/README.md` instead.
+2. hdfs datanode daemons. See `hdfs-datanode-k8s/README.md`
+for how to launch.

 Kerberos is supported. See the `kerberosEnabled` option in the namenode and
 datanode charts.
+
+There is also an HDFS client chart `hdfs-client` that can be convenient for
+testing.

charts/hdfs-client/Chart.yaml

Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+name: hdfs-client-k8s
+version: 0.2
+description: Hadoop Distributed File System (HDFS) hosted by Kubernetes.
Lines changed: 90 additions & 0 deletions
@@ -0,0 +1,90 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+
+apiVersion: extensions/v1beta1
+kind: Deployment
+metadata:
+  name: hdfs-client
+  labels:
+    app: hdfs-client
+spec:
+  replicas: 1
+  selector:
+    matchLabels:
+      app: hdfs-client
+  template:
+    metadata:
+      labels:
+        app: hdfs-client
+    spec:
+      containers:
+        - name: hdfs-client
+          image: uhopper/hadoop:2.7.2
+          env:
+            # The following env vars are listed in low-to-high precedence order,
+            # i.e. whichever comes last overrides an earlier value of the same variable.
+            {{- if .Values.kerberosEnabled }}
+            - name: CORE_CONF_hadoop_security_authentication
+              value: kerberos
+            - name: CORE_CONF_hadoop_security_authorization
+              value: "true"
+            - name: CORE_CONF_hadoop_rpc_protection
+              value: privacy
+            {{- end }}
+            {{- range $key, $value := .Values.customHadoopConfig }}
+            - name: {{ $key | quote }}
+              value: {{ $value | quote }}
+            {{- end }}
+            {{- if .Values.namenodeHAEnabled }}
+            - name: CORE_CONF_fs_defaultFS
+              value: hdfs://hdfs-k8s
+            - name: HDFS_CONF_dfs_nameservices
+              value: hdfs-k8s
+            - name: HDFS_CONF_dfs_ha_namenodes_hdfs___k8s
+              value: nn0,nn1
+            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn0
+              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
+            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn1
+              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:8020
+            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn0
+              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:50070
+            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn1
+              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:50070
+            - name: HDFS_CONF_dfs_client_failover_proxy_provider_hdfs___k8s
+              value: org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
+            {{- else }}
+            - name: CORE_CONF_fs_defaultFS
+              value: hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
+            {{- end }}
+            - name: MULTIHOMED_NETWORK
+              value: "0"
+          command: ['/bin/sh', '-c']
+          args:
+            - /entrypoint.sh /usr/bin/tail -f /var/log/dmesg
+          volumeMounts:
+            {{- if .Values.kerberosEnabled }}
+            - name: kerberos-config
+              mountPath: /etc/krb5.conf
+              subPath: {{ .Values.kerberosConfigFileName }}
+              readOnly: true
+            {{- end }}
+      restartPolicy: Always
+      volumes:
+        {{- if .Values.kerberosEnabled }}
+        - name: kerberos-config
+          configMap:
+            name: {{ .Values.kerberosConfigMap }}
+        {{- end }}
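
The `CORE_CONF_*` and `HDFS_CONF_*` variable names in this template depend on the naming convention described on the `uhopper/hadoop` Docker Hub page: the prefix selects the target file (`core-site.xml` or `hdfs-site.xml`), and in the rest of the name a triple underscore stands for a dash, a double underscore for an underscore, and a single underscore for a dot. The following is a minimal sketch of that mapping, assuming the convention as documented there. The function name `env_to_hadoop_key` is hypothetical, and this is not the image's actual entrypoint code.

```shell
#!/bin/sh
# Hypothetical helper: map a uhopper-style env var name to a Hadoop config key.
env_to_hadoop_key() {
  # Strip the file-selecting prefix (CORE_CONF_ or HDFS_CONF_).
  name=${1#CORE_CONF_}
  name=${name#HDFS_CONF_}
  # "___" becomes "-", "__" becomes "_" (via a placeholder), "_" becomes ".".
  printf '%s\n' "$name" |
    sed -e 's/___/-/g' -e 's/__/@/g' -e 's/_/./g' -e 's/@/_/g'
}

env_to_hadoop_key HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn0
# prints: dfs.namenode.rpc-address.hdfs-k8s.nn0
env_to_hadoop_key CORE_CONF_fs_defaultFS
# prints: fs.defaultFS
```

This is why the nameservice `hdfs-k8s` appears as `hdfs___k8s` inside the variable names while the values keep the plain dash.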

charts/hdfs-client/values.yaml

Lines changed: 44 additions & 0 deletions
@@ -0,0 +1,44 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+
+# Default values for template variables.
+
+# Custom hadoop config keys passed through env variables to hadoop uhopper images.
+# See https://hub.docker.com/r/uhopper/hadoop/ to get more details
+# Please note that these are not hadoop env variables, but docker env variables that
+# will be transformed into hadoop config keys
+# HDFS_CONF_dfs_datanode_data_dir and CORE_CONF_fs_defaultFS need special handling and
+# they're already set by the chart so any value coming from below config will be ignored
+customHadoopConfig: {}
+# Set variables through a hash where env variable is the key, e.g.
+# HDFS_CONF_dfs_datanode_use_datanode_hostname: "false"
+
+# Whether or not Kerberos support is enabled.
+kerberosEnabled: false
+
+# Required to be non-empty if Kerberos is enabled. Specify your Kerberos realm name.
+# This should match the realm name in your Kerberos config file.
+kerberosRealm: ""
+
+# Effective only if Kerberos is enabled. Name of the k8s config map containing
+# the kerberos config file.
+kerberosConfigMap: kerberos-config
+
+# Effective only if Kerberos is enabled. Name of the kerberos config file inside
+# the config map.
+kerberosConfigFileName: krb5.conf
+
+# Whether or not to expect namenodes in the HA setup.
+namenodeHAEnabled: true
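
As an illustration of how these defaults might be overridden, the sketch below writes a small override file that sets one `customHadoopConfig` entry (the example key already named in the comments) and turns off the HA expectation for use with the non-HA namenode chart. The file name `client-values.yaml` and the release name in the trailing comment are assumptions, not part of this commit.

```shell
#!/bin/sh
# Write a hypothetical override file for the client chart.
cat > client-values.yaml <<'EOF'
customHadoopConfig:
  HDFS_CONF_dfs_datanode_use_datanode_hostname: "false"
namenodeHAEnabled: false
EOF

# The file could then be passed to helm in the same style the READMEs use
# (release name is an assumption):
#   helm install -n my-hdfs-client -f client-values.yaml hdfs-client
```

Keeping overrides in a file rather than a long `--set` list makes the quoted string value `"false"` explicit, which matters because the template passes every custom value through `quote`.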

charts/hdfs-datanode-k8s/Chart.yaml

Lines changed: 1 addition & 1 deletion
@@ -13,5 +13,5 @@
 # See the License for the specific language governing permissions and
 # limitations under the License.
 name: hdfs-datanode-k8s
-version: 0.1
+version: 0.2
 description: Hadoop Distributed File System (HDFS) hosted by Kubernetes.

charts/hdfs-datanode-k8s/README.md

Lines changed: 24 additions & 21 deletions
@@ -5,36 +5,39 @@ HDFS `datanodes` running inside a kubernetes cluster. See the other chart for

 1. In some setups, the master node may launch a datanode. To prevent this,
 label the master node with `hdfs-datanode-exclude`.
-```
-$ kubectl label node YOUR-MASTER-NAME hdfs-datanode-exclude=yes
-```
+```
+$ kubectl label node YOUR-MASTER-NAME hdfs-datanode-exclude=yes
+```

 2. (Skip this if you do not plan to enable Kerberos)
 Complete the Kerberos setup described in the namenode
 [README.md](../hdfs-namenode-k8s/README.md), if you have not done that
 already.

 3. Launch this helm chart, `hdfs-datanode-k8s`.
-
-```
-$ helm install -n my-hdfs-datanode hdfs-datanode-k8s
-```
-
-If enabling Kerberos, specify necessary options. For instance,
-
-```
-$ helm install -n my-hdfs-datanode \
---set kerberosEnabled=true,kerberosRealm=MYCOMPANY.COM hdfs-datanode-k8s
-```
-The two variables above are required. For other variables, see values.yaml.
+```
+$ helm install -n my-hdfs-datanode hdfs-datanode-k8s
+```
+If enabling Kerberos, specify necessary options. For instance,
+```
+$ helm install -n my-hdfs-datanode \
+--set kerberosEnabled=true,kerberosRealm=MYCOMPANY.COM hdfs-datanode-k8s
+```
+The two variables above are required. For other variables, see values.yaml.
+If you have launched the non-HA namenode using
+the `hdfs-simple-namenode-k8s` chart, set the `namenodeHAEnabled` option to
+false.
+```
+$ helm install -n my-hdfs-datanode \
+--set namenodeHAEnabled=false hdfs-datanode-k8s
+```

 4. Confirm the daemons are launched.
-
-```
-$ kubectl get pods | grep hdfs-datanode-
-hdfs-datanode-ajdcz 1/1 Running 0 7m
-hdfs-datanode-f1w24 1/1 Running 0 7m
-```
+```
+$ kubectl get pods | grep hdfs-datanode-
+hdfs-datanode-ajdcz 1/1 Running 0 7m
+hdfs-datanode-f1w24 1/1 Running 0 7m
+```

 `Datanode` daemons run on every cluster node. They also mount k8s `hostPath`
 local disk volumes. You may want to restrict access of `hostPath`

charts/hdfs-datanode-k8s/templates/datanode-daemonset.yaml

Lines changed: 19 additions & 0 deletions
@@ -72,8 +72,27 @@ spec:
             - name: {{ $key | quote }}
               value: {{ $value | quote }}
             {{- end }}
+            {{- if .Values.namenodeHAEnabled }}
+            - name: CORE_CONF_fs_defaultFS
+              value: hdfs://hdfs-k8s
+            - name: HDFS_CONF_dfs_nameservices
+              value: hdfs-k8s
+            - name: HDFS_CONF_dfs_ha_namenodes_hdfs___k8s
+              value: nn0,nn1
+            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn0
+              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
+            - name: HDFS_CONF_dfs_namenode_rpc___address_hdfs___k8s_nn1
+              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:8020
+            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn0
+              value: hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:50070
+            - name: HDFS_CONF_dfs_namenode_http___address_hdfs___k8s_nn1
+              value: hdfs-namenode-1.hdfs-namenode.default.svc.cluster.local:50070
+            {{- else }}
             - name: CORE_CONF_fs_defaultFS
               value: hdfs://hdfs-namenode-0.hdfs-namenode.default.svc.cluster.local:8020
+            {{- end }}
+            - name: MULTIHOMED_NETWORK
+              value: "0"
             # The below uses two loops to make sure the last item does not have comma. It uses index 0
             # for the last item since that is the only special index that helm template gives us.
             - name: HDFS_CONF_dfs_datanode_data_dir
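
The namenode addresses in this hunk have the shape `<pod>.<service>.<namespace>.svc.<cluster domain>`, with the namespace `default` and the domain `cluster.local` hard-coded. As a rough sketch of how the four address values are composed, and of what would have to change in another namespace, the hypothetical helper `namenode_addr` below rebuilds them. The helper and the `hadoop` namespace in the last call are illustrative assumptions and do not exist in the charts.

```shell
#!/bin/sh
# Hypothetical helper: build the stable DNS address of namenode pod N.
# Arguments: pod index, port, namespace (falls back to "default").
namenode_addr() {
  index=$1
  port=$2
  namespace=${3:-default}
  printf 'hdfs-namenode-%s.hdfs-namenode.%s.svc.cluster.local:%s\n' \
    "$index" "$namespace" "$port"
}

namenode_addr 0 8020          # RPC address used for nn0
namenode_addr 1 50070         # HTTP address used for nn1
namenode_addr 0 8020 hadoop   # same pod in a hypothetical "hadoop" namespace
```

Because the templates embed `default` literally, installing the charts into any other namespace would leave datanodes and clients pointing at names that do not resolve unless these values are changed as well.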

charts/hdfs-datanode-k8s/values.yaml

Lines changed: 3 additions & 0 deletions
@@ -59,3 +59,6 @@ kerberosKeytabsSecret: hdfs-kerberos-keytabs
 # the jsvc utility. See the reference doc at
 # https://hadoop.apache.org/docs/r2.7.2/hadoop-project-dist/hadoop-common/SecureMode.html#Secure_DataNode
 jsvcEnabled: true
+
+# Whether or not to expect namenodes in the HA setup.
+namenodeHAEnabled: true
Lines changed: 17 additions & 0 deletions
@@ -0,0 +1,17 @@
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements. See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License. You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+name: hdfs-journalnode-k8s
+version: 0.2
+description: Hadoop Distributed File System (HDFS) hosted by Kubernetes.
