@@ -33,3 +33,313 @@ Please open Git Issues if you would like to see updates/other plugin integration
3333 - Apache Ranger: https://ranger.apache.org/
3434 - Apache Ranger + Amazon EMR Blog: https://aws.amazon.com/blogs/big-data/implementing-authorization-and-auditing-using-apache-ranger-on-amazon-emr/
3535 - Apache Ranger Presto Plugin: https://cwiki.apache.org/confluence/display/RANGER/Presto+Plugin
36+
37+ ---
38+
39+ # Sub Project RANGER-EMR-CLI-INSTALLER: A CLI Tool for Installing Ranger and Integrating It with an AWS EMR Cluster and AD/LDAP
40+
41+ This is a command-line tool that installs Ranger and integrates it with an AWS EMR cluster, using a Windows AD or Open LDAP server as the authentication channel. A closely related project, **[ranger-emr-cfn-installer](https://github.com/bluishglc/ranger-emr-cfn-installer)**, does the same job via AWS CloudFormation. The two projects are very similar but work independently; pick whichever suits you.
42+
43+ ## 1. Ranger Introduction
44+
45+ Let’s check out Ranger's architecture:
46+
47+ ![ranger-architecture](https://user-images.githubusercontent.com/5539582/99872048-f0c24480-2c19-11eb-8c0f-43df2552837c.png)
48+
49+ Ranger has 5 parts:
50+
51+ 1. Ranger Admin Service
52+ 2. Ranger UserSync Service
53+ 3. A Backend RDB for Storing User Authorization Policies
54+ 4. A Solr Server for Storing Audit Logs
55+ 5. A Series of Plugins for Big Data Components/Services
56+
57+ Besides the above, Ranger integrates with 2 external dependencies:
58+
59+ 6. A Windows AD or Open LDAP Server as the Authentication Channel
60+ 7. A Hadoop (AWS EMR) Cluster to Be Managed by Ranger
61+
62+ So, a full Ranger installation covers the following jobs:
63+
64+ 1. Install JDK (Required by Ranger Admin and Solr)
65+ 2. Install MySQL (As Ranger Backend RDB)
66+ 3. Install Solr (As Ranger Audit Store)
67+ 4. Install Ranger Admin (and Integrate with AD/LDAP Server)
68+ 5. Install Ranger UserSync (and Integrate with AD/LDAP Server)
69+ 6. Install Ranger Plugins (e.g. HDFS, Hive, HBase and so on)
70+
71+ ## 2. Prerequisites
72+
73+ Before installing, make sure the following items are ready or done:
74+
75+ 1. Make sure the EMR cluster is in the waiting state, with no jobs running
76+ 2. Upload your private SSH key (the pem file) to the Ranger server, for example `/home/ec2-user/key.pem`
77+ 3. It's recommended to explore users and groups on Windows AD or Open LDAP with a GUI tool, for example LDAP Admin, to determine the AD/LDAP related parameters
78+ 4. Check network connectivity among the Ranger server, the Windows AD or Open LDAP server, and the EMR nodes
79+
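The key and connectivity items above can be spot-checked from the Ranger server before running the installer. The sketch below is an illustration, not part of this tool: `key_is_private` and `port_open` are hypothetical helper names, the `nc` utility is assumed to be installed, and ports 22 (SSH) and 389 (LDAP) are the standard defaults, which may differ in your environment.

```bash
#!/bin/sh
# Pre-flight helpers (illustrative only; not part of ranger-emr-cli-installer).

# Succeeds if the key file exists and grants no permissions to group/others.
# ssh refuses private keys that other users can read.
key_is_private() {
    [ -f "$1" ] || return 1
    # Characters 5-10 of the `ls -l` mode string are the group and other bits.
    perms=$(ls -ld "$1" | cut -c5-10)
    [ "$perms" = "------" ]
}

# Succeeds if a TCP connection to host:port opens within 3 seconds.
# Assumes an `nc` that supports -z (connect only) and -w (timeout).
port_open() {
    nc -z -w 3 "$1" "$2" >/dev/null 2>&1
}

# Example usage with the sample values from this README:
#   key_is_private /home/ec2-user/key.pem || chmod 400 /home/ec2-user/key.pem
#   port_open 10.0.0.194 389 || echo "LDAP port on the AD server is unreachable"
#   port_open 10.0.0.177 22  || echo "SSH port on the EMR master is unreachable"
```

Once the tool is downloaded, its own `test-emr-ssh-connectivity` and `test-ldap-connectivity` actions cover similar ground.
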
80+ ## 3. Download
81+
82+ 1. First of all, set up a clean Linux server, log in, and switch to the `root` user.
83+
84+ 2. Install git and check out this project.
85+
86+ ``` bash
87+ yum -y install git
88+ git clone https://github.com/bluishglc/ranger-emr-cli-installer.git /home/ec2-user/ranger-emr-cli-installer
89+ ```
90+
91+ ## 4. Usage
92+
93+ After downloading, print the usage to check that the CLI tool is ready to use:
94+
95+ ``` bash
96+ sh /home/ec2-user/ranger-emr-cli-installer/bin/setup.sh help
97+ ```
98+ If all goes well, the console prints all actions and options supported by this CLI tool:
99+
100+ ```
101+ ============================= RANGER-EMR-CLI-INSTALLER USAGE =============================
102+
103+ SYNOPSIS
104+
105+ sudo sh ranger-emr-cli-installer/bin/setup.sh [ACTION] [--OPTION1 VALUE1] [--OPTION2 VALUE2]...
106+
107+ ACTIONS:
108+
109+ install Install all components
110+ install-ranger Install ranger only
111+ install-ranger-plugins Install ranger plugin only
112+ test-emr-ssh-connectivity Test EMR ssh connectivity
113+ test-emr-namenode-connectivity Test EMR namenode connectivity
114+ test-ldap-connectivity Test LDAP connectivity
115+ install-mysql Install MySQL
116+ test-mysql-connectivity Test MySQL connectivity
117+ install-mysql-jdbc-driver Install MySQL JDBC driver
118+ install-jdk Install JDK8
119+ download-ranger Download ranger
120+ install-solr Install solr
121+ test-solr-connectivity Test solr connectivity
122+ init-solr-as-ranger-audit-store Test solr connectivity
123+ init-ranger-admin-db Init ranger admin db
124+ install-ranger-admin Install ranger admin
125+ install-ranger-usersync Install ranger usersync
126+ help Print help
127+
128+ OPTIONS:
129+
130+ --auth-type [ad|ldap] Authentication type, optional value: ad or ldap
131+ --ad-domain Specify the domain name of windows ad server
132+ --ad-url Specify the ldap url of windows ad server, i.e. ldap://10.0.0.1
133+ --ad-base-dn Specify the base dn of windows ad server
134+ --ad-bind-dn Specify the bind dn of windows ad server
135+ --ad-bind-password Specify the bind password of windows ad server
136+ --ad-user-object-class Specify the user object class of windows ad server
137+ --ldap-url Specify the ldap url of Open LDAP, i.e. ldap://10.0.0.1
138+ --ldap-user-dn-pattern Specify the user dn pattern of Open LDAP
139+ --ldap-group-search-filter Specify the group search filter of Open LDAP
140+ --ldap-base-dn Specify the base dn of Open LDAP
141+ --ldap-bind-dn Specify the bind dn of Open LDAP
142+ --ldap-bind-password Specify the bind password of Open LDAP
143+ --ldap-user-object-class Specify the user object class of Open LDAP
144+ --java-home Specify the JAVA_HOME path, default value is /usr/lib/jvm/java
145+ --skip-install-mysql [true|false] Specify If skip mysql installing or not, default value is 'false'
146+ --mysql-host Specify the mysql server hostname or IP, default value is current host IP
147+ --mysql-root-password Specify the root password of mysql
148+ --mysql-ranger-db-user-password Specify the ranger db user password of mysql
149+ --solr-host Specify the solr server hostname or IP, default value is current host IP
150+ --skip-install-solr [true|false] Specify If skip solr installing or not, default value is 'false'
151+ --ranger-host Specify the ranger server hostname or IP, default value is current host IP
152+ --ranger-version [2.1.0] Specify the ranger version, now only Ranger 2.1.0 is supported
153+ --ranger-repo-url Specify the ranger repository url
154+ --ranger-plugins [hdfs|hive|hbase] Specify what plugins will be installed(accept multiple comma-separated values), now support hdfs, hive and hbase
155+ --emr-master-nodes Specify master nodes list of EMR cluster(accept multiple comma-separated values), i.e. 10.0.0.1,10.0.0.2,10.0.0.3
156+ --emr-core-nodes Specify core nodes list of EMR cluster(accept multiple comma-separated values), i.e. 10.0.0.4,10.0.0.5,10.0.0.6
157+ --emr-ssh-key Specify the path of ssh key to connect EMR nodes
158+ --restart-interval Specify the restart interval
159+
160+ ```
161+
162+ This means the tool is ready to use.
163+
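Several options, such as `--ranger-plugins`, accept comma-separated values. As a minimal sketch of checking such a value before passing it to the tool, the hypothetical `validate_plugins` helper below accepts only the three plugin names listed in the help text. It is an illustration and is not how `setup.sh` itself validates input.

```bash
#!/bin/sh
# Illustrative input check; not part of ranger-emr-cli-installer.
SUPPORTED_PLUGINS="hdfs hive hbase"

# Succeeds only if every comma-separated item is a supported plugin name.
validate_plugins() {
    list=$1
    [ -n "$list" ] || return 1
    old_ifs=$IFS
    IFS=,
    set -f          # disable globbing while the list is split
    set -- $list    # split on commas into positional parameters
    set +f
    IFS=$old_ifs
    for plugin in "$@"; do
        case $plugin in
            ''|*' '*) echo "invalid plugin name: '$plugin'" >&2; return 1 ;;
        esac
        case " $SUPPORTED_PLUGINS " in
            *" $plugin "*) ;;
            *) echo "unsupported plugin: $plugin" >&2; return 1 ;;
        esac
    done
}

# Example usage:
#   validate_plugins hdfs,hive,hbase && echo "plugin list looks fine"
```
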
164+ ## 5. Examples
165+
166+ To explain how to use this CLI tool, assume we have the following environment:
167+
168+ **A Windows AD Server:**
169+
170+ Key|Value
171+ ---------:|:-----
172+ &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;IP|10.0.0.194
173+ Domain Name|corp.emr.local
174+ Base DN|cn=users,dc=corp,dc=emr,dc=local
175+ Bind DN|cn=ranger,ou=service accounts,dc=corp,dc=emr,dc=local
176+ Bind DN Password|Admin1234!
177+ User Object Class|person
178+
179+ **An Open LDAP Server:**
180+
181+ Key|Value
182+ ---------:|:-----
183+ &emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;&emsp;IP|10.0.0.41
184+ Base DN|dc=example,dc=com
185+ Bind DN|cn=ranger,ou=service accounts,dc=example,dc=com
186+ Bind DN Password|Admin1234!
187+ User DN Pattern|uid={0},dc=example,dc=com
188+ Group Search Filter|(member=uid={0},dc=example,dc=com)
189+ User Object Class|inetOrgPerson
190+
191+
192+ **A Multi-Master EMR Cluster:**
193+
194+ Node|IP
195+ ---:|:---
196+ &emsp;&emsp;&emsp;&emsp;&emsp;Master Nodes|10.0.0.177,10.0.0.199,10.0.0.21
197+ Core Nodes|10.0.0.114,10.0.0.136
198+
199+
200+ **A Normal EMR Cluster:**
201+
202+ Node|IP
203+ ---:|:---
204+ &emsp;&emsp;&emsp;&emsp;&emsp;Master Node|10.0.0.18
205+ Core Node|10.0.0.69
206+
207+ ### 5.1. Install Ranger + Integrate a Windows AD Server + Integrate a Multi-Master EMR Cluster
208+
209+ The following diagram illustrates what this example will do:
210+
211+ ![example1](https://user-images.githubusercontent.com/5539582/99872053-fc157000-2c19-11eb-94c4-ee36ed30ce14.png)
212+
213+ The following command line will finish this job:
214+
215+ ``` bash
216+ sudo sh ranger-emr-cli-installer/bin/setup.sh install \
217+ --auth-type ad \
218+ --ad-domain corp.emr.local \
219+ --ad-url ldap://10.0.0.194 \
220+ --ad-base-dn 'cn=users,dc=corp,dc=emr,dc=local' \
221+ --ad-bind-dn 'cn=ranger,ou=service accounts,dc=corp,dc=emr,dc=local' \
222+ --ad-bind-password 'Admin1234!' \
223+ --ad-user-object-class person \
224+ --ranger-plugins hdfs,hive,hbase \
225+ --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
226+ --emr-core-nodes 10.0.0.114,10.0.0.136 \
227+ --emr-ssh-key /home/ec2-user/key.pem
228+ ```
229+
230+ This CLI tool follows the principle of "convention over configuration": most parameters are preset with default values. A complete, equivalent version of the command above is as follows:
231+
232+ ``` bash
233+ sudo sh ranger-emr-cli-installer/bin/setup.sh install \
234+ --ranger-host $(hostname -i) \
235+ --java-home /usr/lib/jvm/java \
236+ --skip-install-mysql false \
237+ --mysql-host $(hostname -i) \
238+ --mysql-root-password 'Admin1234!' \
239+ --mysql-ranger-db-user-password 'Admin1234!' \
240+ --skip-install-solr false \
241+ --solr-host $(hostname -i) \
242+ --auth-type ad \
243+ --ad-domain corp.emr.local \
244+ --ad-url ldap://10.0.0.194 \
245+ --ad-base-dn 'cn=users,dc=corp,dc=emr,dc=local' \
246+ --ad-bind-dn 'cn=ranger,ou=service accounts,dc=corp,dc=emr,dc=local' \
247+ --ad-bind-password 'Admin1234!' \
248+ --ad-user-object-class person \
249+ --ranger-version 2.1.0 \
250+ --ranger-repo-url 'http://52.81.173.97:7080/ranger-repo/' \
251+ --ranger-plugins hdfs,hive,hbase \
252+ --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
253+ --emr-core-nodes 10.0.0.114,10.0.0.136 \
254+ --emr-ssh-key /home/ec2-user/key.pem \
255+ --restart-interval 30
257+
258+ You can adjust further parameters to suit your needs or environment, starting from the command above.
259+
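The "convention over configuration" behaviour can be pictured with ordinary shell parameter expansion. The sketch below is an assumption about how such defaults might be expressed and is not the installer's actual code: the `OPT_*` variable names and the `apply_defaults` function are hypothetical, while the default values are the ones shown in the help text and in the full command above.

```bash
#!/bin/sh
# Illustrative only: fall back to the documented defaults for unset options.
# The OPT_* names are hypothetical and are not read by setup.sh.

apply_defaults() {
    # $1 lets the caller supply the host IP; otherwise ask the system.
    current_ip=${1:-$(hostname -i)}
    OPT_RANGER_HOST=${OPT_RANGER_HOST:-$current_ip}
    OPT_MYSQL_HOST=${OPT_MYSQL_HOST:-$current_ip}
    OPT_SOLR_HOST=${OPT_SOLR_HOST:-$current_ip}
    OPT_JAVA_HOME=${OPT_JAVA_HOME:-/usr/lib/jvm/java}
    OPT_SKIP_INSTALL_MYSQL=${OPT_SKIP_INSTALL_MYSQL:-false}
    OPT_SKIP_INSTALL_SOLR=${OPT_SKIP_INSTALL_SOLR:-false}
    OPT_RANGER_VERSION=${OPT_RANGER_VERSION:-2.1.0}
    OPT_RESTART_INTERVAL=${OPT_RESTART_INTERVAL:-30}
}

# Example usage: only the MySQL host is overridden, everything else defaults.
#   OPT_MYSQL_HOST=10.0.0.9
#   apply_defaults
#   echo "$OPT_MYSQL_HOST $OPT_SOLR_HOST $OPT_JAVA_HOME"
```
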
260+ ### 5.2. Integrate a Second, Normal EMR Cluster
261+
262+ The following diagram illustrates what this example will do:
263+
264+ ![example2](https://user-images.githubusercontent.com/5539582/99872056-0172ba80-2c1a-11eb-9087-ea8e5ef353b7.png)
265+
266+ The following command line will finish this job:
267+
268+ ``` bash
269+ sudo sh ranger-emr-cli-installer/bin/setup.sh install-ranger-plugins \
270+ --ranger-host $(hostname -i) \
271+ --solr-host $(hostname -i) \
272+ --ranger-version 2.1.0 \
273+ --ranger-plugins hdfs,hive,hbase \
274+ --emr-master-nodes 10.0.0.18 \
275+ --emr-core-nodes 10.0.0.69 \
276+ --emr-ssh-key /home/ec2-user/key.pem \
277+ --restart-interval 30
279+
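If more clusters are added later, the same `install-ranger-plugins` command is repeated with different node lists. The dry-run sketch below only prints the commands so they can be reviewed first. `plugin_install_cmd` and `print_all` are hypothetical helpers, the Ranger host `10.0.0.5` in the usage note is a made-up address, and Solr is assumed to run on the Ranger host as in the example above.

```bash
#!/bin/sh
# Illustrative dry-run helpers; they only print commands and run nothing.

# Args: ranger_host master_nodes core_nodes
plugin_install_cmd() {
    printf '%s' "sudo sh ranger-emr-cli-installer/bin/setup.sh install-ranger-plugins"
    printf ' --ranger-host %s --solr-host %s' "$1" "$1"
    printf ' --ranger-version 2.1.0 --ranger-plugins hdfs,hive,hbase'
    printf ' --emr-master-nodes %s --emr-core-nodes %s' "$2" "$3"
    printf ' --emr-ssh-key /home/ec2-user/key.pem --restart-interval 30\n'
}

# Reads "master_nodes core_nodes" pairs from stdin, one cluster per line.
# Arg: ranger_host
print_all() {
    while read -r masters cores; do
        if [ -n "$masters" ]; then
            plugin_install_cmd "$1" "$masters" "$cores"
        fi
    done
}

# Example usage (the cluster below is the one from this section):
#   printf '%s\n' '10.0.0.18 10.0.0.69' | print_all 10.0.0.5
```

Printing rather than executing keeps the sketch safe to try; the printed lines can be pasted into a shell once they look right.
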
280+ ### 5.3. Install Ranger + Integrate an Open LDAP Server + Integrate a Multi-Master EMR Cluster
281+
282+ The following diagram illustrates what this example will do:
283+
284+ ![example3](https://user-images.githubusercontent.com/5539582/99872059-059ed800-2c1a-11eb-82e7-da5e21949d44.png)
285+
286+ The following command line will finish this job:
287+
288+ ``` bash
289+ sudo sh ranger-emr-cli-installer/bin/setup.sh install \
290+ --auth-type ldap \
291+ --ldap-url ldap://10.0.0.41 \
292+ --ldap-base-dn 'dc=example,dc=com' \
293+ --ldap-bind-dn 'cn=ranger,ou=service accounts,dc=example,dc=com' \
294+ --ldap-bind-password 'Admin1234!' \
295+ --ldap-user-dn-pattern 'uid={0},dc=example,dc=com' \
296+ --ldap-group-search-filter '(member=uid={0},dc=example,dc=com)' \
297+ --ldap-user-object-class inetOrgPerson \
298+ --ranger-plugins hdfs,hive,hbase \
299+ --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
300+ --emr-core-nodes 10.0.0.114,10.0.0.136 \
301+ --emr-ssh-key /home/ec2-user/key.pem
302+ ```
303+
304+ Again, a complete, equivalent version of the command above is as follows:
305+
306+ ``` bash
307+ sudo sh ranger-emr-cli-installer/bin/setup.sh install \
308+ --ranger-host $(hostname -i) \
309+ --java-home /usr/lib/jvm/java \
310+ --skip-install-mysql false \
311+ --mysql-host $(hostname -i) \
312+ --mysql-root-password 'Admin1234!' \
313+ --mysql-ranger-db-user-password 'Admin1234!' \
314+ --skip-install-solr false \
315+ --solr-host $(hostname -i) \
316+ --auth-type ldap \
317+ --ldap-url ldap://10.0.0.41 \
318+ --ldap-base-dn 'dc=example,dc=com' \
319+ --ldap-bind-dn 'cn=ranger,ou=service accounts,dc=example,dc=com' \
320+ --ldap-bind-password 'Admin1234!' \
321+ --ldap-user-dn-pattern 'uid={0},dc=example,dc=com' \
322+ --ldap-group-search-filter '(member=uid={0},dc=example,dc=com)' \
323+ --ldap-user-object-class inetOrgPerson \
324+ --ranger-version 2.1.0 \
325+ --ranger-repo-url 'http://52.81.173.97:7080/ranger-repo/' \
326+ --ranger-plugins hdfs,hive,hbase \
327+ --emr-master-nodes 10.0.0.177,10.0.0.199,10.0.0.21 \
328+ --emr-core-nodes 10.0.0.114,10.0.0.136 \
329+ --emr-ssh-key /home/ec2-user/key.pem \
330+ --restart-interval 30
331+ ```
332+
333+ You can adjust further parameters to suit your needs or environment, starting from the command above.
334+
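In the value passed to `--ldap-user-dn-pattern`, `{0}` is a placeholder that stands for the login name when a user authenticates. The sketch below shows the substitution for a made-up user `alice` so you can check that your pattern yields the DN you expect in the directory. `expand_user_dn` is a hypothetical helper for illustration and is not part of this tool or of Ranger.

```bash
#!/bin/sh
# Illustrative only: expand the first {0} placeholder in a user DN pattern.

# Args: pattern username
expand_user_dn() {
    pattern=$1
    user=$2
    token='{0}'
    case $pattern in
        *"$token"*)
            prefix=${pattern%%"$token"*}   # text before the placeholder
            suffix=${pattern#*"$token"}    # text after the placeholder
            printf '%s%s%s\n' "$prefix" "$user" "$suffix"
            ;;
        *)
            # No placeholder: return the pattern unchanged.
            printf '%s\n' "$pattern"
            ;;
    esac
}

# Example usage:
#   expand_user_dn 'uid={0},dc=example,dc=com' alice
```

Comparing the expanded DN with what a GUI tool such as LDAP Admin shows for the same user is a quick way to catch a wrong pattern before installing.
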
335+ ## 6. Versions & Compatibility
336+
337+ The following table shows Ranger and EMR version compatibility:
338+
339+ &nbsp;|Ranger 1.X|Ranger 2.X
340+ ---|---|---
341+ EMR 5.X|Y|N
342+ EMR 6.X|N|Y
343+
344+ Ranger 1 works with Hadoop 2, and Ranger 2 works with Hadoop 3. **This project is developed against Ranger 2.1.0, so for now it can only integrate with EMR 6.X.** Support for Ranger 1.2 + EMR 5.X will be developed next, depending on demand.
345+
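The compatibility table can be turned into a small guard for scripts that call the installer. This is a sketch under the assumption that EMR releases are written either as `6.2.0` or with the `emr-` prefix used in release labels; `ranger_major_for_emr` and `installer_supports_emr` are hypothetical names.

```bash
#!/bin/sh
# Illustrative only: encode the Ranger/EMR compatibility table.

# Prints the compatible Ranger major version for an EMR release.
ranger_major_for_emr() {
    case $1 in
        5.*|emr-5.*) echo 1 ;;
        6.*|emr-6.*) echo 2 ;;
        *) echo "unknown EMR release: $1" >&2; return 1 ;;
    esac
}

# Succeeds only for EMR releases this project can integrate (Ranger 2.x).
installer_supports_emr() {
    [ "$(ranger_major_for_emr "$1" 2>/dev/null)" = "2" ]
}

# Example usage:
#   installer_supports_emr emr-6.2.0 || echo "this installer needs EMR 6.X"
```
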