Commit ed1f687

Merge pull request #268354 from abhishjain002/patch-1
Update ranger-policies-for-spark.md
2 parents 2b44083 + ad43858 commit ed1f687

3 files changed

+16
-9
lines changed

articles/hdinsight/spark/ranger-policies-for-spark.md

Lines changed: 16 additions & 9 deletions
@@ -3,7 +3,7 @@ title: Configure Apache Ranger policies for Spark SQL in HDInsight with Enterpri
 description: This article describes how to configure Ranger policies for Spark SQL with Enterprise Security Package.
 ms.service: hdinsight-aks
 ms.topic: how-to
-ms.date: 02/12/2024
+ms.date: 03/07/2024
 ---
 
 # Configure Apache Ranger policies for Spark SQL in HDInsight with Enterprise Security Package
@@ -32,7 +32,7 @@ In this article, you learn how to:
 
 ## Create domain users
 
-For information on how to create **sparkuser** domain users, see [Create an HDInsight cluster with ESP](../domain-joined/apache-domain-joined-configure-using-azure-adds.md#create-an-hdinsight-cluster-with-esp). In a production scenario, domain users come from your Microsoft Entra tenant.
+For information on how to create `sparkuser` domain users, see [Create an HDInsight cluster with ESP](../domain-joined/apache-domain-joined-configure-using-azure-adds.md#create-an-hdinsight-cluster-with-esp). In a production scenario, domain users come from your Microsoft Entra tenant.
 
 ## Create a Ranger policy
 
@@ -61,7 +61,7 @@ In this section, you create two Ranger policies:
 | database | default |
 | table | hivesampletable |
 | column | * |
-| Select User | sparkuser |
+| Select User | `sparkuser` |
 | Permissions | select |
 
 :::image type="content" source="./media/ranger-policies-for-spark/sample-policy-details.png" alt-text="Screenshot that shows sample details for an access policy." lightbox="./media/ranger-policies-for-spark/sample-policy-details.png":::
@@ -101,7 +101,7 @@ The following example shows how to create a policy to mask a column:
 |Hive Database|default|
 |Hive Table| hivesampletable|
 |Hive Column|devicemake|
-|Select User|sparkuser|
+|Select User|`sparkuser`|
 |Access Types|select|
 |Select Masking Option|Hash|
 
@@ -145,7 +145,7 @@ Consider these points:
 In such cases, we recommend that you either:
 
 - Use the Hive catalog for both Hive and Spark.
-- Maintain different database, table, and column names for both Hive and Spark catalogs so that the policies are not applied to databases across catalogs.
+- Maintain different database, table, and column names for both Hive and Spark catalogs so that the policies aren't applied to databases across catalogs.
 
 - If you use the Hive catalog for both Hive and Spark, consider the following example.
 
@@ -174,9 +174,9 @@ Let's say that you have the policies defined in the Ranger repo already under th
 
 :::image type="content" source="./media/ranger-policies-for-spark/ambari-config-ranger-security.png" alt-text="Screenshot shows Ambari config ranger security." lightbox="./media/ranger-policies-for-spark/ambari-config-ranger-security.png":::
 
-You can also open this configuration in **/etc/spark3/conf** by using SSH.
+or You can also open this configuration in **/etc/spark3/conf** by using SSH.
 
-1. Edit two configurations (**ranger.plugin.spark.service.name** and **ranger.plugin.spark.policy.cache.dir**) to point to the old policy repo **oldclustername_hive**, and then save the configurations.
+Edit two configurations (**ranger.plugin.spark.service.name** and **ranger.plugin.spark.policy.cache.dir**) to point to the old policy repo **oldclustername_hive**, and then save the configurations.
 
 Ambari:
 
@@ -188,6 +188,14 @@ Let's say that you have the policies defined in the Ranger repo already under th
 
 1. Restart the Ranger and Spark services from Ambari.
 
+1. Open the Ranger admin UI and click on edit button under **HADOOP SQL** service.
+
+:::image type="content" source="./media/ranger-policies-for-spark/ranger-service-edit.png" alt-text="Screenshot that shows edit option for ranger service." lightbox="./media/ranger-policies-for-spark/ranger-service-edit.png":::
+
+1. For **oldclustername_hive** service, add **rangersparklookup** user in the **policy.download.auth.users** and **tag.download.auth.users** list and click save.
+
+:::image type="content" source="./media/ranger-policies-for-spark/add-new-user-ranger-lookup.png" alt-text="Screenshot that shows how to add user in Ranger service." lightbox="./media/ranger-policies-for-spark/add-new-user-ranger-lookup.png":::
+
 The policies are applied on databases in the Spark catalog. If you want to access the databases in the Hive catalog:
 
 1. In Ambari, go to **Spark3** > **Configs**.
@@ -198,5 +206,4 @@ The policies are applied on databases in the Spark catalog. If you want to acces
 ## Known issues
 
 - Apache Ranger integration with Spark SQL doesn't work if the Ranger admin is down.
-- The Ranger database can be overloaded if more than 20 Spark sessions are started concurrently because of continuous policy pulls.
-- In Ranger audit logs, when you hover over the **Resource** column, it doesn't show the entire query that you ran.
+- In Ranger audit logs, when you hover over the **Resource** column, it can't show the entire query that you ran.
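
The configuration steps touched by this change (repointing the two Spark plugin properties at **oldclustername_hive**, then adding **rangersparklookup** to the download-auth user lists) can be sketched roughly as below. This is only an illustration, not the article's procedure. The property names and the user name come from the diff. The helper names, the cache directory path, and the assumption that the auth-user properties hold a comma-separated list are hypothetical.

```python
# Illustrative sketch only. Property names are taken from the changed article;
# the helper names, the cache directory value, and the comma-separated list
# format are assumptions, not confirmed by the source.

def point_spark_plugin_at_repo(props, repo, cache_dir):
    """Return a copy of props with the two Ranger Spark plugin settings repointed."""
    updated = dict(props)
    updated["ranger.plugin.spark.service.name"] = repo
    updated["ranger.plugin.spark.policy.cache.dir"] = cache_dir
    return updated


def add_auth_user(csv_users, user):
    """Append user to a comma-separated list unless it is already present."""
    users = [u.strip() for u in csv_users.split(",") if u.strip()]
    if user not in users:
        users.append(user)
    return ",".join(users)


if __name__ == "__main__":
    current = {
        "ranger.plugin.spark.service.name": "newclustername_spark",  # hypothetical value
        "ranger.plugin.spark.policy.cache.dir": "/path/to/newclustername_spark",  # hypothetical value
    }
    repointed = point_spark_plugin_at_repo(
        current,
        repo="oldclustername_hive",
        cache_dir="/path/to/oldclustername_hive",  # hypothetical path
    )
    print(repointed)
    print(add_auth_user("hive,rangerlookup", "rangersparklookup"))  # hypothetical existing users
```

Both helpers are pure functions over in-memory values. Applying the result to a cluster still goes through Ambari and the Ranger admin UI as the article describes.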
