`articles/hdinsight/kafka/secure-spark-kafka-streaming-integration-scenario.md` (10 additions, 10 deletions)
ms.service: hdinsight
ms.topic: how-to
ms.author: piyushgupta
author: piyush-gupta1999
ms.date: 11/23/2023
---
# Secure Spark and Kafka – Spark streaming integration scenario
In this document, you learn how to execute a Spark job in a secure Spark cluster that reads from a topic in a secure Kafka cluster, provided the virtual networks are the same or peered.
**Prerequisites**
In the Kafka cluster, set up Ranger policies and produce data from the Kafka cluster:

1. Add a Ranger policy for `bobadmin` with all accesses to all topics, with the wildcard pattern `*`.

1. Execute the following commands based on your parameter values:

   ```
   sshuser@hn0-umasec:~$ sudo apt -y install jq
   ```
In the Spark cluster, add entries in `/etc/hosts` on the Spark worker nodes for the Kafka worker nodes:

1. Create a keytab for the user `alicetest` using the ktutil tool. Let's call this file `alicetest.keytab`.
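   A typical ktutil session for this step looks like the following sketch; the realm and the password value are placeholders you must replace with your own:

   ```
   ktutil
   ktutil: addent -password -p alicetest@<REALM> -k 1 -e RC4-HMAC
   Password for alicetest@<REALM>: <password>
   ktutil: wkt alicetest.keytab
   ktutil: q
   ```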
1. Create a `bobadmin_jaas.conf` as shown in the following sample:

   ```
   KafkaClient {
   ```
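   A complete `KafkaClient` section typically looks like the following sketch, where the keytab path and the principal's realm are assumed placeholder values to replace with your own:

   ```
   KafkaClient {
       com.sun.security.auth.module.Krb5LoginModule required
       useKeyTab=true
       storeKey=true
       keyTab="./bobadmin.keytab"
       useTicketCache=false
       serviceName="kafka"
       principal="bobadmin@<REALM>";
   };
   ```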
If you see the following error, it denotes a DNS (Domain Name System) issue. Make sure to check the Kafka worker node entries in the `/etc/hosts` file on the Spark cluster:

```
Caused by: GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))
```
From the Spark cluster, read from the Kafka topic `alicetopic2`, which the user `alicetest` is allowed to access:

1. From the YARN UI, access the YARN job output. You can see that the `alicetest` user is able to read from `alicetopic2`. You can see the word count in the output.

1. The following are the detailed steps to check the application output from the YARN UI:

1. Go to the YARN UI and open your application. Wait for the job to go to the RUNNING state. You'll see the application details.

1. Click Logs. You'll see the list of logs.

1. Click 'stdout'. You'll see the output with the count of words from your Kafka topic.

1. On the Kafka cluster's Ranger UI, audit logs for the same access are shown.
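For reference, reading `alicetopic2` as `alicetest` from the Spark cluster is typically done with a `spark-submit` command along the following lines; the application jar, class name, Kafka connector package version, `alicetest_jaas.conf` file name, and placeholder values are assumptions for illustration, not the article's exact command:

```
spark-submit --master yarn --deploy-mode cluster \
  --packages org.apache.spark:spark-sql-kafka-0-10_2.11:2.3.2 \
  --files alicetest_jaas.conf,alicetest.keytab \
  --driver-java-options "-Djava.security.auth.login.config=./alicetest_jaas.conf" \
  --conf "spark.executor.extraJavaOptions=-Djava.security.auth.login.config=./alicetest_jaas.conf" \
  --class com.example.KafkaWordCount kafka-wordcount.jar <kafka-broker-list> alicetopic2
```

The JAAS file is shipped to both the driver and the executors via `--files`, and each JVM is pointed at it through `java.security.auth.login.config`, so the Kafka client on every node can authenticate with Kerberos.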