
Commit a5eb558

Author: Sreekanth Iyer (Ushta Te Consultancy Services)
Commit message: Improved Acrolinx Score
1 parent 081141f · commit a5eb558

File tree

2 files changed: +18, -18 lines


articles/hdinsight-aks/flink/use-apache-nifi-with-datastream-api.md

Lines changed: 11 additions & 11 deletions
@@ -1,9 +1,9 @@
 ---
 title: Use Apache NiFi with HDInsight on AKS clusters running Apache Flink® to publish into ADLS Gen2
-description: Learn how to use Apache NiFi to consume processed Apache Kafka® topic from Apache Flink® on HDInsight on AKS clusters and publish into ADLS Gen2
+description: Learn how to use Apache NiFi to consume processed Apache Kafka® topics from Apache Flink® on HDInsight on AKS clusters and publish into ADLS Gen2.
 ms.service: hdinsight-aks
 ms.topic: how-to
-ms.date: 03/23/2024
+ms.date: 03/25/2024
 ---

 # Use Apache NiFi to consume processed Apache Kafka® topics from Apache Flink® and publish into ADLS Gen2
@@ -14,21 +14,21 @@ Apache NiFi is a software project from the Apache Software Foundation designed t

 For more information, see [Apache NiFi](https://nifi.apache.org).

-In this document, we process streaming data using HDInsight Kafka and perform some transformations on HDInsight Apache Flink on AKS, consume these topics and write the contents into ADLS Gen2 on Apache NiFi.
+In this document, we process streaming data using HDInsight Kafka, perform some transformations on Apache Flink on HDInsight on AKS, then consume these topics and write the contents into ADLS Gen2 with Apache NiFi.

 By combining the low-latency streaming features of Apache Flink and the dataflow capabilities of Apache NiFi, you can process events at high volume. This combination helps you trigger, enrich, and filter events to enhance the overall user experience. Both technologies complement each other with their strengths in event streaming and correlation.

 ## Prerequisites

 * [Flink cluster on HDInsight on AKS](../flink/flink-create-cluster-portal.md)
 * [Kafka cluster on HDInsight](../../hdinsight/kafka/apache-kafka-get-started.md)
-* You're required to ensure the network settings are taken care as described on [Using Kafka on HDInsight](../flink/process-and-consume-data.md); that's to make sure HDInsight on AKS and HDInsight clusters are in the same VNet
+* Ensure the network settings are configured as described in [Using Kafka on HDInsight](../flink/process-and-consume-data.md), so that the HDInsight on AKS and HDInsight clusters are in the same VNet.
 * For this demonstration, we're using a Windows VM as the Maven project development environment, in the same VNet as HDInsight on AKS.
 * For this demonstration, we're using an Ubuntu VM in the same VNet as HDInsight on AKS, with Apache NiFi 1.22.0 installed on it.

 ## Prepare HDInsight Kafka topic

-For purposes of this demonstration, we're using a HDInsight Kafka Cluster, let us prepare HDInsight Kafka topic for the demo.
+For the purposes of this demonstration, we're using an HDInsight Kafka cluster. Let us prepare an HDInsight Kafka topic for the demo.

 > [!NOTE]
 > Set up an HDInsight cluster with [Apache Kafka](../../hdinsight/kafka/apache-kafka-get-started.md) and replace the broker list with your own before you get started, for both Kafka 2.4 and 3.2.
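The exact topic-creation commands are elided from this diff. A minimal sketch of preparing a demo topic from an SSH session on a Kafka broker node follows; the topic name `click_events`, the partition and replication counts, and the bootstrap server host are assumptions to replace with your own values:

``` bash
# Create a demo topic (this syntax works for both Kafka 2.4 and 3.2).
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create \
  --bootstrap-server wn0-contoso:9092 \
  --partitions 3 --replication-factor 3 \
  --topic click_events

# Confirm the topic exists.
/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --list \
  --bootstrap-server wn0-contoso:9092
```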
@@ -73,7 +73,7 @@ Here, we configure NiFi properties in order to be accessed outside the localhost

 ## Process streaming data from Kafka cluster on HDInsight with Flink cluster on HDInsight on AKS

-Let us develop the source code on Maven, to build the jar.
+Let us develop the source code on Maven, and build the jar.

 **SinkToKafka.java**
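The unchanged body of SinkToKafka.java is elided from this diff. For orientation, here's a minimal sketch of a job with this shape, using the ClickSource and Event types referenced later in the article; the topic name, broker host, and builder details are assumptions, not the article's exact code:

``` java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class SinkToKafka {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Serialize each generated event as a string into the demo topic.
        KafkaSink<String> sink = KafkaSink.<String>builder()
                .setBootstrapServers("wn0-contoso:9092") // replace with your broker list
                .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                        .setTopic("click_events")        // assumed demo topic name
                        .setValueSerializationSchema(new SimpleStringSchema())
                        .build())
                .build();

        // ClickSource generates synthetic Event records (defined in the article).
        env.addSource(new ClickSource())
           .map(Event::toString)
           .sinkTo(sink);

        env.execute("Sink click events to Kafka");
    }
}
```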

@@ -182,7 +182,7 @@ public class ClickSource implements SourceFunction<Event> {
 ```
 **Maven pom.xml**

-You can replace 2.4.1 with 3.2.0 in case you're using Kafka 3.2.0 on HDInsight, where applicable on the pom.xml
+If you're using Kafka 3.2.0 on HDInsight, replace 2.4.1 with 3.2.0 where applicable in the pom.xml.
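Concretely, the version swap applies to the Kafka client dependency; a hedged sketch of the relevant fragment (the exact coordinates in the article's pom may differ):

``` xml
<!-- Pin the Kafka client to the HDInsight Kafka version; use 3.2.0 for Kafka 3.2.0. -->
<dependency>
    <groupId>org.apache.kafka</groupId>
    <artifactId>kafka-clients</artifactId>
    <version>2.4.1</version>
</dependency>
```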

 ``` xml
 <?xml version="1.0" encoding="UTF-8"?>
@@ -261,7 +261,7 @@ You can replace 2.4.1 with 3.2.0 in case you're using Kafka 3.2.0 on HDInsight,

 ## Submit streaming job to Flink cluster on HDInsight on AKS

-Now, lets submit streaming job as mentioned in the previous step into Flink cluster
+Now, let's submit the streaming job as mentioned in the previous step into the Flink cluster.

 :::image type="content" source="./media/use-apache-nifi-with-datastream-api/step-5-flink-ui-job-submission.png" alt-text="Screenshot showing how to submit the streaming job from Flink UI." border="true" lightbox="./media/use-apache-nifi-with-datastream-api/step-5-flink-ui-job-submission.png":::
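If you'd rather submit from the command line than the Flink UI, the Flink CLI in a web SSH session does the same; a sketch, where the main class and jar name are assumptions from this demo's Maven build:

``` bash
# Submit the packaged job; replace the class and jar with your build output.
bin/flink run -c contoso.example.SinkToKafka FlinkDemo-1.0-SNAPSHOT.jar
```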

@@ -300,13 +300,13 @@ root@hn0-contos:/home/sshuser# /usr/hdp/current/kafka-broker/bin/kafka-console-c
 > [!NOTE]
 > In this example, we use an Azure user-assigned managed identity as the credentials for ADLS Gen2.
-In this demonstration, we have used Apache NiFi instance installed on an Ubuntu VM. We're accessing the NiFi web interface from a Windows VM. The Ubuntu VM needs to have a managed identity assigned to it and network security group (NSG) rules configured.
+In this demonstration, we use an Apache NiFi instance installed on an Ubuntu VM and access the NiFi web interface from a Windows VM. The Ubuntu VM needs a managed identity assigned to it and network security group (NSG) rules configured.

 To use managed identity authentication with the PutAzureDataLakeStorage processor in NiFi, ensure that the Ubuntu VM on which NiFi is installed has a managed identity assigned to it, or assign one to the VM.
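If the identity isn't assigned yet, the Azure CLI can attach a user-assigned identity to the VM; a sketch with placeholder names:

``` bash
# Attach an existing user-assigned managed identity to the Ubuntu VM.
# Resource group, VM, and identity names below are placeholders.
az vm identity assign \
  --resource-group myResourceGroup \
  --name myNifiUbuntuVm \
  --identities /subscriptions/<subscription-id>/resourceGroups/myResourceGroup/providers/Microsoft.ManagedIdentity/userAssignedIdentities/myNifiIdentity
```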

 :::image type="content" source="./media/use-apache-nifi-with-datastream-api/step-6-nifi-ui-kafka-consumption.png" alt-text="Screenshot showing how to create a flow in Apache NiFi - Step 1." border="true" lightbox="./media/use-apache-nifi-with-datastream-api/step-6-nifi-ui-kafka-consumption.png":::

-Once you have assigned a managed identity to the Azure VM, you need to make sure that the VM can connect to the IMDS (Instance Metadata Service) endpoint. The IMDS endpoint is available at the IP address shown in this example. You need to update your network security group rules to allow outbound traffic from the Ubuntu VM to this IP address.
+Once you assign a managed identity to the Azure VM, make sure that the VM can connect to the Instance Metadata Service (IMDS) endpoint. The IMDS endpoint is available at the IP address shown in this example. Update your network security group rules to allow outbound traffic from the Ubuntu VM to this IP address.

 :::image type="content" source="./media/use-apache-nifi-with-datastream-api/step-6-2-nifi-ui-kafka-consumption.png" alt-text="Screenshot showing how to create a flow in Apache NiFi - Step 2." border="true" lightbox="./media/use-apache-nifi-with-datastream-api/step-6-2-nifi-ui-kafka-consumption.png":::
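A quick way to verify IMDS connectivity from the Ubuntu VM is to request a token for the storage resource; 169.254.169.254 is the well-known IMDS address, and the Metadata header is mandatory:

``` bash
# A JSON response containing "access_token" confirms that IMDS is
# reachable and the managed identity is assigned correctly.
curl -s -H "Metadata: true" \
  "http://169.254.169.254/metadata/identity/oauth2/token?api-version=2018-02-01&resource=https%3A%2F%2Fstorage.azure.com%2F"
```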

@@ -340,4 +340,4 @@ Once you have assigned a managed identity to the Azure VM, you need to make sure
 * [Azure Data Lake Storage](https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-azure-nar/1.12.0/org.apache.nifi.processors.azure.storage.PutAzureDataLakeStorage/index.html)
 * [ADLS Credentials Controller Service](https://nifi.apache.org/docs/nifi-docs/components/org.apache.nifi/nifi-azure-nar/1.12.0/org.apache.nifi.services.azure.storage.ADLSCredentialsControllerService/index.html)
 * [Download IntelliJ IDEA for development](https://www.jetbrains.com/idea/download/#section=windows)
-* Apache, Apache Kafka, Kafka, Apache Flink, Flink,Apache NiFi, NiFi and associated open source project names are [trademarks](../trademarks.md) of the [Apache Software Foundation](https://www.apache.org/) (ASF).
+* Apache, Apache Kafka, Kafka, Apache Flink, Flink, Apache NiFi, NiFi, and associated open source project names are [trademarks](../trademarks.md) of the [Apache Software Foundation](https://www.apache.org/) (ASF).

articles/hdinsight-aks/flink/use-flink-to-sink-kafka-message-into-hbase.md

Lines changed: 7 additions & 7 deletions
@@ -1,9 +1,9 @@
 ---
 title: Write messages to Apache HBase® with Apache Flink® DataStream API
-description: Learn how to write messages to Apache HBase with Apache Flink DataStream API
+description: Learn how to write messages to Apache HBase with Apache Flink DataStream API.
 ms.service: hdinsight-aks
 ms.topic: how-to
-ms.date: 03/23/2024
+ms.date: 03/25/2024
 ---

 # Write messages to Apache HBase® with Apache Flink® DataStream API
@@ -14,16 +14,16 @@ In this article, learn how to write messages to HBase with Apache Flink DataStre

 ## Overview

-Apache Flink offers HBase connector as a sink, with this connector with Flink you can store the output of a real-time processing application in HBase. Learn how to process streaming data on HDInsight Kafka as a source, perform transformations, then sink into HDInsight HBase table.
+Apache Flink offers the HBase connector as a sink; with this connector, you can store the output of a real-time processing application in HBase. Learn how to process streaming data from HDInsight Kafka as a source, perform transformations, then sink into an HDInsight HBase table.

-In a real world scenario, this example is a stream analytics layer to realize value from Internet of Things (IOT) analytics, which use live sensor data. The Flink Stream can read data from Kafka topic and write it to HBase table. If there is a real time streaming IOT application, the information can be gathered, transformed and optimized.
+In a real-world scenario, this example is a stream analytics layer that realizes value from Internet of Things (IoT) analytics using live sensor data. The Flink stream can read data from the Kafka topic and write it to the HBase table. In a real-time streaming IoT application, the information can be gathered, transformed, and optimized.

 ## Prerequisites

 * [Apache Flink cluster on HDInsight on AKS](../flink/flink-create-cluster-portal.md)
 * [Apache Kafka cluster on HDInsight](../flink/process-and-consume-data.md)
-* [Apache HBase 2.4.11 clusteron HDInsight](../../hdinsight/hbase/apache-hbase-tutorial-get-started-linux.md#create-apache-hbase-cluster)
+* [Apache HBase 2.4.11 cluster on HDInsight](../../hdinsight/hbase/apache-hbase-tutorial-get-started-linux.md#create-apache-hbase-cluster)
 * Ensure that the HDInsight on AKS cluster can connect to the HDInsight cluster; both must be in the same virtual network.
 * Maven project on IntelliJ IDEA for development on an Azure VM in the same VNet
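The article's KafkaSinkToHbase class (visible in the next hunk's context) is elided from this diff. The core of such a job is a rich sink function that opens an HBase connection once and writes each record as a Put; the sketch below shows that shape, with the ZooKeeper quorum, table name, and column layout as assumptions rather than the article's exact code:

``` java
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.sink.RichSinkFunction;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Writes each incoming string as one row in an HBase table.
public class HBaseSink extends RichSinkFunction<String> {
    private transient Connection connection;
    private transient Table table;

    @Override
    public void open(Configuration parameters) throws Exception {
        org.apache.hadoop.conf.Configuration conf = HBaseConfiguration.create();
        // ZooKeeper quorum of the HDInsight HBase cluster (placeholder hosts).
        conf.set("hbase.zookeeper.quorum", "zk0-contoso,zk1-contoso,zk2-contoso");
        connection = ConnectionFactory.createConnection(conf);
        table = connection.getTable(TableName.valueOf("sensor_events")); // assumed table
    }

    @Override
    public void invoke(String value, Context context) throws Exception {
        // Row key and column layout here are illustrative only.
        Put put = new Put(Bytes.toBytes(String.valueOf(System.currentTimeMillis())));
        put.addColumn(Bytes.toBytes("cf"), Bytes.toBytes("event"), Bytes.toBytes(value));
        table.put(put);
    }

    @Override
    public void close() throws Exception {
        if (table != null) table.close();
        if (connection != null) connection.close();
    }
}
```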

@@ -349,13 +349,13 @@ public class KafkaSinkToHbase {

 ### Submit job on Secure Shell

-We use [Flink CLI](./flink-web-ssh-on-portal-to-flink-sql.md) from Azure portal to submit jobs
+We use the [Flink CLI](./flink-web-ssh-on-portal-to-flink-sql.md) from the Azure portal to submit jobs.

 :::image type="content" source="./media/use-flink-to-sink-kafka-message-into-hbase/submit-job-on-web-ssh.png" alt-text="Screenshot showing how to submit job on web ssh." lightbox="./media/use-flink-to-sink-kafka-message-into-hbase/submit-job-on-web-ssh.png":::
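From that web SSH session, the submission itself is a single Flink CLI call; a sketch, where the class path and jar name are assumptions from this demo's Maven build:

``` bash
bin/flink run -c contoso.example.KafkaSinkToHbase FlinkHbaseDemo-1.0-SNAPSHOT.jar
```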

 ### Monitor job on Flink UI

-We can monitor the jobs on Flink Web UI
+We can monitor the jobs on the Flink Web UI.

 :::image type="content" source="./media/use-flink-to-sink-kafka-message-into-hbase/check-job-on-flink-ui.png" alt-text="Screenshot showing how to check job on Flink UI." lightbox="./media/use-flink-to-sink-kafka-message-into-hbase/check-job-on-flink-ui.png":::