Commit ec51ca3

Merge pull request #256420 from yeturis/patch-20
Update flink-overview.md
2 parents dc1d5fd + 3da0a1d commit ec51ca3

36 files changed, +361 -323 lines changed

articles/hdinsight-aks/TOC.yml

Lines changed: 11 additions & 11 deletions
@@ -153,17 +153,17 @@ items:
  href: ./trino/trino-airflow.md
  - name: Use AWS S3 and Glue with Trino cluster
  href: ./trino/trino-catalog-glue.md
- - name: Apache Flink
+ - name: Apache Flink®
  items:
- - name: What is Flink in HDInsight on AKS?
+ - name: What is Apache Flink® in Azure HDInsight on AKS?
  href: ./flink/flink-overview.md
- - name: Create Flink cluster
+ - name: Create Apache Flink cluster
  href: ./flink/flink-create-cluster-portal.md
  - name: Tutorials
  items:
  - name: Configuration management
  href: ./flink/flink-configuration-management.md
- - name: Flink job management
+ - name: Job management
  href: ./flink/flink-job-management.md
  - name: Hive dialect in Flink
  href: ./flink/hive-dialect-flink.md
@@ -195,7 +195,7 @@ items:
  href: ./flink/use-flink-delta-connector.md
  - name: Elastic and Kibana
  href: ./flink/sink-kafka-to-kibana.md
- - name: HDInsight Kafka
+ - name: Apache Kafka on HDInsight
  href: ./flink/process-and-consume-data.md
  - name: Flink SQL
  items:
@@ -207,9 +207,9 @@ items:
  href: ./flink/create-kafka-table-flink-kafka-sql-connector.md
  - name: Hive Catalog and FlinkSQL
  href: ./flink/use-hive-catalog.md
- - name: Flink-Iceberg Catalog
+ - name: Iceberg Catalog
  href: ./flink/flink-catalog-iceberg-hive.md
- - name: Flink-Delta Catalog
+ - name: Delta Catalog
  href: ./flink/flink-catalog-delta-hive.md
  - name: Flink streaming
  items:
@@ -237,11 +237,11 @@ items:
  items:
  - name: Flink cluster configuration
  href: ./flink/flink-cluster-configuration.md
- - name: Apache Spark
+ - name: Apache Spark
  items:
- - name: What is Apache Spark in HDInsight on AKS?
+ - name: What is Apache Spark in HDInsight on AKS?
  href: ./spark/hdinsight-on-aks-spark-overview.md
- - name: Create Spark cluster
+ - name: Create Apache Spark cluster
  href: ./spark/create-spark-cluster.md
  - name: How-to guides
  items:
@@ -253,7 +253,7 @@ items:
  href: ./spark/library-management.md
  - name: Connect to One Lake Storage
  href: ./spark/connect-to-one-lake-storage.md
- - name: Use Delta Lake scenario in Azure HDInsight on AKS Spark cluster
+ - name: Use Delta Lake scenario in Azure HDInsight on AKS cluster
  href: ./spark/azure-hdinsight-spark-on-aks-delta-lake.md
  - name: Use ML Notebook on Spark
  href: ./spark/use-machine-learning-notebook-on-spark.md

articles/hdinsight-aks/flink/assign-kafka-topic-event-message-to-azure-data-lake-storage-gen2.md

Lines changed: 10 additions & 8 deletions
@@ -1,22 +1,22 @@
  ---
- title: Write event messages into Azure Data Lake Storage Gen2 with DataStream API
- description: Learn how to write event messages into Azure Data Lake Storage Gen2 with DataStream API
+ title: Write event messages into Azure Data Lake Storage Gen2 with Apache Flink® DataStream API
+ description: Learn how to write event messages into Azure Data Lake Storage Gen2 with Apache Flink® DataStream API
  ms.service: hdinsight-aks
  ms.topic: how-to
- ms.date: 08/29/2023
+ ms.date: 10/27/2023
  ---

- # Write event messages into Azure Data Lake Storage Gen2 with DataStream API
+ # Write event messages into Azure Data Lake Storage Gen2 with Apache Flink® DataStream API

  [!INCLUDE [feature-in-preview](../includes/feature-in-preview.md)]

  Apache Flink uses file systems to consume and persistently store data, both for the results of applications and for fault tolerance and recovery. In this article, learn how to write event messages into Azure Data Lake Storage Gen2 with DataStream API.

  ## Prerequisites

- * [HDInsight on AKS Apache Flink 1.16.0](../flink/flink-create-cluster-portal.md)
- * [HDInsight Kafka](../../hdinsight/kafka/apache-kafka-get-started.md)
- * You're required to ensure the network settings are taken care as described on [Using HDInsight Kafka](../flink/process-and-consume-data.md); that's to make sure HDInsight on AKS Flink and HDInsight Kafka are in the same Virtual Network
+ * [Apache Flink cluster on HDInsight on AKS ](../flink/flink-create-cluster-portal.md)
+ * [Apache Kafka cluster on HDInsight](../../hdinsight/kafka/apache-kafka-get-started.md)
+ * You're required to ensure the network settings are taken care as described on [Using Apache Kafka on HDInsight](../flink/process-and-consume-data.md); that's to make sure HDInsight on AKS and HDInsight clusters are in the same Virtual Network
  * Use MSI to access ADLS Gen2
  * IntelliJ for development on an Azure VM in HDInsight on AKS Virtual Network

@@ -103,7 +103,7 @@ Flink provides an Apache Kafka connector for reading data from and writing data
  *abfsGen2.java*

  > [!Note]
- > Replace [HDInsight Kafka](../../hdinsight/kafka/apache-kafka-get-started.md)bootStrapServers with your own brokers for Kafka 2.4 or 3.2
+ > Replace [Apache Kafka on HDInsight cluster](../../hdinsight/kafka/apache-kafka-get-started.md) bootStrapServers with your own brokers for Kafka 2.4 or 3.2

  ``` java
  package contoso.example;
@@ -189,4 +189,6 @@ You can specify a rolling policy that rolls the in-progress part file on any of
  ## Reference
  - [Apache Kafka Connector](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/datastream/kafka)
  - [Flink DataStream Filesystem](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/datastream/filesystem)
+ - [Apache Flink Website](https://flink.apache.org/)
+ - Apache, Apache Kafka, Kafka, Apache Flink, Flink, and associated open source project names are [trademarks](../trademarks.md) of the [Apache Software Foundation](https://www.apache.org/) (ASF).

articles/hdinsight-aks/flink/azure-databricks.md

Lines changed: 12 additions & 8 deletions
@@ -1,19 +1,19 @@
  ---
- title: Incorporate Flink DataStream into Azure Databricks Delta Lake Table
- description: Learn about incorporate Flink DataStream into Azure Databricks Delta Lake Table in HDInsight on AKS - Apache Flink
+ title: Incorporate Apache Flink® DataStream into Azure Databricks Delta Lake Table
+ description: Learn about incorporate Apache Flink® DataStream into Azure Databricks Delta Lake Table
  ms.service: hdinsight-aks
  ms.topic: how-to
- ms.date: 10/05/2023
+ ms.date: 10/27/2023
  ---

- # Incorporate Flink DataStream into Azure Databricks Delta Lake Table
+ # Incorporate Apache Flink® DataStream into Azure Databricks Delta Lake Tables

- This example shows how to sink stream data landed into Azure ADLS Gen2 from HDInsight Flink cluster on AKS applications into Delta Lake tables using Azure Databricks Auto Loader.
+ This example shows how to sink stream data in Azure ADLS Gen2 from Apache Flink cluster on HDInsight on AKS into Delta Lake tables using Azure Databricks Auto Loader.

  ## Prerequisites

- - [HDInsight Flink 1.16.0 on AKS](./flink-create-cluster-portal.md)
- - [HDInsight Kafka 3.2.0](../../hdinsight/kafka/apache-kafka-get-started.md)
+ - [Apache Flink 1.16.0 on HDInsight on AKS](../flink/flink-create-cluster-portal.md)
+ - [Apache Kafka 3.2 on HDInsight](../../hdinsight/kafka/apache-kafka-get-started.md)
  - [Azure Databricks](/azure/databricks/getting-started/) in the same VNET as HDInsight on AKS
  - [ADLS Gen2](/azure/databricks/getting-started/connect-to-azure-storage/) and Service Principal

@@ -23,7 +23,7 @@ Databricks Auto Loader makes it easy to stream data land into object storage fro

  Here are the steps how you can use data from Flink in Azure Databricks delta live tables.

- ### Create Kafka table on Flink SQL
+ ### Create Apache Kafka® table on Apache Flink® SQL

  In this step, you can create Kafka table and ADLS Gen2 on Flink SQL. For the purpose of this document, we are using a airplanes_state_real_time table, you can use any topic of your choice.

@@ -144,3 +144,7 @@ AS SELECT * FROM cloud_files("dbfs:/mnt/contosoflinkgen2/flink/airplanes_state_r
  ### Check Delta Live Table on Azure Databricks Notebook

  :::image type="content" source="media/azure-databricks/delta-live-table-azure.png" alt-text="Screenshot shows check Delta Live Table on Azure Databricks Notebook." lightbox="media/azure-databricks/delta-live-table-azure.png":::
+
+ ### Reference
+
+ * Apache, Apache Kafka, Kafka, Apache Flink, Flink, and associated open source project names are [trademarks](../trademarks.md) of the [Apache Software Foundation](https://www.apache.org/) (ASF).

articles/hdinsight-aks/flink/azure-iot-hub.md

Lines changed: 9 additions & 4 deletions
@@ -1,19 +1,19 @@
  ---
- title: Process real-time IoT data on Flink with Azure HDInsight on AKS
- description: How to integrate Azure IoT Hub and Apache Flink
+ title: Process real-time IoT data on Apache Flink® with Azure HDInsight on AKS
+ description: How to integrate Azure IoT Hub and Apache Flink®
  ms.service: hdinsight-aks
  ms.topic: how-to
  ms.date: 10/03/2023
  ---

- # Process real-time IoT data on Flink with Azure HDInsight on AKS
+ # Process real-time IoT data on Apache Flink® with Azure HDInsight on AKS

  Azure IoT Hub is a managed service hosted in the cloud that acts as a central message hub for communication between an IoT application and its attached devices. You can connect millions of devices and their backend solutions reliably and securely. Almost any device can be connected to an IoT hub.

  ## Prerequisites

  1. [Create an Azure IoTHub](/azure/iot-hub/iot-hub-create-through-portal/)
- 2. [Create an HDInsight on AKS Flink cluster](./flink-create-cluster-portal.md)
+ 2. [Create Flink cluster on HDInsight on AKS](./flink-create-cluster-portal.md)

  ## Configure Flink cluster

@@ -207,3 +207,8 @@ public class StreamingJob {
  Submit job using HDInsight on AKS's [Flink job submission API](./flink-job-management.md)

  :::image type="content" source="./media/azure-iot-hub/create-new-job.png" alt-text="Screenshot shows create a new job." lightbox="./media/azure-iot-hub/create-new-job.png":::
+
+ ### Reference
+
+ - [Apache Flink Website](https://flink.apache.org/)
+ - Apache, Apache Kafka, Kafka, Apache Flink, Flink, and associated open source project names are [trademarks](../trademarks.md) of the [Apache Software Foundation](https://www.apache.org/) (ASF).

articles/hdinsight-aks/flink/change-data-capture-connectors-for-apache-flink.md

Lines changed: 8 additions & 8 deletions
@@ -1,12 +1,12 @@
  ---
- title: How to perform Change Data Capture of SQL Server with DataStream API and DataStream Source.
- description: Learn how to perform Change Data Capture of SQL Server with DataStream API and DataStream Source.
+ title: How to perform Change Data Capture of SQL Server with Apache Flink® DataStream API and DataStream Source.
+ description: Learn how to perform Change Data Capture of SQL Server with Apache Flink® DataStream API and DataStream Source.
  ms.service: hdinsight-aks
  ms.topic: how-to
  ms.date: 08/29/2023
  ---

- # Change Data Capture of SQL Server with DataStream API and DataStream Source
+ # Change Data Capture of SQL Server with Apache Flink® DataStream API and DataStream Source on HDInsight on AKS

  [!INCLUDE [feature-in-preview](../includes/feature-in-preview.md)]

@@ -16,11 +16,10 @@ In this article, learn how to perform Change Data Capture of SQL Server using Da

  ## Prerequisites

- * [HDInsight on AKS Apache Flink 1.16.0](../flink/flink-create-cluster-portal.md)
- * [HDInsight Kafka](../../hdinsight/kafka/apache-kafka-get-started.md)
- * You're required to ensure the network settings are taken care as described on [Using HDInsight Kafka](../flink/process-and-consume-data.md); that's to make sure HDInsight on AKS Flink and HDInsight Kafka are in the same VNet
+ * [Apache Flink cluster on HDInsight on AKS](../flink/flink-create-cluster-portal.md)
+ * [Apache Kafka cluster on HDInsight](../../hdinsight/kafka/apache-kafka-get-started.md)
+ * You're required to ensure the network settings are taken care as described on [Using HDInsight Kafka](../flink/process-and-consume-data.md); that's to make sure HDInsight on AKS and HDInsight clusters are in the same VNet
  * Azure SQLServer
- * HDInsight Kafka cluster and HDInsight on AKS Flink clusters are located in the same VNet
  * Install [IntelliJ IDEA](https://www.jetbrains.com/idea/download/#section=windows) for development on an Azure VM, which locates in HDInsight VNet

  ### SQLServer CDC Connector
@@ -325,4 +324,5 @@ public class mssqlSinkToKafka {

  * [SQLServer CDC Connector](https://github.com/ververica/flink-cdc-connectors/blob/master/docs/content/connectors/sqlserver-cdc.md) is licensed under [Apache 2.0 License](https://github.com/ververica/flink-cdc-connectors/blob/master/LICENSE)
  * [Apache Kafka in Azure HDInsight](../../hdinsight/kafka/apache-kafka-introduction.md)
- * [Flink Kafka Connector](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/datastream/kafka/#behind-the-scene)
+ * [Kafka Connector](https://nightlies.apache.org/flink/flink-docs-release-1.16/docs/connectors/datastream/kafka/#behind-the-scene)
+ * Apache, Apache Kafka, Kafka, Apache Flink, Flink, and associated open source project names are [trademarks](../trademarks.md) of the [Apache Software Foundation](https://www.apache.org/) (ASF).

articles/hdinsight-aks/flink/cosmos-db-for-apache-cassandra.md

Lines changed: 13 additions & 10 deletions
@@ -1,23 +1,25 @@
  ---
- title: Using Azure Cosmos DB (Apache Cassandra) with HDInsight on AKS - Flink
- description: Learn how to Sink HDInsight Kafka message into Azure Cosmos DB for Apache Cassandra, with Apache Flink running on HDInsight on AKS.
+ title: Using Azure Cosmos DB for Apache Cassandra® with HDInsight on AKS for Apache Flink®
+ description: Learn how to Sink Apache Kafka® message into Azure Cosmos DB for Apache Cassandra®, with Apache Flink® running on HDInsight on AKS.
  ms.service: hdinsight-aks
  ms.topic: how-to
  ms.date: 08/29/2023
  ---

- # Sink Kafka messages into Azure Cosmos DB for Apache Cassandra, with HDInsight on AKS - Flink
+ # Sink Apache Kafka® messages into Azure Cosmos DB for Apache Cassandra, with Apache Flink® on HDInsight on AKS

  [!INCLUDE [feature-in-preview](../includes/feature-in-preview.md)]

- This example uses [HDInsight on AKS Flink 1.16.0](../flink/flink-overview.md) to sink [HDInsight Kafka 3.2.0](/azure/hdinsight/kafka/apache-kafka-introduction) messages into [Azure Cosmos DB for Apache Cassandra](/azure/cosmos-db/cassandra/introduction)
+ This example uses [Apache Flink](../flink/flink-overview.md) to sink [HDInsight for Apache Kafka](/azure/hdinsight/kafka/apache-kafka-introduction) messages into [Azure Cosmos DB for Apache Cassandra](/azure/cosmos-db/cassandra/introduction)
+
+ This example is prominent when Engineers prefer real-time aggregated data for analysis. With access to historical aggregated data, you can build machine learning (ML) models to build insights or actions. You can also ingest IoT data into Apache Flink to aggregate data in real-time and store it in Apache Cassandra.

  ## Prerequisites

- * [HDInsight on AKS Flink 1.16.0](../flink/flink-create-cluster-portal.md)
- * [HDInsight 5.1 Kafka 3.2](../../hdinsight/kafka/apache-kafka-get-started.md)
+ * [Apache Flink 1.16.0 on HDInsight on AKS](../flink/flink-create-cluster-portal.md)
+ * [Apache Kafka 3.2 on HDInsight](../../hdinsight/kafka/apache-kafka-get-started.md)
  * [Azure Cosmos DB for Apache Cassandra](../../cosmos-db/cassandra/index.yml)
- * Prepare an Ubuntu VM as maven project development env in the same VNet as HDInsight on AKS.
+ * An Ubuntu VM for maven project development environment in the same VNet as HDInsight on AKS cluster.

  ## Azure Cosmos DB for Apache Cassandra

@@ -376,7 +378,7 @@ public class CassandraSink implements SinkFunction<Tuple3<Integer, String, Strin
  **main class: CassandraDemo.java**

  > [!Note]
- > * Replace Kafka Broker IPs with your cluster broker IPs
+ > * Replace Kafka Broker IPs with your Kafka cluster broker IPs
  > * Prepare topic
  > * user `/usr/hdp/current/kafka-broker/bin/kafka-topics.sh --create --replication-factor 2 --partitions 3 --topic user --bootstrap-server wn0-flinkd:9092`

@@ -461,7 +463,7 @@ Run UserProfile class in /azure-cosmos-db-cassandra-java-getting-started-main/sr
  bin/flink run -c com.azure.cosmosdb.cassandra.examples.UserProfile -j cosmosdb-cassandra-examples.jar
  ```

- ## Sink Kafka Topics into Cosmos DB (Apache Cassandra)
+ ## Sink Kafka Topics into Cosmos DB for Apache Cassandra

  Run CassandraDemo class to sink Kafka topic into Cosmos DB for Apache Cassandra

@@ -473,7 +475,7 @@ bin/flink run -c com.azure.cosmosdb.cassandra.examples.CassandraDemo -j cosmosdb

  ## Validate Apache Flink Job Submission

- Check job on HDInsight on AKS Flink UI
+ Check job on Flink Web UI on HDInsight on AKS Cluster

  :::image type="content" source="./media/cosmos-db-for-apache-cassandra/check-output-on-flink-ui.png" alt-text="Screenshot showing how to check the job on HDInsight on AKS Flink UI." lightbox="./media/cosmos-db-for-apache-cassandra/check-output-on-flink-ui.png":::

@@ -548,3 +550,4 @@ sshuser@hn0-flinkd:~$ python user.py | /usr/hdp/current/kafka-broker/bin/kafka-c
  * [Azure Cosmos DB for Apache Cassandra](../../cosmos-db/cassandra/introduction.md).
  * [Create a API for Cassandra account in Azure Cosmos DB](../../cosmos-db/cassandra/create-account-java.md)
  * [Azure Samples ](https://github.com/Azure-Samples/azure-cosmos-db-cassandra-java-getting-started)
+ * Apache, Apache Kafka, Kafka, Apache Flink, Flink, Apache Cassandra, Cassandra and associated open source project names are [trademarks](../trademarks.md) of the [Apache Software Foundation](https://www.apache.org/) (ASF).

articles/hdinsight-aks/flink/create-kafka-table-flink-kafka-sql-connector.md

Lines changed: 9 additions & 8 deletions
@@ -1,27 +1,27 @@
  ---
- title: How to create Kafka table on Apache FlinkSQL - Azure portal
- description: Learn how to create Kafka table on Apache FlinkSQL
+ title: How to create Apache Kafka table on an Apache Flink® on HDInsight on AKS
+ description: Learn how to create Apache Kafka table on Apache Flink®
  ms.service: hdinsight-aks
  ms.topic: how-to
- ms.date: 10/06/2023
+ ms.date: 10/27/2023
  ---

- # Create Kafka table on Apache FlinkSQL
+ # Create Apache Kafka® table on Apache Flink® on HDInsight on AKS

  [!INCLUDE [feature-in-preview](../includes/feature-in-preview.md)]

  Using this example, learn how to Create Kafka table on Apache FlinkSQL.

  ## Prerequisites

- * [HDInsight Kafka](../../hdinsight/kafka/apache-kafka-get-started.md)
- * [HDInsight on AKS Apache Flink 1.16.0](../flink/flink-create-cluster-portal.md)
+ * [Apache Kafka cluster on HDInsight](../../hdinsight/kafka/apache-kafka-get-started.md)
+ * [Apache Flink cluster on HDInsight on AKS](../flink/flink-create-cluster-portal.md)

  ## Kafka SQL connector on Apache Flink

  The Kafka connector allows for reading data from and writing data into Kafka topics. For more information, refer [Apache Kafka SQL Connector](https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/kafka)

- ## Create a Kafka table on Apache Flink SQL
+ ## Create a Kafka table on Flink SQL

  ### Prepare topic and data on HDInsight Kafka

@@ -123,7 +123,7 @@ Detailed instructions are provided on how to use Secure Shell for [Flink SQL cli

  ### Download Kafka SQL Connector & Dependencies into SSH

- We're using the **Kafka 3.2.0** dependencies in the below step, You're required to update the command based on your Kafka version on HDInsight.
+ We're using the **Kafka 3.2.0** dependencies in the below step, You're required to update the command based on your Kafka version on HDInsight cluster.
  ```
  wget https://repo1.maven.org/maven2/org/apache/kafka/kafka-clients/3.2.0/kafka-clients-3.2.0.jar
  wget https://repo1.maven.org/maven2/org/apache/flink/flink-connector-kafka/1.16.0/flink-connector-kafka-1.16.0.jar
@@ -181,3 +181,4 @@ Here are the streaming jobs on Flink Web UI
  ## Reference

  * [Apache Kafka SQL Connector](https://nightlies.apache.org/flink/flink-docs-master/docs/connectors/table/kafka)
+ * Apache, Apache Kafka, Kafka, Apache Flink, Flink, and associated open source project names are [trademarks](../trademarks.md) of the [Apache Software Foundation](https://www.apache.org/) (ASF).
