Commit 4b90882

Merge pull request #269818 from iemejia/rename-managed-airflow-mentions-hdinsight
Rename Data Factory Managed Airflow service to Workflow Orchestration Manager
2 parents e9e47d6 + aaa7d1d

5 files changed (+20, −20)

articles/energy-data-services/how-to-integrate-airflow-logs-with-azure-monitor.md

1 addition, 1 deletion

@@ -27,7 +27,7 @@ In this article, you'll learn how to start collecting Airflow Logs for your Micr
 
 
 ## Enabling diagnostic settings to collect logs in a storage account
-Every Azure Data Manager for Energy instance comes inbuilt with an Azure Data Factory-managed Airflow instance. We collect Airflow logs for internal troubleshooting and debugging purposes. Airflow logs can be integrated with Azure Monitor in the following ways:
+Every Azure Data Manager for Energy instance comes inbuilt with an Azure Data Factory Workflow Orchestration Manager (powered by Apache Airflow) instance. We collect Airflow logs for internal troubleshooting and debugging purposes. Airflow logs can be integrated with Azure Monitor in the following ways:
 
 * Storage account
 * Log Analytics workspace

articles/hdinsight-aks/TOC.yml

2 additions, 2 deletions

@@ -211,7 +211,7 @@ items:
   href: ./flink/azure-iot-hub.md
 - name: Azure Pipelines
   href: ./flink/use-azure-pipelines-to-run-flink-jobs.md
-- name: Azure Data Factory Managed Airflow
+- name: Azure Data Factory Workflow Orchestration Manager
   href: ./flink/flink-job-orchestration.md
 - name: Delta connectors
   href: ./flink/use-flink-delta-connector.md
@@ -267,7 +267,7 @@ items:
   href: ./spark/create-spark-cluster.md
 - name: How-to guides
   items:
-  - name: Azure Data Factory Managed Airflow
+  - name: Azure Data Factory Workflow Orchestration Manager
    href: ./spark/spark-job-orchestration.md
 - name: Configuration management
   href: ./spark/configuration-management.md

articles/hdinsight-aks/flink/flink-job-orchestration.md

8 additions, 8 deletions

@@ -1,16 +1,16 @@
 ---
-title: Azure Data Factory Managed Airflow with Apache Flink® on HDInsight on AKS
-description: Learn how to perform Apache Flink® job orchestration using Azure Data Factory Managed Airflow
+title: Azure Data Factory Workflow Orchestration Manager (powered by Apache Airflow) with Apache Flink® on HDInsight on AKS
+description: Learn how to perform Apache Flink® job orchestration using Azure Data Factory Workflow Orchestration Manager
 ms.service: hdinsight-aks
 ms.topic: how-to
 ms.date: 10/28/2023
 ---
 
-# Apache Flink® job orchestration using Azure Data Factory Managed Airflow
+# Apache Flink® job orchestration using Azure Data Factory Workflow Orchestration Manager (powered by Apache Airflow)
 
 [!INCLUDE [feature-in-preview](../includes/feature-in-preview.md)]
 
-This article covers managing a Flink job using [Azure REST API](flink-job-management.md#arm-rest-api) and orchestration data pipeline with Azure Data Factory Managed Airflow. [Azure Data Factory Managed Airflow](/azure/data-factory/concept-managed-airflow) service is a simple and efficient way to create and manage [Apache Airflow](https://airflow.apache.org/) environments, enabling you to run data pipelines at scale easily.
+This article covers managing a Flink job using [Azure REST API](flink-job-management.md#arm-rest-api) and orchestration data pipeline with Azure Data Factory Workflow Orchestration Manager. [Azure Data Factory Workflow Orchestration Manager](/azure/data-factory/concepts-workflow-orchestration-manager) service is a simple and efficient way to create and manage [Apache Airflow](https://airflow.apache.org/) environments, enabling you to run data pipelines at scale easily.
 
 Apache Airflow is an open-source platform that programmatically creates, schedules, and monitors complex data workflows. It allows you to define a set of tasks, called operators that can be combined into directed acyclic graphs (DAGs) to represent data pipelines.
 
@@ -37,7 +37,7 @@ It is recommended to rotate access keys or secrets periodically.
 ```
 
 
-1. Create Managed Airflow enable with [Azure Key Vault](/azure/data-factory/enable-azure-key-vault-for-managed-airflow) to store and manage your sensitive information in a secure and centralized manner. By doing this, you can use variables and connections, and they automatically be stored in Azure Key Vault. The name of connections and variables need to be prefixed by variables_prefix defined in AIRFLOW__SECRETS__BACKEND_KWARGS. For example, If variables_prefix has a value as hdinsight-aks-variables then for a variable key of hello, you would want to store your Variable at hdinsight-aks-variable -hello.
+1. Enable [Azure Key Vault for Workflow Orchestration Manager](/azure/data-factory/enable-azure-key-vault) to store and manage your sensitive information in a secure and centralized manner. By doing this, you can use variables and connections, and they automatically be stored in Azure Key Vault. The name of connections and variables need to be prefixed by variables_prefix defined in AIRFLOW__SECRETS__BACKEND_KWARGS. For example, If variables_prefix has a value as hdinsight-aks-variables then for a variable key of hello, you would want to store your Variable at hdinsight-aks-variable -hello.
 
 - Add the following settings for the Airflow configuration overrides in integrated runtime properties:
 
@@ -101,7 +101,7 @@ You can read more details about DAGs, Control Flow, SubDAGs, TaskGroups, etc. di
 
 ## DAG execution
 
-Example code is available on the [git](https://github.com/Azure-Samples/hdinsight-aks/blob/main/flink/airflow-python-sample-code); download the code locally on your computer and upload the wordcount.py to a blob storage. Follow the [steps](/azure/data-factory/how-does-managed-airflow-work#steps-to-import) to import DAG into your Managed Airflow created during setup.
+Example code is available on the [git](https://github.com/Azure-Samples/hdinsight-aks/blob/main/flink/airflow-python-sample-code); download the code locally on your computer and upload the wordcount.py to a blob storage. Follow the [steps](/azure/data-factory/how-does-workflow-orchestration-manager-work#steps-to-import) to import DAG into your workflow created during setup.
 
 The wordcount.py is an example of orchestrating a Flink job submission using Apache Airflow with HDInsight on AKS. The example is based on the wordcount example provided on [Apache Flink](https://nightlies.apache.org/flink/flink-docs-master/docs/dev/dataset/examples/).
 
@@ -115,9 +115,9 @@ The DAG expects to have setup for the Service Principal, as described during the
 
 ### Execution steps
 
-1. Execute the DAG from the [Airflow UI](https://airflow.apache.org/docs/apache-airflow/stable/ui.html), you can open the Azure Data Factory Managed Airflow UI by clicking on Monitor icon.
+1. Execute the DAG from the [Airflow UI](https://airflow.apache.org/docs/apache-airflow/stable/ui.html), you can open the Azure Data Factory Workflow Orchestration Manager UI by clicking on Monitor icon.
 
-   :::image type="content" source="./media/flink-job-orchestration/airflow-user-interface-step-1.png" alt-text="Screenshot shows open the Azure data factory managed airflow UI by clicking on monitor icon." lightbox="./media/flink-job-orchestration/airflow-user-interface-step-1.png":::
+   :::image type="content" source="./media/flink-job-orchestration/airflow-user-interface-step-1.png" alt-text="Screenshot shows open the Azure Data Factory Workflow Orchestration Manager UI by clicking on monitor icon." lightbox="./media/flink-job-orchestration/airflow-user-interface-step-1.png":::
 
 1. Select the “FlinkWordCountExample” DAG from the “DAGs” page.
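The variables_prefix convention in the Key Vault step changed above can be sketched in a few lines of Python (illustrative only: the helper name is ours, and we assume the backend joins prefix and key with a hyphen, which is the separator the article's own example implies):

```python
def key_vault_secret_name(variables_prefix: str, variable_key: str) -> str:
    """Hypothetical helper: build the Key Vault secret name that an Airflow
    Key Vault secrets backend would look up for a given variable key, by
    joining the configured prefix and the key with a hyphen separator."""
    return f"{variables_prefix}-{variable_key}"

# With variables_prefix = "hdinsight-aks-variables" and a variable key of
# "hello", the variable is stored as the secret
# "hdinsight-aks-variables-hello".
print(key_vault_secret_name("hdinsight-aks-variables", "hello"))
```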

articles/hdinsight-aks/spark/spark-job-orchestration.md

8 additions, 8 deletions

@@ -1,16 +1,16 @@
 ---
-title: Azure Data Factory Managed Airflow with Apache Spark® on HDInsight on AKS
-description: Learn how to perform Apache Spark® job orchestration using Azure Data Factory Managed Airflow
+title: Azure Data Factory Workflow Orchestration Manager (powered by Apache Airflow) with Apache Spark® on HDInsight on AKS
+description: Learn how to perform Apache Spark® job orchestration using Azure Data Factory Workflow Orchestration Manager
 ms.service: hdinsight-aks
 ms.topic: how-to
 ms.date: 11/28/2023
 ---
 
-# Apache Spark® job orchestration using Azure Data Factory Managed Airflow
+# Apache Spark® job orchestration using Azure Data Factory Workflow Orchestration Manager (powered by Apache Airflow)
 
 [!INCLUDE [feature-in-preview](../includes/feature-in-preview.md)]
 
-This article covers managing a Spark job using [Apache Spark Livy API](https://livy.incubator.apache.org/docs/latest/rest-api.html) and orchestration data pipeline with Azure Data Factory Managed Airflow. [Azure Data Factory Managed Airflow](/azure/data-factory/concept-managed-airflow) service is a simple and efficient way to create and manage [Apache Airflow](https://airflow.apache.org/) environments, enabling you to run data pipelines at scale easily.
+This article covers managing a Spark job using [Apache Spark Livy API](https://livy.incubator.apache.org/docs/latest/rest-api.html) and orchestration data pipeline with Azure Data Factory Workflow Orchestration Manager. [Azure Data Factory Workflow Orchestration Manager](/azure/data-factory/concepts-workflow-orchestration-manager) service is a simple and efficient way to create and manage [Apache Airflow](https://airflow.apache.org/) environments, enabling you to run data pipelines at scale easily.
 
 Apache Airflow is an open-source platform that programmatically creates, schedules, and monitors complex data workflows. It allows you to define a set of tasks, called operators that can be combined into directed acyclic graphs (DAGs) to represent data pipelines.
 
@@ -37,7 +37,7 @@ It is recommended to rotate access keys or secrets periodically (you can use va
 ```
 
 
-1. Create Managed Airflow enable with [Azure Key Vault](/azure/data-factory/enable-azure-key-vault-for-managed-airflow) to store and manage your sensitive information in a secure and centralized manner. By doing this, you can use variables and connections, and they automatically be stored in Azure Key Vault. The name of connections and variables need to be prefixed by variables_prefix defined in AIRFLOW__SECRETS__BACKEND_KWARGS. For example, If variables_prefix has a value as hdinsight-aks-variables then for a variable key of hello, you would want to store your Variable at hdinsight-aks-variable -hello.
+1. Enable [Azure Key Vault for Workflow Orchestration Manager](/azure/data-factory/enable-azure-key-vault) to store and manage your sensitive information in a secure and centralized manner. By doing this, you can use variables and connections, and they automatically be stored in Azure Key Vault. The name of connections and variables need to be prefixed by variables_prefix defined in AIRFLOW__SECRETS__BACKEND_KWARGS. For example, If variables_prefix has a value as hdinsight-aks-variables then for a variable key of hello, you would want to store your Variable at hdinsight-aks-variable -hello.
 
 - Add the following settings for the Airflow configuration overrides in integrated runtime properties:
 
@@ -101,7 +101,7 @@ You can read more details about DAGs, Control Flow, SubDAGs, TaskGroups, etc. di
 
 ## DAG execution
 
-Example code is available on the [git](https://github.com/sethiaarun/hdinsight-aks/blob/spark-airflow-example/spark/Airflow/airflow-python-example-code.py); download the code locally on your computer and upload the wordcount.py to a blob storage. Follow the [steps](/azure/data-factory/how-does-managed-airflow-work#steps-to-import) to import DAG into your Managed Airflow created during setup.
+Example code is available on the [git](https://github.com/sethiaarun/hdinsight-aks/blob/spark-airflow-example/spark/Airflow/airflow-python-example-code.py); download the code locally on your computer and upload the wordcount.py to a blob storage. Follow the [steps](/azure/data-factory/how-does-workflow-orchestration-manager-work#steps-to-import) to import DAG into your workflow created during setup.
 
 The airflow-python-example-code.py is an example of orchestrating a Spark job submission using Apache Spark with HDInsight on AKS. The example is based on [SparkPi](https://github.com/apache/spark/blob/master/examples/src/main/scala/org/apache/spark/examples/SparkPi.scala) example provided on Apache Spark.
 
@@ -115,9 +115,9 @@ The DAG expects to have setup for the Service Principal, as described during the
 
 ### Execution steps
 
-1. Execute the DAG from the [Airflow UI](https://airflow.apache.org/docs/apache-airflow/stable/ui.html), you can open the Azure Data Factory Managed Airflow UI by clicking on Monitor icon.
+1. Execute the DAG from the [Airflow UI](https://airflow.apache.org/docs/apache-airflow/stable/ui.html), you can open the Azure Data Factory Workflow Orchestration Manager UI by clicking on Monitor icon.
 
-   :::image type="content" source="./media/spark-job-orchestration/airflow-user-interface-step-1.png" alt-text="Screenshot shows open the Azure data factory managed airflow UI by clicking on monitor icon." lightbox="./media/spark-job-orchestration/airflow-user-interface-step-1.png":::
+   :::image type="content" source="./media/spark-job-orchestration/airflow-user-interface-step-1.png" alt-text="Screenshot shows open the Azure Data Factory Workflow Orchestration Manager UI by clicking on monitor icon." lightbox="./media/spark-job-orchestration/airflow-user-interface-step-1.png":::
 
 1. Select the “SparkWordCountExample” DAG from the “DAGs” page.
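For reference, the Airflow configuration overrides mentioned in both renamed articles generally take a shape like the following. This is a sketch under assumptions, not text from this commit: the backend class path is the one shipped in the apache-airflow-providers-microsoft-azure package, and the prefixes and vault URL are placeholder example values.

```
AIRFLOW__SECRETS__BACKEND: airflow.providers.microsoft.azure.secrets.key_vault.AzureKeyVaultBackend
AIRFLOW__SECRETS__BACKEND_KWARGS: {"connections_prefix": "airflow-connections", "variables_prefix": "hdinsight-aks-variables", "vault_url": "https://<your-key-vault>.vault.azure.net/"}
```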

articles/hdinsight-aks/whats-new.md

1 addition, 1 deletion

@@ -39,7 +39,7 @@ The following table list shows the features of HDInsight on AKS that are current
 | Auto Scale | Load based [Auto Scale](hdinsight-on-aks-autoscale-clusters.md#create-a-cluster-with-load-based-auto-scale), and Schedule based [Auto Scale](hdinsight-on-aks-autoscale-clusters.md#create-a-cluster-with-schedule-based-auto-scale) |
 | Customize and Configure Clusters | Support for [script actions](./manage-script-actions.md) during cluster creation, Support for [library management](./spark/library-management.md), [Service configuration](./service-configuration.md) settings after cluster creation |
 | Trino | Support for [Trino catalogs](./trino/trino-add-catalogs.md), [Trino CLI Support](./trino/trino-ui-command-line-interface.md), [DBeaver](./trino/trino-ui-dbeaver.md) support for query submission, Add or remove [plugins](./trino/trino-custom-plugins.md) and [connectors](./trino/trino-connectors.md), Support for [logging query](./trino/trino-query-logging.md) events, Support for [scan query statistics](./trino/trino-scan-stats.md) for any [Connector](./trino/trino-connectors.md) in Trino dashboard, Support for Trino [dashboard](./trino/trino-ui.md) to monitor queries, [Query Caching](./trino/trino-caching.md), Integration with Power BI, Integration with [Apache Superset](./trino/trino-superset.md), Redash, Support for multiple [connectors](./trino/trino-connectors.md) |
-| Flink | Support for Flink native web UI, Flink support with HMS for [DStream](./flink/use-hive-metastore-datastream.md), Submit jobs to the cluster using [REST API and Azure portal](./flink/flink-job-management.md), Run programs packaged as JAR files via the [Flink CLI](./flink/use-flink-cli-to-submit-jobs.md), Support for persistent Savepoints, Support for update the configuration options when the job is running, Connecting to multiple Azure services: [Azure Cosmos DB](./flink/cosmos-db-for-apache-cassandra.md), [Azure Databricks](./flink/azure-databricks.md), [Azure Data Explorer](./flink/integration-of-azure-data-explorer.md), [Azure Event Hubs](./flink/flink-how-to-setup-event-hub.md), [Azure IoT Hub](./flink/azure-iot-hub.md), [Azure Pipelines](./flink/use-azure-pipelines-to-run-flink-jobs.md), [Azure Data Factory Managed Airflow](./flink/flink-job-orchestration.md), [HDInsight Kafka](./flink/process-and-consume-data.md), Submit jobs to the cluster using [Flink CLI](./flink/use-flink-cli-to-submit-jobs.md) and [CDC](./flink/monitor-changes-postgres-table-flink.md) with Flink |
+| Flink | Support for Flink native web UI, Flink support with HMS for [DStream](./flink/use-hive-metastore-datastream.md), Submit jobs to the cluster using [REST API and Azure portal](./flink/flink-job-management.md), Run programs packaged as JAR files via the [Flink CLI](./flink/use-flink-cli-to-submit-jobs.md), Support for persistent Savepoints, Support for update the configuration options when the job is running, Connecting to multiple Azure services: [Azure Cosmos DB](./flink/cosmos-db-for-apache-cassandra.md), [Azure Databricks](./flink/azure-databricks.md), [Azure Data Explorer](./flink/integration-of-azure-data-explorer.md), [Azure Event Hubs](./flink/flink-how-to-setup-event-hub.md), [Azure IoT Hub](./flink/azure-iot-hub.md), [Azure Pipelines](./flink/use-azure-pipelines-to-run-flink-jobs.md), [Azure Data Factory Workflow Orchestration Manager](./flink/flink-job-orchestration.md), [HDInsight Kafka](./flink/process-and-consume-data.md), Submit jobs to the cluster using [Flink CLI](./flink/use-flink-cli-to-submit-jobs.md) and [CDC](./flink/monitor-changes-postgres-table-flink.md) with Flink |
 | Spark | [Jupyter Notebook](./spark/submit-manage-jobs.md), Support for [Delta lake](./spark/azure-hdinsight-spark-on-aks-delta-lake.md) 2.0, Zeppelin Support, Support ATS, Support for Yarn History server interface, Job submission using SSH, Job submission using SDK and [Machine Learning Notebook](./spark/azure-hdinsight-spark-on-aks-delta-lake.md) |
 
 ## Roadmap of Features
