Commit 22c4e81

Merge pull request #112483 from dagiro/freshness_c31
freshness_c31
2 parents 11fbb97 + f340599 commit 22c4e81

1 file changed: +17 -24 lines changed

articles/hdinsight/hdinsight-use-oozie-linux-mac.md

Lines changed: 17 additions & 24 deletions
@@ -6,7 +6,7 @@ ms.author: omidm
 ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: conceptual
-ms.date: 10/30/2019
+ms.date: 04/23/2020
 ---
 
 # Use Apache Oozie with Apache Hadoop to define and run a workflow on Linux-based Azure HDInsight
@@ -21,7 +21,7 @@ Learn how to use Apache Oozie with Apache Hadoop on Azure HDInsight. Oozie is a
 You can also use Oozie to schedule jobs that are specific to a system, like Java programs or shell scripts.
 
 > [!NOTE]
-> Another option to define workflows with HDInsight is to use Azure Data Factory. To learn more about Data Factory, see [Use Apache Pig and Apache Hive with Data Factory][azure-data-factory-pig-hive]. To use Oozie on clusters with Enterprise Security Package please see [Run Apache Oozie in HDInsight Hadoop clusters with Enterprise Security Package](domain-joined/hdinsight-use-oozie-domain-joined-clusters.md).
+> Another option to define workflows with HDInsight is to use Azure Data Factory. To learn more about Data Factory, see [Use Apache Pig and Apache Hive with Data Factory](../data-factory/transform-data.md). To use Oozie on clusters with Enterprise Security Package please see [Run Apache Oozie in HDInsight Hadoop clusters with Enterprise Security Package](domain-joined/hdinsight-use-oozie-domain-joined-clusters.md).
 
 ## Prerequisites
 
@@ -31,7 +31,7 @@ You can also use Oozie to schedule jobs that are specific to a system, like Java
 
 * **An Azure SQL Database**. See [Create an Azure SQL database in the Azure portal](../sql-database/sql-database-get-started.md). This article uses a database named **oozietest**.
 
-* The [URI scheme](./hdinsight-hadoop-linux-information.md#URI-and-scheme) for your clusters primary storage. This would be `wasb://` for Azure Storage, `abfs://` for Azure Data Lake Storage Gen2 or `adl://` for Azure Data Lake Storage Gen1. If secure transfer is enabled for Azure Storage, the URI would be `wasbs://`. See also, [secure transfer](../storage/common/storage-require-secure-transfer.md).
+* The URI scheme for your clusters primary storage. `wasb://` for Azure Storage, `abfs://` for Azure Data Lake Storage Gen2 or `adl://` for Azure Data Lake Storage Gen1. If secure transfer is enabled for Azure Storage, the URI would be `wasbs://`. See also, [secure transfer](../storage/common/storage-require-secure-transfer.md).
 
 ## Example workflow
 
@@ -49,10 +49,10 @@ The workflow used in this document contains two actions. Actions are definitions
 
 For more information about Hive, see [Use Apache Hive with HDInsight][hdinsight-use-hive].
 
-2. A Sqoop action exports the contents of the new Hive table to a table created in Azure SQL Database. For more information about Sqoop, see [Use Apache Sqoop with HDInsight][hdinsight-use-sqoop].
+2. A Sqoop action exports the contents of the new Hive table to a table created in Azure SQL Database. For more information about Sqoop, see [Use Apache Sqoop with HDInsight](hadoop/apache-hadoop-use-sqoop-mac-linux.md).
 
 > [!NOTE]
-> For supported Oozie versions on HDInsight clusters, see [What's new in the Hadoop cluster versions provided by HDInsight][hdinsight-versions].
+> For supported Oozie versions on HDInsight clusters, see [What's new in the Hadoop cluster versions provided by HDInsight](hdinsight-component-versioning.md).
 
 ## Create the working directory
 
@@ -84,7 +84,7 @@ Oozie expects you to store all the resources required for a job in the same dire
 
 ## Add a database driver
 
-Because this workflow uses Sqoop to export data to the SQL database, you must provide a copy of the JDBC driver used to interact with the SQL database. To copy the JDBC driver to the working directory, use the following command from the SSH session:
+This workflow uses Sqoop to export data to the SQL database. So you must provide a copy of the JDBC driver used to interact with the SQL database. To copy the JDBC driver to the working directory, use the following command from the SSH session:
 
 ```bash
 hdfs dfs -put /usr/share/java/sqljdbc_7.0/enu/mssql-jdbc*.jar /tutorials/useoozie/
@@ -269,7 +269,7 @@ Oozie workflow definitions are written in Hadoop Process Definition Language (hP
 
 ## Create the job definition
 
-The job definition describes where to find the workflow.xml. It also describes where to find other files used by the workflow, such as `useooziewf.hql`. In addition, it defines the values for properties used within the workflow and the associated files.
+The job definition describes where to find the workflow.xml. It also describes where to find other files used by the workflow, such as `useooziewf.hql`. Also, it defines the values for properties used within the workflow and the associated files.
 
 1. To get the full address of the default storage, use the following command. This address is used in the configuration file you create in the next step.
 
@@ -400,7 +400,7 @@ The following steps use the Oozie command to submit and manage Oozie workflows o
 export OOZIE_URL=http://HOSTNAMEt:11000/oozie
 ```
 
-3. To submit the job, use the following:
+3. To submit the job, use the following code:
 
 ```bash
 oozie job -config job.xml -submit
@@ -471,7 +471,7 @@ For more information on the Oozie command, see [Apache Oozie command-line tool](
 
 ## Oozie REST API
 
-With the Oozie REST API, you can build your own tools that work with Oozie. The following is HDInsight-specific information about the use of the Oozie REST API:
+With the Oozie REST API, you can build your own tools that work with Oozie. The following HDInsight-specific information about the use of the Oozie REST API:
 
 * **URI**: You can access the REST API from outside the cluster at `https://CLUSTERNAME.azurehdinsight.net/oozie`.
 
@@ -521,7 +521,7 @@ To access the Oozie web UI, complete the following steps:
 
 * **Job DAG**: The DAG is a graphical overview of the data paths taken through the workflow.
 
-![HDInsight Apache Oozie job dag](./media/hdinsight-use-oozie-linux-mac/hdinsight-oozie-job-dag.png)
+![`HDInsight Apache Oozie job dag`](./media/hdinsight-use-oozie-linux-mac/hdinsight-oozie-job-dag.png)
 
 7. If you select one of the actions from the **Job Info** tab, it brings up information for the action. For example, select the **RunSqoopExport** action.
 
@@ -648,9 +648,9 @@ With the Oozie UI, you can view Oozie logs. The Oozie UI also contains links to
 
 3. If available, use the URL from the action to view more details, such as the JobTracker logs, for the action.
 
-The following are specific errors you might encounter and how to resolve them.
+The following are specific errors you might come across and how to resolve them.
 
-### JA009: Cannot initialize cluster
+### JA009: Can't initialize cluster
 
 **Symptoms**: The job status changes to **SUSPENDED**. Details for the job show the `RunHiveScript` status as **START_MANUAL**. Selecting the action displays the following error message:
 
@@ -660,15 +660,15 @@ The following are specific errors you might encounter and how to resolve them.
 
 **Resolution**: Change the Blob storage addresses that the job uses.
 
-### JA002: Oozie is not allowed to impersonate <USER>
+### JA002: Oozie isn't allowed to impersonate <USER>
 
 **Symptoms**: The job status changes to **SUSPENDED**. Details for the job show the `RunHiveScript` status as **START_MANUAL**. If you select the action, it shows the following error message:
 
 JA002: User: oozie is not allowed to impersonate <USER>
 
 **Cause**: The current permission settings don't allow Oozie to impersonate the specified user account.
 
-**Resolution**: Oozie can impersonate users in the **users** group. Use the `groups USERNAME` to see the groups that the user account is a member of. If the user isn't a member of the **users** group, use the following command to add the user to the group:
+**Resolution**: Oozie can impersonate users in the **`users`** group. Use the `groups USERNAME` to see the groups that the user account is a member of. If the user isn't a member of the **`users`** group, use the following command to add the user to the group:
 
 sudo adduser USERNAME users
 
@@ -703,13 +703,6 @@ For example, for the job in this document, you would use the following steps:
 
 In this article, you learned how to define an Oozie workflow and how to run an Oozie job. To learn more about how to work with HDInsight, see the following articles:
 
-* [Upload data for Apache Hadoop jobs in HDInsight][hdinsight-upload-data]
-* [Use Apache Sqoop with Apache Hadoop in HDInsight][hdinsight-use-sqoop]
-* [Use Apache Hive with Apache Hadoop on HDInsight][hdinsight-use-hive]
-* [Develop Java MapReduce programs for HDInsight](hadoop/apache-hadoop-develop-deploy-java-mapreduce-linux.md)
-
-[azure-data-factory-pig-hive]: ../data-factory/transform-data.md
-[hdinsight-versions]: hdinsight-component-versioning.md
-[hdinsight-use-sqoop]:hadoop/apache-hadoop-use-sqoop-mac-linux.md
-[hdinsight-upload-data]: hdinsight-upload-data.md
-[hdinsight-use-hive]:hadoop/hdinsight-use-hive.md
+* [Upload data for Apache Hadoop jobs in HDInsight](hdinsight-upload-data.md)
+* [Use Apache Sqoop with Apache Hadoop in HDInsight](hadoop/apache-hadoop-use-sqoop-mac-linux.md)
+* [Use Apache Hive with Apache Hadoop on HDInsight](hadoop/hdinsight-use-hive.md)
