Commit ae9a640

Merge pull request #98251 from dagiro/freshness100 (freshness100)

2 parents 74f2be4 + 9f452f4

File tree

1 file changed (+7, -5 lines)


articles/hdinsight/hadoop/hdinsight-use-sqoop.md

Lines changed: 7 additions & 5 deletions
@@ -6,7 +6,7 @@ ms.author: hrasheed
 ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: conceptual
-ms.date: 04/12/2019
+ms.date: 12/06/2019
 ---
 
 # Use Apache Sqoop with Hadoop in HDInsight
@@ -17,7 +17,7 @@ Learn how to use Apache Sqoop in HDInsight to import and export data between an
 
 Although Apache Hadoop is a natural choice for processing unstructured and semi-structured data, such as logs and files, there may also be a need to process structured data that is stored in relational databases.
 
-[Apache Sqoop](https://sqoop.apache.org/docs/1.99.7/user.html) is a tool designed to transfer data between Hadoop clusters and relational databases. You can use it to import data from a relational database management system (RDBMS) such as SQL Server, MySQL, or Oracle into the Hadoop distributed file system (HDFS), transform the data in Hadoop with MapReduce or Apache Hive, and then export the data back into an RDBMS. In this article, you are using a SQL Server database for your relational database.
+[Apache Sqoop](https://sqoop.apache.org/docs/1.99.7/user.html) is a tool designed to transfer data between Hadoop clusters and relational databases. You can use it to import data from a relational database management system (RDBMS) such as SQL Server, MySQL, or Oracle into the Hadoop distributed file system (HDFS), transform the data in Hadoop with MapReduce or Apache Hive, and then export the data back into an RDBMS. In this article, you're using a SQL Server database for your relational database.
 
 > [!IMPORTANT]
 > This article sets up a test environment to perform the data transfer. You then choose a data transfer method for this environment from one of the methods in section [Run Sqoop jobs](#run-sqoop-jobs), further below.
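The import/transform/export flow that the changed paragraph describes can be sketched as Sqoop commands. This is a minimal illustration only, not a command from the article: the server, database, user, table, and HDFS paths are placeholder assumptions, and running it requires an HDInsight cluster with Sqoop and a reachable SQL Server database.

```sh
# Hypothetical sketch: pull a SQL Server table into HDFS, then push it back.
# Server, database, user, table, and paths below are placeholders.
sqoop import \
  --connect "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb" \
  --username sqluser -P \
  --table mytable \
  --target-dir /tutorials/usesqoop/data \
  -m 1

# After transforming the data with MapReduce or Hive, export it back to the RDBMS.
sqoop export \
  --connect "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb" \
  --username sqluser -P \
  --table mytable \
  --export-dir /tutorials/usesqoop/data \
  -m 1
```

`-P` prompts for the password interactively and `-m 1` runs a single map task; a real job would usually tune the mapper count to the table size.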
@@ -56,6 +56,7 @@ HDInsight cluster comes with some sample data. You use the following two samples
 In this article, you use these two datasets to test Sqoop import and export.
 
 ## <a name="create-cluster-and-sql-database"></a>Set up test environment
+
 The cluster, SQL database, and other objects are created through the Azure portal using an Azure Resource Manager template. The template can be found in [Azure quickstart templates](https://azure.microsoft.com/resources/templates/101-hdinsight-linux-with-sql-database/). The Resource Manager template calls a bacpac package to deploy the table schemas to a SQL database. The bacpac package is located in a public blob container, https://hditutorialdata.blob.core.windows.net/usesqoop/SqoopTutorial-2016-2-23-11-2.bacpac. If you want to use a private container for the bacpac files, use the following values in the template:
 
 ```json
@@ -107,12 +108,13 @@ HDInsight can run Sqoop jobs by using a variety of methods. Use the following ta
 
 ## Limitations
 
-* Bulk export - With Linux-based HDInsight, the Sqoop connector used to export data to Microsoft SQL Server or Azure SQL Database does not currently support bulk inserts.
+* Bulk export - With Linux-based HDInsight, the Sqoop connector used to export data to Microsoft SQL Server or Azure SQL Database doesn't currently support bulk inserts.
 * Batching - With Linux-based HDInsight, When using the `-batch` switch when performing inserts, Sqoop performs multiple inserts instead of batching the insert operations.
 
 ## Next steps
-Now you have learned how to use Sqoop. To learn more, see:
+
+Now you've learned how to use Sqoop. To learn more, see:
 
 * [Use Apache Hive with HDInsight](../hdinsight-use-hive.md)
-* [Use Apache Pig with HDInsight](../hdinsight-use-pig.md)
 * [Upload data to HDInsight](../hdinsight-upload-data.md): Find other methods for uploading data to HDInsight/Azure Blob storage.
+* [Use Apache Sqoop to import and export data between Apache Hadoop on HDInsight and SQL Database](./apache-hadoop-use-sqoop-mac-linux.md)
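The batching limitation mentioned in this hunk refers to export invocations of roughly this shape. Again a hedged sketch only: the connection string, user, table, and path are placeholder assumptions, and the command needs a cluster to run.

```sh
# Hypothetical export using the `-batch` switch discussed in the Limitations hunk;
# on Linux-based HDInsight this still issues multiple single-row inserts rather
# than one batched insert operation.
sqoop export \
  --connect "jdbc:sqlserver://myserver.database.windows.net:1433;database=mydb" \
  --username sqluser -P \
  --table mytable \
  --export-dir /tutorials/usesqoop/data \
  -batch
```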

0 commit comments
