articles/hdinsight/hadoop/apache-hadoop-use-sqoop-mac-linux.md
Lines changed: 83 additions & 23 deletions
@@ -1,15 +1,13 @@
1
1
---
2
2
title: Apache Sqoop with Apache Hadoop - Azure HDInsight
3
3
description: Learn how to use Apache Sqoop to import and export data between Apache Hadoop on HDInsight and an Azure SQL Database.
4
-
keywords: hadoop sqoop,sqoop
5
-
6
4
author: hrasheed-msft
7
5
ms.author: hrasheed
8
6
ms.reviewer: jasonh
9
7
ms.service: hdinsight
10
-
ms.custom: hdinsightactive,hdiseo17may2017
11
8
ms.topic: conceptual
12
-
ms.date: 04/15/2019
9
+
ms.custom: hdinsightactive,hdiseo17may2017
10
+
ms.date: 11/28/2019
13
11
---
14
12
15
13
# Use Apache Sqoop to import and export data between Apache Hadoop on HDInsight and SQL Database
@@ -22,58 +20,120 @@ Learn how to use Apache Sqoop to import and export between an Apache Hadoop clus
22
20
23
21
* Completion of [Set up test environment](./hdinsight-use-sqoop.md#create-cluster-and-sql-database) from [Use Apache Sqoop with Hadoop in HDInsight](./hdinsight-use-sqoop.md).
24
22
25
-
* A client to query the Azure SQL database. Consider using [SQL Server Management Studio](../../sql-database/sql-database-connect-query-ssms.md) or [Visual Studio Code](../../sql-database/sql-database-connect-query-vscode.md).
26
-
27
23
* An SSH client. For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md).
28
24
29
-
## Sqoop export
25
+
* Familiarity with Sqoop. For more information, see [Sqoop User Guide](https://sqoop.apache.org/docs/1.4.7/SqoopUserGuide.html).
30
26
31
-
From Hive to SQL Server.
27
+
## Set up
32
28
33
-
1. Use SSH to connect to the HDInsight cluster. Replace `CLUSTERNAME` with the name of your cluster, then enter the command:
29
+
1. Use the [ssh command](../hdinsight-hadoop-linux-use-ssh-unix.md) to connect to your cluster. Edit the command below by replacing `CLUSTERNAME` with the name of your cluster, and then enter the command:
2. Replace `MYSQLSERVER` with the name of your SQL Server. To verify that Sqoop can see your SQL Database, enter the command below in your open SSH connection. Enter the password for the SQL Server login when prompted. This command returns a list of databases.
35
+
1. For ease of use, set variables. Replace `PASSWORD`, `MYSQLSERVER`, and `MYDATABASE` with the relevant values, and then enter the commands below:
3. Replace `MYSQLSERVER` with the name of your SQL Server, and `MYDATABASE` with the name of your SQL database. To export data from the Hive `hivesampletable` table to the `mobiledata` table in SQL Database, enter the command below in your open SSH connection. Enter the password for the SQL Server login when prompted.
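The later commands reference `$serverConnect` and `$serverDbConnect`, so here is a minimal sketch of how the set-up variables might be defined. The intermediate variable names (`password`, `sqlserver`, `database`), the `sqluser` login, the `database.windows.net:1433` endpoint, and the `sshuser` account in the comment are assumptions for illustration. Substitute the values from your own environment.

```bash
# Connect to the cluster first (the account name "sshuser" is an assumption):
#   ssh sshuser@CLUSTERNAME-ssh.azurehdinsight.net

# Placeholders: replace with your own password, server name, and database name.
export password='PASSWORD'
export sqlserver='MYSQLSERVER'
export database='MYDATABASE'

# Server-level JDBC URL (assumed login "sqluser"), used to list databases.
export serverConnect="jdbc:sqlserver://${sqlserver}.database.windows.net:1433;user=sqluser;password=${password}"

# Database-level JDBC URL, used by the export, eval, and import commands.
export serverDbConnect="${serverConnect};database=${database}"

# Print the result to check it before running any Sqoop command.
echo "$serverDbConnect"
```

Quoting the variable where it is used (for example `--connect "$serverDbConnect"`) keeps the shell from word-splitting or glob-expanding a password that contains spaces or wildcard characters.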
47
+
## Sqoop export
48
+
49
+
From Hive to SQL Server.
50
+
51
+
1. To verify that Sqoop can see your SQL Database, enter the command below in your open SSH connection. This command returns a list of databases.
52
+
53
+
```bash
54
+
sqoop list-databases --connect $serverConnect
55
+
```
56
+
57
+
1. Enter the following command to see a list of tables for the specified database:
4. To verify that data was exported, use the following queries from your SQL client to view the exported data:
63
+
1. To export data from the Hive `hivesampletable` table to the `mobiledata` table in SQL Database, enter the command below in your open SSH connection:
52
64
53
-
```sql
54
-
SELECT COUNT(*) FROM [dbo].[mobiledata] WITH (NOLOCK);
55
-
SELECT TOP(25) * FROM [dbo].[mobiledata] WITH (NOLOCK);
65
+
```bash
66
+
sqoop export --connect $serverDbConnect \
67
+
--table mobiledata \
68
+
--hcatalog-table hivesampletable
69
+
```
70
+
71
+
1. To verify that data was exported, use the following queries from your SSH connection to view the exported data:
72
+
73
+
```bash
74
+
sqoop eval --connect $serverDbConnect \
75
+
--query "SELECT COUNT(*) from dbo.mobiledata WITH (NOLOCK)"
76
+
77
+
78
+
sqoop eval --connect $serverDbConnect \
79
+
--query "SELECT TOP(10) * from dbo.mobiledata WITH (NOLOCK)"
56
80
```
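The table-listing step earlier in this section can be sketched as follows, using Sqoop's `list-tables` tool against the database-level connection string. The `run_or_show` helper and the placeholder default for `serverDbConnect` are hypothetical additions so the sketch can be tried safely. On the cluster, only the final `sqoop list-tables` invocation matters.

```bash
# Hypothetical placeholder so the sketch also runs outside the tutorial session.
serverDbConnect="${serverDbConnect:-jdbc:sqlserver://MYSQLSERVER.database.windows.net:1433;user=sqluser;password=PASSWORD;database=MYDATABASE}"

# Hypothetical helper: run the command if the tool is installed,
# otherwise print what would have been run.
run_or_show() {
    if command -v "$1" >/dev/null 2>&1; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# List the tables in the database named in the connection string.
run_or_show sqoop list-tables --connect "$serverDbConnect"
```

If `mobiledata` does not appear in the output, the export has nothing to write to, so this check is worth running before `sqoop export`.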
57
81
58
82
## Sqoop import
59
83
60
84
From SQL Server to Azure storage.
61
85
62
-
1. Replace `MYSQLSERVER` with the name of your SQL Server, and `MYDATABASE` with the name of your SQL database. Enter the command below in your open SSH connection to import data from the `mobiledata` table in SQL Database, to the `wasb:///tutorials/usesqoop/importeddata` directory on HDInsight. Enter the password for the SQL Server login when prompted. The fields in the data are separated by a tab character, and the lines are terminated by a new-line character.
86
+
1. Enter the command below in your open SSH connection to import data from the `mobiledata` table in SQL Database to the `wasbs:///tutorials/usesqoop/importeddata` directory on HDInsight. The fields in the data are separated by a tab character, and the lines are terminated by a newline character.
1. Execute each query below one at a time and review the output:
124
+
125
+
```hql
126
+
show tables;
127
+
describe mobiledata_imported2;
128
+
SELECT COUNT(*) FROM mobiledata_imported2;
129
+
SELECT * FROM mobiledata_imported2 LIMIT 10;
130
+
```
131
+
132
+
1. Exit beeline with `!exit`.
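The first import step in this section describes tab-separated fields and newline-terminated lines written to `wasbs:///tutorials/usesqoop/importeddata`. A sketch of a `sqoop import` call with those properties might look like the following. The `targetDir` variable, the placeholder connection string, the single-mapper setting, and the `command -v` guard are assumptions added for illustration, not necessarily the exact command from the article.

```bash
# Hypothetical placeholder so the sketch also runs outside the tutorial session.
serverDbConnect="${serverDbConnect:-jdbc:sqlserver://MYSQLSERVER.database.windows.net:1433;user=sqluser;password=PASSWORD;database=MYDATABASE}"

# Destination directory named in the import step.
targetDir='wasbs:///tutorials/usesqoop/importeddata'

if command -v sqoop >/dev/null 2>&1; then
    # -m 1 uses a single map task (an assumption), so no split column is needed.
    sqoop import --connect "$serverDbConnect" \
        --table mobiledata \
        --target-dir "$targetDir" \
        --fields-terminated-by '\t' \
        --lines-terminated-by '\n' \
        -m 1
else
    echo "sqoop is not on PATH; run this from the cluster SSH session"
fi
```

Sqoop interprets the `\t` and `\n` escape sequences itself, so single quotes are used to pass them through the shell unchanged.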
133
+
74
134
## Limitations
75
135
76
-
* Bulk export - With Linux-based HDInsight, the Sqoop connector used to export data to Microsoft SQL Server or Azure SQL Database does not support bulk inserts.
136
+
* Bulk export - With Linux-based HDInsight, the Sqoop connector used to export data to Microsoft SQL Server or Azure SQL Database doesn't support bulk inserts.
77
137
78
138
* Batching - With Linux-based HDInsight, when you use the `--batch` switch to perform inserts, Sqoop makes multiple inserts instead of batching the insert operations.
79
139
@@ -91,7 +151,7 @@ From SQL Server to Azure storage.
91
151
92
152
## Next steps
93
153
94
-
Now you have learned how to use Sqoop. To learn more, see:
154
+
Now you've learned how to use Sqoop. To learn more, see:
95
155
96
156
* [Use Apache Oozie with HDInsight](../hdinsight-use-oozie-linux-mac.md): Use Sqoop action in an Oozie workflow.
97
157
* [Analyze flight delay data using HDInsight](../interactive-query/interactive-query-tutorial-analyze-flight-data.md): Use Interactive Query to analyze flight delay data, and then use Sqoop to export data to an Azure SQL database.