articles/hdinsight/interactive-query/apache-hive-warehouse-connector.md (13 additions, 13 deletions)
@@ -31,7 +31,7 @@ Some of the operations supported by the Hive Warehouse Connector are:
## Hive Warehouse Connector setup
-Follow these steps to setup the Hive Warehouse Connector between a Spark and Interactive Query cluster in Azure HDInsight:
+Follow these steps to set up the Hive Warehouse Connector between a Spark cluster and an Interactive Query cluster in Azure HDInsight:
### Create clusters
@@ -41,7 +41,7 @@ Follow these steps to setup the Hive Warehouse Connector between a Spark and Int
### Modify hosts file
-Copy the node information from the `/etc/hosts` file on headnode0 of your Interactive Query cluster and concatenate the information to the `/etc/hosts` file on the headnode0 of your Spark cluster. This step will allow your Spark cluster to resolve IP addresses of the nodes in Interactive Query cluster. View the contents of the updated file with `cat /etc/hosts`. The output should look something like what is shown in the screenshot below.
+Copy the node information from the `/etc/hosts` file on headnode0 of your Interactive Query cluster and concatenate the information to the `/etc/hosts` file on headnode0 of your Spark cluster. This step allows your Spark cluster to resolve the IP addresses of the nodes in the Interactive Query cluster. View the contents of the updated file with `cat /etc/hosts`. The final output should look something like what is shown in the screenshot below.
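The merge described above can be sketched as follows. This uses placeholder hostnames and scratch files rather than a real cluster's `/etc/hosts`; on an actual cluster you would copy the relevant lines from the Interactive Query headnode0 and append them directly to `/etc/hosts` on the Spark headnode0 (with `sudo`).

```shell
# Sketch of the hosts-file merge using placeholder entries (the hostnames and
# IPs below are illustrative, not real HDInsight values).
# Entries captured from /etc/hosts on the Interactive Query headnode0:
cat > llap-hosts.txt <<'EOF'
10.0.0.18 hn0-llap.example.internal.cloudapp.net
10.0.0.21 wn0-llap.example.internal.cloudapp.net
EOF
# Stand-in for the Spark headnode's /etc/hosts; on a real cluster, append to
# /etc/hosts itself with sudo.
cp /etc/hosts spark-hosts.txt
cat llap-hosts.txt >> spark-hosts.txt
# Verify the Interactive Query entries are now present in the merged file:
grep -c 'llap' spark-hosts.txt
```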
@@ -69,14 +69,14 @@ From your Spark Ambari web UI, navigate to **Spark2** > **CONFIGS** > **Custom s
Select **Add Property...** as needed to add/update the following:
-| Key | Value | Comment |
-|----|----|----|
-|`spark.hadoop.hive.llap.daemon.service.hosts`|The value you obtained earlier from **hive.llap.daemon.service.hosts**.||
-|`spark.sql.hive.hiveserver2.jdbc.url`|`jdbc:hive2://LLAPCLUSTERNAME.azurehdinsight.net:443/;user=admin;password=PWD;ssl=true;transportMode=http;httpPath=/hive2`|Set to the JDBC connection string, which connects to Hiveserver2 on the Interactive Query cluster. REPLACE `LLAPCLUSTERNAME` with the name of your Interactive Query cluster. Replace `PWD` with the actual password.|
-|`spark.datasource.hive.warehouse.load.staging.dir`|`wasbs://STORAGE_CONTAINER_NAME@STORAGE_ACCOUNT_NAME.blob.core.windows.net/tmp`|Set to a suitable HDFS-compatible staging directory. If you have two different clusters, the staging directory should be a folder in the staging directory of the LLAP cluster’s storage account so that HiveServer2 has access to it. Replace `STORAGE_ACCOUNT_NAME` with the name of the storage account being used by the cluster, and `STORAGE_CONTAINER_NAME` with the name of the storage container.|
-|`spark.datasource.hive.warehouse.metastoreUri`|The value you obtained earlier from **hive.metastore.uris**.||
-|`spark.security.credentials.hiveserver2.enabled`|`false`|`false` for YARN client deploy mode.|
-|`spark.hadoop.hive.zookeeper.quorum`|The value you obtained earlier from **hive.zookeeper.quorum**.||
+| Key | Value |
+|----|----|
+|`spark.hadoop.hive.llap.daemon.service.hosts`|The value you obtained earlier from **hive.llap.daemon.service.hosts**.|
+|`spark.sql.hive.hiveserver2.jdbc.url`|`jdbc:hive2://LLAPCLUSTERNAME.azurehdinsight.net:443/;user=admin;password=PWD;ssl=true;transportMode=http;httpPath=/hive2`. Set to the JDBC connection string, which connects to HiveServer2 on the Interactive Query cluster. Replace `LLAPCLUSTERNAME` with the name of your Interactive Query cluster. Replace `PWD` with the actual password.|
+|`spark.datasource.hive.warehouse.load.staging.dir`|`wasbs://STORAGE_CONTAINER_NAME@STORAGE_ACCOUNT_NAME.blob.core.windows.net/tmp`. Set to a suitable HDFS-compatible staging directory. If you have two different clusters, the staging directory should be a folder in the staging directory of the LLAP cluster’s storage account so that HiveServer2 has access to it. Replace `STORAGE_ACCOUNT_NAME` with the name of the storage account being used by the cluster, and `STORAGE_CONTAINER_NAME` with the name of the storage container.|
+|`spark.datasource.hive.warehouse.metastoreUri`|The value you obtained earlier from **hive.metastore.uris**.|
+|`spark.security.credentials.hiveserver2.enabled`|`false` for YARN client deploy mode.|
+|`spark.hadoop.hive.zookeeper.quorum`|The value you obtained earlier from **hive.zookeeper.quorum**.|
Save changes and restart components as needed.
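These configurations only take effect for Spark sessions that also have the connector on their classpath. As a command-line sketch (the assembly jar path and version are placeholders that vary by HDInsight release; check your cluster for the actual file name):

```shell
# Sketch: start spark-shell with the Hive Warehouse Connector assembly jar.
# The jar path below is a placeholder; look under
# /usr/hdp/current/hive_warehouse_connector/ on your cluster for the real name.
spark-shell --master yarn \
  --jars /usr/hdp/current/hive_warehouse_connector/hive-warehouse-connector-assembly-<version>.jar
```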
@@ -167,7 +167,7 @@ Spark doesn’t natively support writing to Hive’s managed ACID tables. Using
1. View the results with the following command:
```scala
-hive.table("sampletable_colorado2").show()
+hive.table("sampletable_colorado").show()
```
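For context, the `hive` object used above is a Hive Warehouse Connector session. A minimal sketch of how it is typically constructed in spark-shell (assuming the HWC assembly jar is on the classpath and the Ambari configuration described earlier has been applied; not runnable outside a cluster):

```scala
// Sketch: build a HiveWarehouseSession inside spark-shell.
// Assumes the HWC assembly jar is on the classpath and the Spark
// configuration described earlier is in place.
import com.hortonworks.hwc.HiveWarehouseSession

val hive = HiveWarehouseSession.session(spark).build()
hive.table("sampletable_colorado").show()
```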
@@ -190,7 +190,7 @@ Follow the steps below to create a Hive Warehouse Connector example that ingests
1. Open a second SSH session on the same Spark cluster.
1. At the command prompt, type `nc -lk 9999`. This command uses the netcat utility to send data from the command line to the specified port.
-1. Return the the first SSH session and create a new Hive table to hold the streaming data. At the spark-shell, enter the following command:
+1. Return to the first SSH session and create a new Hive table to hold the streaming data. At the spark-shell, enter the following command:
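The read side of this streaming example can be sketched with the standard Spark Structured Streaming socket source (the host and port here simply mirror the `nc -lk 9999` command above; this is illustrative, not the article's full ingestion code):

```scala
// Sketch: subscribe to the netcat stream opened in the second SSH session.
// localhost:9999 matches the `nc -lk 9999` command above.
val lines = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()
// `lines` is a streaming DataFrame with one row per line typed into netcat.
```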