articles/hdinsight/hadoop/apache-hadoop-visual-studio-tools-get-started.md
22 additions & 18 deletions
Original file line number
Diff line number
Diff line change
@@ -1,23 +1,23 @@
1
1
---
2
2
title: Apache Hadoop & Visual Studio Data Lake Tools - Azure HDInsight
3
-
description: Learn how to install and use Data Lake Tools for Visual Studio to connect to Apache Hadoop clusters in Azure HDInsight, and then run Hive queries.
3
+
description: Learn how to install and use Data Lake Tools for Visual Studio. Use the tool to connect to Apache Hadoop clusters in Azure HDInsight, and then run Hive queries.
4
4
keywords: hadoop tools,hive query,visual studio,visual studio hadoop
# Use Data Lake Tools for Visual Studio to connect to Azure HDInsight and run Apache Hive queries
15
15
16
-
Learn how to use Microsoft Azure Data Lake and Stream Analytics Tools for Visual Studio (also called Data Lake Tools) to connect to [Apache Hadoop clusters in Azure HDInsight](apache-hadoop-introduction.md) and submit Hive queries.
16
+
Learn how to use Microsoft Azure Data Lake and Stream Analytics Tools for Visual Studio (Data Lake Tools). Use the tool to connect to [Apache Hadoop clusters in Azure HDInsight](apache-hadoop-introduction.md) and submit Hive queries.
17
17
18
18
For more information about using HDInsight, see [Get started with HDInsight](apache-hadoop-linux-tutorial-get-started.md).
19
19
20
-
For more information about connecting to an Apache Storm cluster, see [Develop C# topologies for Apache Storm by using the Data Lake tools for Visual Studio](../storm/apache-storm-develop-csharp-visual-studio-topology.md).
20
+
For more information on connecting to Apache Storm, see [Develop C# topologies for Apache Storm by using the Data Lake tools](../storm/apache-storm-develop-csharp-visual-studio-topology.md).
21
21
22
22
You can use Data Lake Tools for Visual Studio to access Azure Data Lake Analytics and HDInsight. For information about Data Lake Tools, see [Develop U-SQL scripts by using Data Lake Tools for Visual Studio](../../data-lake-analytics/data-lake-analytics-data-lake-tools-get-started.md).
23
23
@@ -33,15 +33,15 @@ To complete this article and use Data Lake Tools for Visual Studio, you need the
33
33
34
34
Follow the appropriate instructions to install Data Lake Tools for your version of Visual Studio:
35
35
36
-
- For Visual Studio 2017 or Visual Studio 2019:
36
+
* For Visual Studio 2017 or Visual Studio 2019:
37
37
38
38
During Visual Studio installation, make sure you include the **Azure development** workload or the **Data storage and processing** workload.
39
39
40
-
For existing Visual Studio installations, go to the IDE menu bar, and select **Tools** > **Get Tools and Features** to open Visual Studio Installer. In the **Workloads** tab, select at least the **Azure development** workload (under **Web & Cloud**) or the **Data storage and processing** workload (under **Other Toolsets**).
40
+
For existing Visual Studio installations, go to the IDE menu bar, and select **Tools** > **Get Tools and Features** to open Visual Studio Installer. In the **Workloads** tab, select at least the **Azure development** workload (under **Web & Cloud**). Or select the **Data storage and processing** workload (under **Other Toolsets**).
41
41
42
42

43
43
44
-
- For Visual Studio 2015:
44
+
* For Visual Studio 2015:
45
45
46
46
[Download Data Lake Tools](https://www.microsoft.com/download/details.aspx?id=49504). Choose the version of Data Lake Tools that matches your version of Visual Studio.
47
47
@@ -91,7 +91,7 @@ To connect to your Azure subscription:
91
91
92
92

93
93
94
-
6. Expand an HDInsight cluster. The cluster contains nodes for **Hive Databases**, a default storage account, any additional linked storage accounts, and **Hadoop Service Log**. You can further expand the entities.
94
+
6. Expand an HDInsight cluster. The cluster contains a **Hive Databases** node. It also contains a default storage account, any additional linked storage accounts, and a **Hadoop Service Log** node. You can further expand the entities.
95
95
96
96
After you've connected to your Azure subscription, you can do the following tasks.
97
97
@@ -105,7 +105,7 @@ To connect to the Azure portal from Visual Studio:
105
105
106
106
### Offer questions and feedback from Visual Studio
107
107
108
-
To ask questions and/or provide feedback from Visual Studio:
108
+
To ask questions or provide feedback from Visual Studio:
109
109
110
110
1. From Server Explorer, choose **Azure** > **HDInsight**.
111
111
@@ -120,7 +120,7 @@ To link an HDInsight cluster:
120
120
121
121
1. Right-click **HDInsight**, and then select **Link a HDInsight Cluster** to display the **Link a HDInsight Cluster** dialog box.
122
122
123
-
2. Enter a **Connection Url** in the form *https\://\<cluster name>.azurehdinsight.net*. The **Cluster Name** automatically fills in with the cluster name portion of your URL when you go to another field. Then enter a **Username** and **Password**, and select **Next**.
123
+
2. Enter a **Connection Url** in the form `https://CLUSTERNAME.azurehdinsight.net`. The **Cluster Name** automatically fills in with the cluster name portion of your URL when you go to another field. Then enter a **Username** and **Password**, and select **Next**.
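
The relationship between the connection URL and the cluster name can be sketched as follows. This is a minimal illustration in Python, not how Data Lake Tools is implemented; the helper name `cluster_name_from_url` and the sample cluster name `mycluster` are hypothetical.

```python
from urllib.parse import urlparse

# Host suffix taken from the URL form shown in this step.
HDINSIGHT_SUFFIX = ".azurehdinsight.net"


def cluster_name_from_url(connection_url: str) -> str:
    """Return the cluster-name portion of an HDInsight connection URL.

    Hypothetical helper for illustration only; it is not part of
    Data Lake Tools or any Azure SDK.
    """
    parsed = urlparse(connection_url)
    host = parsed.hostname or ""  # urlparse lowercases the host name
    if parsed.scheme != "https" or not host.endswith(HDINSIGHT_SUFFIX):
        raise ValueError(f"Not an HDInsight connection URL: {connection_url!r}")
    name = host[: -len(HDINSIGHT_SUFFIX)]
    if not name:
        raise ValueError("The connection URL has no cluster name")
    return name


# Usage: 'mycluster' is a made-up cluster name.
print(cluster_name_from_url("https://mycluster.azurehdinsight.net"))
```

The sketch rejects URLs that don't use `https` or don't end in `.azurehdinsight.net`, because the dialog box described here expects that form.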
124
124
125
125

126
126
@@ -131,6 +131,7 @@ To update a linked cluster, right-click the cluster and select **Edit**. You can
131
131

132
132
133
133
## Explore linked resources
134
+
134
135
From Server Explorer, you can see the default storage account and any linked storage accounts. If you expand the default storage account, you can see the containers on the storage account. The default storage account and the default container are marked.
135
136
136
137

@@ -140,14 +141,15 @@ Right-click a container and select **View Container** to view the container's co
140
141

141
142
142
143
## Run interactive Apache Hive queries
144
+
143
145
[Apache Hive](https://hive.apache.org) is a data warehouse infrastructure that's built on Hadoop. Hive is used for data summarization, queries, and analysis. You can use Data Lake Tools for Visual Studio to run Hive queries from Visual Studio. For more information about Hive, see [What is Apache Hive and HiveQL on Azure HDInsight?](hdinsight-use-hive.md).
144
146
145
147
[Interactive Query in Azure HDInsight](../interactive-query/apache-interactive-query-get-started.md) uses [Hive on LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) in Apache Hive 2.1. Interactive Query brings interactivity to complex, data warehouse-style queries on large, stored datasets. Running Hive queries on Interactive Query is much faster than traditional Hive batch jobs.
146
148
147
149
> [!NOTE]
148
150
> You can run interactive Hive queries only when you connect to an [HDInsight Interactive Query](../interactive-query/apache-interactive-query-get-started.md) cluster.
149
151
150
-
You can also use Data Lake Tools for Visual Studio to see what’s inside a Hive job. Data Lake Tools for Visual Studio collects and surfaces the Yarn logs of certain Hive jobs.
152
+
You can also use Data Lake Tools for Visual Studio to see what's inside a Hive job. Data Lake Tools for Visual Studio collects and surfaces the Yarn logs of certain Hive jobs.
151
153
152
154
From **Server Explorer**, choose **Azure** > **HDInsight** and select your cluster. This node is the starting point in **Server Explorer** for the sections to follow.
153
155
@@ -157,11 +159,11 @@ All HDInsight clusters have a default sample Hive table called `hivesampletable`
157
159
158
160
From your cluster, choose **Hive Databases** > **default** > **hivesampletable**.
159
161
160
-
- To view the `hivesampletable` schema:
162
+
* To view the `hivesampletable` schema:
161
163
162
164
Expand **hivesampletable**. The names and data types of the `hivesampletable` columns are shown.
163
165
164
-
- To view the `hivesampletable` data:
166
+
* To view the `hivesampletable` data:
165
167
166
168
Right-click **hivesampletable**, and select **View Top 100 Rows**. The list of 100 results appears in the **Hive Table: hivesampletable** window. This action is equivalent to running the following Hive query by using the Hive ODBC driver:
167
169
@@ -170,6 +172,7 @@ From your cluster, choose **Hive Databases** > **default** > **hivesampletable**
170
172
You can customize the row count by changing **Number of rows**; you can choose 50, 100, 200, or 1000 rows from the drop-down list.
171
173
172
174
### Create Hive tables
175
+
173
176
To create a Hive table, you can use the GUI or you can use Hive queries. For information about using Hive queries, see [Create and run Hive queries](#create-and-run-hive-queries).
174
177
175
178
1. From your cluster, choose **Hive Databases** > **default**.
@@ -183,6 +186,7 @@ To create a Hive table, you can use the GUI or you can use Hive queries. For inf
@@ -269,7 +273,7 @@ Currently, job graphs are only shown for Hive jobs that use Tez as the execution
269
273
270
274
To view all the operators inside a vertex, double-click that vertex in the job graph. You can also point to a specific operator to see more details about the operator.
271
275
272
-
Even if Tez is specified as the execution engine, the job graph may not appear if no Tez application is launched. This situation might occur because the job doesn't contain DML statements, or because the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1` won't launch the Tez application.
276
+
Even if Tez is specified as the execution engine, the job graph may not appear if no Tez application is launched. This situation might occur because the job doesn't contain DML statements. It might also occur because the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1` won't launch the Tez application.
@@ -284,7 +288,7 @@ From the job graph, you can select **Task Execution Detail** to get structured a
284
288
285
289
You can view job queries, job output, job logs, and Yarn logs for Hive jobs.
286
290
287
-
In the most recent release of the tools, you can see what’s inside your Hive jobs by collecting and surfacing Yarn logs. A Yarn log can help you investigating performance issues. For more information about how HDInsight collects Yarn logs, see [Access Apache Hadoop YARN application logs](../hdinsight-hadoop-access-yarn-app-logs-linux.md).
291
+
In the most recent release of the tools, you can see what's inside your Hive jobs by collecting and surfacing Yarn logs. A Yarn log can help you investigate performance issues. For more information about how HDInsight collects Yarn logs, see [Access Apache Hadoop YARN application logs](../hdinsight-hadoop-access-yarn-app-logs-linux.md).
288
292
289
293
To view Hive jobs:
290
294
@@ -314,11 +318,11 @@ To view Hive jobs:
314
318
315
319
* An issue in which results that start with null values aren't shown has been fixed. If you're blocked on this issue, contact the support team.
316
320
317
-
* The HQL script that Visual Studio creates is encoded, depending on the user’s local region setting. The script doesn't execute correctly if you upload the script to a cluster as a binary file.
321
+
* The HQL script that Visual Studio creates is encoded, depending on the user's local region setting. The script doesn't execute correctly if you upload the script to a cluster as a binary file.
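
One possible workaround for the encoding issue, which this article doesn't confirm, is to re-encode the script before you upload it. The following Python sketch assumes that the cluster expects UTF-8 and that you know the encoding Visual Studio used locally. The helper name `reencode_script`, the file names, and the `cp1252` code page are hypothetical examples.

```python
from pathlib import Path


def reencode_script(src: str, dst: str, source_encoding: str,
                    target_encoding: str = "utf-8") -> None:
    """Copy a script file, converting it from one text encoding to another.

    Hypothetical helper for illustration only. The UTF-8 default is an
    assumption about what the cluster expects, not a documented requirement.
    """
    # Work on raw bytes so that line endings are preserved exactly.
    text = Path(src).read_bytes().decode(source_encoding)
    Path(dst).write_bytes(text.encode(target_encoding))


# Usage (assumed file names and code page):
# reencode_script("query.hql", "query-utf8.hql", source_encoding="cp1252")
```

Decoding fails with a `UnicodeDecodeError` if `source_encoding` doesn't match the file's contents, which helps catch a wrong guess about the local encoding.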
318
322
319
323
## Next steps
320
324
321
-
In this article, you learned how to use the Data Lake Tools for Visual Studio package to connect to HDInsight clusters from Visual Studio. You also learned how to run a Hive query. For more information, see these articles:
325
+
In this article, you learned how to use the Data Lake Tools for Visual Studio package to connect to HDInsight clusters from Visual Studio. You also learned how to run a Hive query.
322
326
323
327
* [Run Apache Hive queries using the Data Lake tools for Visual Studio](apache-hadoop-use-hive-visual-studio.md)
324
328
* [What is Apache Hive and HiveQL on Azure HDInsight?](hdinsight-use-hive.md)