Commit 7c85d05

Merge pull request #111358 from dagiro/freshness52
freshness52
2 parents 8a7287f + e68eb1d commit 7c85d05

1 file changed

+22
-18
lines changed

articles/hdinsight/hadoop/apache-hadoop-visual-studio-tools-get-started.md

Lines changed: 22 additions & 18 deletions
@@ -1,23 +1,23 @@
11
---
22
title: Apache Hadoop & Visual Studio Data Lake Tools - Azure HDInsight
3-
description: Learn how to install and use Data Lake Tools for Visual Studio to connect to Apache Hadoop clusters in Azure HDInsight, and then run Hive queries.
3+
description: Learn how to install and use Data Lake Tools for Visual Studio. Use the tool to connect to Apache Hadoop clusters in Azure HDInsight, and then run Hive queries.
44
keywords: hadoop tools,hive query,visual studio,visual studio hadoop
55
author: hrasheed-msft
66
ms.author: hrasheed
77
ms.reviewer: jasonh
88
ms.service: hdinsight
99
ms.custom: hdinsightactive,hdiseo17may2017,seodec18
1010
ms.topic: conceptual
11-
ms.date: 10/29/2019
11+
ms.date: 04/14/2020
1212
---
1313

1414
# Use Data Lake Tools for Visual Studio to connect to Azure HDInsight and run Apache Hive queries
1515

16-
Learn how to use Microsoft Azure Data Lake and Stream Analytics Tools for Visual Studio (also called Data Lake Tools) to connect to [Apache Hadoop clusters in Azure HDInsight](apache-hadoop-introduction.md) and submit Hive queries.
16+
Learn how to use Microsoft Azure Data Lake and Stream Analytics Tools for Visual Studio (Data Lake Tools). Use the tool to connect to [Apache Hadoop clusters in Azure HDInsight](apache-hadoop-introduction.md) and submit Hive queries.
1717

1818
For more information about using HDInsight, see [Get started with HDInsight](apache-hadoop-linux-tutorial-get-started.md).
1919

20-
For more information about connecting to an Apache Storm cluster, see [Develop C# topologies for Apache Storm by using the Data Lake tools for Visual Studio](../storm/apache-storm-develop-csharp-visual-studio-topology.md).
20+
For more information on connecting to Apache Storm, see [Develop C# topologies for Apache Storm by using the Data Lake tools](../storm/apache-storm-develop-csharp-visual-studio-topology.md).
2121

2222
You can use Data Lake Tools for Visual Studio to access Azure Data Lake Analytics and HDInsight. For information about Data Lake Tools, see [Develop U-SQL scripts by using Data Lake Tools for Visual Studio](../../data-lake-analytics/data-lake-analytics-data-lake-tools-get-started.md).
2323

@@ -33,15 +33,15 @@ To complete this article and use Data Lake Tools for Visual Studio, you need the
3333

3434
Follow the appropriate instructions to install Data Lake Tools for your version of Visual Studio:
3535

36-
- For Visual Studio 2017 or Visual Studio 2019:
36+
* For Visual Studio 2017 or Visual Studio 2019:
3737

3838
During Visual Studio installation, make sure you include the **Azure development** workload or the **Data storage and processing** workload.
3939

40-
For existing Visual Studio installations, go to the IDE menu bar, and select **Tools** > **Get Tools and Features** to open Visual Studio Installer. In the **Workloads** tab, select at least the **Azure development** workload (under **Web & Cloud**) or the **Data storage and processing** workload (under **Other Toolsets**).
40+
For existing Visual Studio installations, go to the IDE menu bar, and select **Tools** > **Get Tools and Features** to open Visual Studio Installer. In the **Workloads** tab, select at least the **Azure development** workload (under **Web & Cloud**). Or select the **Data storage and processing** workload (under **Other Toolsets**).
4141

4242
![Workload selection, Visual Studio Installer](./media/apache-hadoop-visual-studio-tools-get-started/vs-installation.png)
4343

44-
- For Visual Studio 2015:
44+
* For Visual Studio 2015:
4545

4646
[Download Data Lake Tools](https://www.microsoft.com/download/details.aspx?id=49504). Choose the version of Data Lake Tools that matches your version of Visual Studio.
4747

@@ -91,7 +91,7 @@ To connect to your Azure subscription:
9191

9292
![HDInsight cluster list, Server Explorer, Visual Studio](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-server-explorer.png)
9393

94-
6. Expand an HDInsight cluster. The cluster contains nodes for **Hive Databases**, a default storage account, any additional linked storage accounts, and **Hadoop Service Log**. You can further expand the entities.
94+
6. Expand an HDInsight cluster. The cluster contains a node for **Hive Databases**. It also contains nodes for a default storage account, any additional linked storage accounts, and **Hadoop Service Log**. You can further expand the entities.
9595

9696
After you've connected to your Azure subscription, you can do the following tasks.
9797

@@ -105,7 +105,7 @@ To connect to the Azure portal from Visual Studio:
105105

106106
### Offer questions and feedback from Visual Studio
107107

108-
To ask questions and/or provide feedback from Visual Studio:
108+
To ask questions or provide feedback from Visual Studio:
109109

110110
1. From Server Explorer, choose **Azure** > **HDInsight**.
111111

@@ -120,7 +120,7 @@ To link an HDInsight cluster:
120120

121121
1. Right-click **HDInsight**, and then select **Link a HDInsight Cluster** to display the **Link a HDInsight Cluster** dialog box.
122122

123-
2. Enter a **Connection Url** in the form *https\://\<cluster&nbsp;name>.azurehdinsight.net*. The **Cluster Name** automatically fills in with the cluster name portion of your URL when you go to another field. Then enter a **Username** and **Password**, and select **Next**.
123+
2. Enter a **Connection Url** in the form `https://CLUSTERNAME.azurehdinsight.net`. The **Cluster Name** automatically fills in with the cluster name portion of your URL when you go to another field. Then enter a **Username** and **Password**, and select **Next**.
124124

125125
![Link a cluster, HDInsight, Visual Studio](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-link-cluster-dialog.png)
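As a rough illustration of how the **Cluster Name** relates to the **Connection Url**, the following Python sketch pulls the cluster-name portion out of a URL in the form `https://CLUSTERNAME.azurehdinsight.net`. The function name `cluster_name_from_url` and the validation rules are assumptions made for this example; they aren't taken from Data Lake Tools.

```python
from urllib.parse import urlparse

# Host suffix taken from the connection URL form shown in the step above.
HDINSIGHT_SUFFIX = ".azurehdinsight.net"


def cluster_name_from_url(connection_url: str) -> str:
    """Return the cluster-name portion of an HDInsight connection URL.

    Hypothetical helper: it only mirrors the documented URL form and
    is not how the dialog box is implemented.
    """
    parsed = urlparse(connection_url)
    host = parsed.hostname or ""
    if parsed.scheme != "https" or not host.endswith(HDINSIGHT_SUFFIX):
        raise ValueError(f"Not an HDInsight connection URL: {connection_url!r}")
    # Everything before the fixed suffix is the cluster name.
    name = host[: -len(HDINSIGHT_SUFFIX)]
    if not name:
        raise ValueError("Connection URL has no cluster name")
    return name
```

For example, `cluster_name_from_url("https://mycluster.azurehdinsight.net")` returns `mycluster`, where `mycluster` is a placeholder name.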
126126

@@ -131,6 +131,7 @@ To update a linked cluster, right-click the cluster and select **Edit**. You can
131131
![Edit a linked cluster, HDInsight, Visual Studio](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-link-cluster-update.png)
132132

133133
## Explore linked resources
134+
134135
From Server Explorer, you can see the default storage account and any linked storage accounts. If you expand the default storage account, you can see the containers on the storage account. The default storage account and the default container are marked.
135136

136137
![Data Lake Tools for Visual Studio linked resources in Server Explorer](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-linked-resources.png)
@@ -140,14 +141,15 @@ Right-click a container and select **View Container** to view the container's co
140141
![Container list and blob operations, HDInsight cluster, Visual Studio](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-blob-operations.png)
141142

142143
## Run interactive Apache Hive queries
144+
143145
[Apache Hive](https://hive.apache.org) is a data warehouse infrastructure that's built on Hadoop. Hive is used for data summarization, queries, and analysis. You can use Data Lake Tools for Visual Studio to run Hive queries from Visual Studio. For more information about Hive, see [What is Apache Hive and HiveQL on Azure HDInsight?](hdinsight-use-hive.md).
144146

145147
[Interactive Query in Azure HDInsight](../interactive-query/apache-interactive-query-get-started.md) uses [Hive on LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) in Apache Hive 2.1. Interactive Query brings interactivity to complex, data warehouse-style queries on large, stored datasets. Running Hive queries on Interactive Query is much faster than traditional Hive batch jobs.
146148

147149
> [!NOTE]
148150
> You can run interactive Hive queries only when you connect to an [HDInsight Interactive Query](../interactive-query/apache-interactive-query-get-started.md) cluster.
149151
150-
You can also use Data Lake Tools for Visual Studio to see whats inside a Hive job. Data Lake Tools for Visual Studio collects and surfaces the Yarn logs of certain Hive jobs.
152+
You can also use Data Lake Tools for Visual Studio to see what's inside a Hive job. Data Lake Tools for Visual Studio collects and surfaces the Yarn logs of certain Hive jobs.
151153

152154
From **Server Explorer**, choose **Azure** > **HDInsight** and select your cluster. This node is the starting point in **Server Explorer** for the sections to follow.
153155

@@ -157,11 +159,11 @@ All HDInsight clusters have a default sample Hive table called `hivesampletable`
157159

158160
From your cluster, choose **Hive Databases** > **default** > **hivesampletable**.
159161

160-
- To view the `hivesampletable` schema:
162+
* To view the `hivesampletable` schema:
161163

162164
Expand **hivesampletable**. The names and data types of the `hivesampletable` columns are shown.
163165

164-
- To view the `hivesampletable` data:
166+
* To view the `hivesampletable` data:
165167

166168
Right-click **hivesampletable**, and select **View Top 100 Rows**. The list of 100 results appears in the **Hive Table: hivesampletable** window. This action is equivalent to running the following Hive query by using the Hive ODBC driver:
167169

@@ -170,6 +172,7 @@ From your cluster, choose **Hive Databases** > **default** > **hivesampletable**
170172
You can customize the row count by changing **Number of rows**; you can choose 50, 100, 200, or 1000 rows from the drop-down list.
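The row-count choice can be pictured as a parameter of the query that the tool runs. The following Python sketch builds such a query string. The query text isn't visible in this part of the diff, so the `SELECT * ... LIMIT` form, the function name `top_rows_query`, and the validation are assumptions for illustration only.

```python
# Row counts offered in the drop-down list, as described above.
ALLOWED_ROW_COUNTS = (50, 100, 200, 1000)


def top_rows_query(table: str = "hivesampletable", rows: int = 100) -> str:
    """Build a query that returns the first `rows` rows of `table`.

    Hypothetical helper: the LIMIT-based query shape is an assumption.
    """
    if rows not in ALLOWED_ROW_COUNTS:
        raise ValueError(f"rows must be one of {ALLOWED_ROW_COUNTS}, got {rows}")
    # Accept only simple names so the string is safe to interpolate.
    if not table.isidentifier():
        raise ValueError(f"Unexpected table name: {table!r}")
    return f"SELECT * FROM {table} LIMIT {rows}"
```

Restricting `rows` to the listed values keeps the sketch aligned with the drop-down list rather than accepting arbitrary limits.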
171173

172174
### Create Hive tables
175+
173176
To create a Hive table, you can use the GUI or you can use Hive queries. For information about using Hive queries, see [Create and run Hive queries](#create-and-run-hive-queries).
174177

175178
1. From your cluster, choose **Hive Databases** > **default**.
@@ -183,6 +186,7 @@ To create a Hive table, you can use the GUI or you can use Hive queries. For inf
183186
![Create Table window, Hive, HDInsight cluster, Visual Studio](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-create-hive-table.png)
184187

185188
### Create and run Hive queries
189+
186190
You have two options for creating and running Hive queries:
187191

188192
* Create ad-hoc queries
@@ -224,7 +228,7 @@ To create and run an ad-hoc query:
224228

225229
* **Batch**
226230

227-
In the first drop-down list, choose **Batch**, and then select **Submit** (or select the drop-down icon next to **Submit** and choose **Advanced**).
231+
In the first drop-down list, choose **Batch**, and then select **Submit**. Or select the drop-down icon next to **Submit** and choose **Advanced**.
228232

229233
![Batch mode, Hive ad-hoc query, HDInsight cluster, Visual Studio](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-query-batch.png)
230234

@@ -269,7 +273,7 @@ Currently, job graphs are only shown for Hive jobs that use Tez as the execution
269273
270274
To view all the operators inside the vertex, double-click the vertices of the job graph. You can also point to a specific operator to see more details about the operator.
271275
272-
Even if Tez is specified as the execution engine, the job graph may not appear if no Tez application is launched. This situation might occur because the job doesn't contain DML statements, or because the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1` won't launch the Tez application.
276+
Even if Tez is specified as the execution engine, the job graph may not appear if no Tez application is launched. This situation might occur because the job doesn't contain DML statements. It can also occur because the DML statements return without launching a Tez application. For example, `SELECT * FROM table1` won't launch the Tez application.
273277
274278
![Apache Hive job graph, Visual Studio](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-fast-path-hive-execution.png)
275279
@@ -284,7 +288,7 @@ From the job graph, you can select **Task Execution Detail** to get structured a
284288
285289
You can view job queries, job output, job logs, and Yarn logs for Hive jobs.
286290
287-
In the most recent release of the tools, you can see whats inside your Hive jobs by collecting and surfacing Yarn logs. A Yarn log can help you investigating performance issues. For more information about how HDInsight collects Yarn logs, see [Access Apache Hadoop YARN application logs](../hdinsight-hadoop-access-yarn-app-logs-linux.md).
291+
In the most recent release of the tools, you can see what's inside your Hive jobs by collecting and surfacing Yarn logs. A Yarn log can help you investigate performance issues. For more information about how HDInsight collects Yarn logs, see [Access Apache Hadoop YARN application logs](../hdinsight-hadoop-access-yarn-app-logs-linux.md).
288292

289293
To view Hive jobs:
290294

@@ -314,11 +318,11 @@ To view Hive jobs:
314318

315319
* An issue in which results that start with null values weren't shown has been fixed. If you're blocked on this issue, contact the support team.
316320

317-
* The HQL script that Visual Studio creates is encoded, depending on the users local region setting. The script doesn't execute correctly if you upload the script to a cluster as a binary file.
321+
* The HQL script that Visual Studio creates is encoded according to the user's local region setting. The script doesn't execute correctly if you upload the script to a cluster as a binary file.
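
One possible way to work around the encoding issue is to re-encode the script before you upload it. The following Python sketch rewrites a script file as UTF-8. It assumes that you know the encoding the script was saved with (for example, `cp1252`) and that the cluster reads the script as UTF-8. Neither assumption comes from this article, and the function name `reencode_script` is hypothetical.

```python
from pathlib import Path


def reencode_script(src: Path, dst: Path, source_encoding: str) -> None:
    """Rewrite the script at `src` as UTF-8 text at `dst`.

    `source_encoding` is whatever encoding the script was saved with;
    this sketch doesn't try to detect it.
    """
    # Text mode decodes the bytes and normalizes line endings to "\n".
    text = src.read_text(encoding=source_encoding)
    # Write raw bytes so nothing re-translates the line endings.
    dst.write_bytes(text.encode("utf-8"))
```

A call such as `reencode_script(Path("query.hql"), Path("query-utf8.hql"), "cp1252")` leaves the original file untouched; both file names are placeholders.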
318322

319323
## Next steps
320324

321-
In this article, you learned how to use the Data Lake Tools for Visual Studio package to connect to HDInsight clusters from Visual Studio. You also learned how to run a Hive query. For more information, see these articles:
325+
In this article, you learned how to use the Data Lake Tools for Visual Studio package to connect to HDInsight clusters from Visual Studio. You also learned how to run a Hive query.
322326

323327
* [Run Apache Hive queries using the Data Lake tools for Visual Studio](apache-hadoop-use-hive-visual-studio.md)
324328
* [What is Apache Hive and HiveQL on Azure HDInsight?](hdinsight-use-hive.md)
