Commit d81d862

Merge pull request #94012 from dagiro/freshness38
freshness38
2 parents 60dfbbd + f7d6940 commit d81d862

1 file changed

+21
-14
lines changed

articles/hdinsight/hadoop/apache-hadoop-visual-studio-tools-get-started.md

Lines changed: 21 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -2,14 +2,13 @@
22
title: Apache Hadoop & Visual Studio Data Lake Tools - Azure HDInsight
33
description: Learn how to install and use Data Lake Tools for Visual Studio to connect to Apache Hadoop clusters in Azure HDInsight, and then run Hive queries.
44
keywords: hadoop tools,hive query,visual studio,visual studio hadoop
5-
services: hdinsight
65
author: hrasheed-msft
76
ms.author: hrasheed
87
ms.reviewer: jasonh
98
ms.service: hdinsight
109
ms.custom: hdinsightactive,hdiseo17may2017,seodec18
1110
ms.topic: conceptual
12-
ms.date: 06/03/2019
11+
ms.date: 10/29/2019
1312
---
1413

1514
# Use Data Lake Tools for Visual Studio to connect to Azure HDInsight and run Apache Hive queries
@@ -33,7 +32,8 @@ To complete this article and use Data Lake Tools for Visual Studio, you need the
3332
> [!IMPORTANT]
3433
> Data Lake Tools is no longer supported for Visual Studio 2013.
3534
36-
## Install Data Lake Tools for Visual Studio
35+
## Install Data Lake Tools for Visual Studio
36+
3737
<a name="install-or-update-data-lake-tools-for-visual-studio"></a>
3838

3939
* Visual Studio 2017 or Visual Studio 2019
@@ -53,16 +53,17 @@ To complete this article and use Data Lake Tools for Visual Studio, you need the
5353

5454
1. Open Visual Studio.
5555

56-
2. From the menu bar, navigate to **Tools** > **Extensions and Updates...**.
56+
2. From the menu bar, navigate to **Extensions** > **Manage Extensions**.
5757

58-
3. From the **Extensions and Updates** window, expand **Updates** on the left.
58+
3. From the **Manage Extensions** window, expand **Updates** on the left.
5959

6060
4. If an update is available, **Azure Data Lake and Stream Analytics Tools** will appear in the main window. Select **Update**.
6161

6262
> [!NOTE]
6363
> You can use only Data Lake Tools version 2.3.0.0 or later to connect to Interactive Query clusters and run interactive Hive queries.
6464
6565
## Connect to Azure subscriptions
66+
6667
You can use Data Lake Tools for Visual Studio to connect to your HDInsight clusters, perform some basic management operations, and run Hive queries.
6768

6869
> [!NOTE]
@@ -88,7 +89,7 @@ To connect to the Azure portal from Visual Studio:
8889

8990
1. From Server Explorer, navigate to **Azure** > **HDInsight** and select your cluster.
9091

91-
2. Right-click an HDInsight cluster, and select **Manage Cluster in Azure portal**.
92+
2. Right-click an HDInsight cluster, and select **Manage Cluster in Azure Portal**.
9293

9394
To ask questions and/or provide feedback from Visual Studio:
9495

@@ -97,6 +98,7 @@ To ask questions and/or provide feedback from Visual Studio:
9798
2. Right-click **HDInsight** and select either **MSDN Forum** to ask questions, or **Give Feedback** to give feedback.
9899

99100
## Link a cluster
101+
100102
You can link a cluster by right-clicking **HDInsight** and selecting **Link a HDInsight Cluster**. Enter the **Connection Url**, **user name**, and **password**, select **Next**, and then select **Finish**. If the link succeeds, the cluster is listed under the HDInsight node.
101103

102104
![Screenshot of Data Lake Tools for Visual Studio link cluster dialog](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-link-cluster-dialog.png)
@@ -106,6 +108,7 @@ Right click on the linked cluster, select **Edit**, user could update the cluste
106108
![Screenshot of Data Lake Tools for Visual Studio link cluster update](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-link-cluster-update.png)
107109

108110
## Explore linked resources
111+
109112
From Server Explorer, you can see the default storage account and any linked storage accounts. If you expand the default storage account, you can see the containers on the storage account. The default storage account and the default container are marked. Right-click any of the containers to view the container contents.
110113

111114
![Data Lake Tools for Visual Studio linked resources in Server Explorer](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-linked-resources.png "List linked resources")
@@ -115,6 +118,7 @@ After opening a container, you can use the following buttons to upload, delete,
115118
![Data Lake Tools for Visual Studio blob operations in Server Explorer](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-blob-operations.png "Upload, delete, and download blobs in Server Explorer")
116119

117120
## Run interactive Apache Hive queries
121+
118122
[Apache Hive](https://hive.apache.org) is a data warehouse infrastructure that's built on Hadoop. Hive is used for data summarization, queries, and analysis. You can use Data Lake Tools for Visual Studio to run Hive queries from Visual Studio. For more information about Hive, see [Use Apache Hive with HDInsight](hdinsight-use-hive.md).
119123

120124
[Interactive Query](../interactive-query/apache-interactive-query-get-started.md) uses [Hive on LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) in Apache Hive 2.1. Interactive Query brings interactivity to complex data warehouse-style queries on large, stored datasets. Running Hive queries on Interactive Query is much faster than running traditional Hive batch jobs.
@@ -127,6 +131,7 @@ You can also use Data Lake Tools for Visual Studio to see what’s inside a Hive
127131
From Server Explorer, navigate to **Azure** > **HDInsight** and select your cluster. This will be the starting point in Server Explorer for the sections to follow.
128132

129133
### View hivesampletable
134+
130135
All HDInsight clusters have a default sample Hive table called `hivesampletable`.
131136

132137
From your cluster, navigate to **Hive Databases** > **default** > **hivesampletable**.
@@ -144,6 +149,7 @@ Right-click **hivesampletable**, and select **View Top 100 Rows**. This is equi
144149
![Screenshot of an HDInsight Hive Visual Studio schema query](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-hive-schema.png "Hive query results")
145150

146151
### Create Hive tables
152+
147153
To create a Hive table, you can use the GUI or you can use Hive queries. For information about using Hive queries, see [Run Apache Hive queries](#run.queries).
148154

149155
1. From your cluster, navigate to **Hive Databases** > **default**.
@@ -157,6 +163,7 @@ To create a Hive table, you can use the GUI or you can use Hive queries. For inf
157163
![Screenshot of the HDInsight Visual Studio Tools Create Table window](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-create-hive-table.png "Create Hive table")
158164

159165
### <a name="run.queries"></a>Create and run Hive queries
166+
160167
You have two options for creating and running Hive queries:
161168

162169
* Create ad-hoc queries
@@ -235,7 +242,7 @@ Currently, job graphs are only shown for Hive jobs that use Tez as the execution
235242

236243
To view all the operators inside the vertex, double-click on the vertices of the job graph. You can also point to a specific operator to see more details about the operator.
237244

238-
The job graph may not appear even if Tez is specified as the execution engine if no Tez application is launched. This might happen because the job does not contain DML statements, or the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1` will not launch the Tez application.
245+
The job graph may not appear even if Tez is specified as the execution engine if no Tez application is launched. This might happen because the job doesn't contain DML statements, or the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1` won't launch the Tez application.
239246

240247
![Visual Studio Apache Hive job graph](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-fast-path-hive-execution.png "Hive job summary")
241248

@@ -246,8 +253,8 @@ From the job graph, you can select **Task Execution Detail** to get structured a
246253

247254
![Data Lake Visual Studio Tools Task Execution View window](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-task-execution-view.png "Task Execution View")
248255

249-
250256
### View Hive jobs
257+
251258
You can view job queries, job output, job logs, and Yarn logs for Hive jobs.
252259

253260
In the most recent release of the tools, you can see what’s inside your Hive jobs by collecting and surfacing Yarn logs. A Yarn log can help you investigate performance issues. For more information about how HDInsight collects Yarn logs, see [Access HDInsight application logs programmatically](../hdinsight-hadoop-access-yarn-app-logs.md).
@@ -257,14 +264,13 @@ To view Hive jobs:
257264
1. Right-click an HDInsight cluster, and select **View Jobs**. A list of the Hive jobs that ran on the cluster appears.
258265

259266
2. Select a job. In the **Hive Job Summary** window, select one of the following:
260-
- **Job Query**
261-
- **Job Output**
262-
- **Job Log**
263-
- **Yarn log**
267+
* **Job Query**
268+
* **Job Output**
269+
* **Job Log**
270+
* **Yarn log**
264271

265272
![Screenshot of the HDInsight Visual Studio Tools View Hive Jobs window](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-view-hive-jobs.png "View Hive jobs")
266273

267-
268274
## Run Apache Pig scripts
269275

270276
1. From the menu bar, navigate to **File** > **New** > **Project...**.
@@ -276,15 +282,16 @@ To view Hive jobs:
276282
4. In **Solution Explorer**, double-click **Script.pig** to open the script.
277283

278284
## Feedback and known issues
285+
279286
* An issue in which results that start with null values aren't shown has been fixed. If you're blocked on this issue, contact the support team.
280287
* The HQL script that Visual Studio creates is encoded according to the user’s local region setting. The script doesn't execute correctly if you upload the script to a cluster as a binary file.
281288
282289
## Next steps
290+
283291
In this article, you learned how to use the Data Lake Tools for Visual Studio package to connect to HDInsight clusters from Visual Studio. You also learned how to run a Hive query. For more information, see these articles:
284292
285293
* [Run Apache Hive queries using the Data Lake tools for Visual Studio](apache-hadoop-use-hive-visual-studio.md)
286294
* [Use Hadoop Hive in HDInsight](hdinsight-use-hive.md)
287295
* [Get started using Apache Hadoop in HDInsight](apache-hadoop-linux-tutorial-get-started.md)
288296
* [Submit Apache Hadoop jobs in HDInsight](submit-apache-hadoop-jobs-programmatically.md)
289297
* [Analyze Twitter data with Apache Hadoop in HDInsight](../hdinsight-analyze-twitter-data.md)
290-
