articles/hdinsight/hadoop/apache-hadoop-visual-studio-tools-get-started.md
Lines changed: 21 additions & 14 deletions
@@ -2,14 +2,13 @@
2
2
title: Apache Hadoop & Visual Studio Data Lake Tools - Azure HDInsight
3
3
description: Learn how to install and use Data Lake Tools for Visual Studio to connect to Apache Hadoop clusters in Azure HDInsight, and then run Hive queries.
4
4
keywords: hadoop tools,hive query,visual studio,visual studio hadoop
@@ -53,16 +53,17 @@ To complete this article and use Data Lake Tools for Visual Studio, you need the
53
53
54
54
1. Open Visual Studio.
55
55
56
-
2. From the menu bar, navigate to **Tools** > **Extensions and Updates...**.
56
+
2. From the menu bar, navigate to **Extensions** > **Manage Extensions**.
57
57
58
-
3. From the **Extensions and Updates** window, expand **Updates** on the left.
58
+
3. From the **Manage Extensions** window, expand **Updates** on the left.
59
59
60
60
4. If an update is available, **Azure Data Lake and Stream Analytic Tools** will appear in the main window. Select **Update**.
61
61
62
62
> [!NOTE]
63
63
> You can use only Data Lake Tools version 2.3.0.0 or later to connect to Interactive Query clusters and run interactive Hive queries.
64
64
65
65
## Connect to Azure subscriptions
66
+
66
67
You can use Data Lake Tools for Visual Studio to connect to your HDInsight clusters, perform some basic management operations, and run Hive queries.
67
68
68
69
> [!NOTE]
@@ -88,7 +89,7 @@ To connect to the Azure portal from Visual Studio:
88
89
89
90
1. From Server Explorer, navigate to **Azure** > **HDInsight** and select your cluster.
90
91
91
-
2. Right-click an HDInsight cluster, and select **Manage Cluster in Azure portal[sic]**.
92
+
2. Right-click an HDInsight cluster, and select **Manage Cluster in Azure Portal**.
92
93
93
94
To ask questions and/or provide feedback from Visual Studio:
94
95
@@ -97,6 +98,7 @@ To ask questions and/or provide feedback from Visual Studio:
97
98
2. Right-click **HDInsight** and select either **MSDN Forum** to ask questions, or **Give Feedback** to give feedback.
98
99
99
100
## Link a cluster
101
+
100
102
You can link a cluster by right-clicking **HDInsight** and selecting **Link a HDInsight Cluster**. Enter the **Connection Url**, **user name**, and **password**, select **Next**, and then select **Finish**. The cluster should then be listed under the HDInsight node.
101
103
102
104

@@ -106,6 +108,7 @@ Right click on the linked cluster, select **Edit**, user could update the cluste
106
108

107
109
108
110
## Explore linked resources
111
+
109
112
From Server Explorer, you can see the default storage account and any linked storage accounts. If you expand the default storage account, you can see the containers on the storage account. The default storage account and the default container are marked. Right-click any of the containers to view the container contents.
110
113
111
114

@@ -115,6 +118,7 @@ After opening a container, you can use the following buttons to upload, delete,
115
118

116
119
117
120
## Run interactive Apache Hive queries
121
+
118
122
[Apache Hive](https://hive.apache.org) is a data warehouse infrastructure that's built on Hadoop. Hive is used for data summarization, queries, and analysis. You can use Data Lake Tools for Visual Studio to run Hive queries from Visual Studio. For more information about Hive, see [Use Apache Hive with HDInsight](hdinsight-use-hive.md).
119
123
120
124
[Interactive Query](../interactive-query/apache-interactive-query-get-started.md) uses [Hive on LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) in Apache Hive 2.1. Interactive Query brings interactivity to complex data warehouse-style queries on large, stored datasets. Running Hive queries on Interactive Query is much faster compared to traditional Hive batch jobs.
@@ -127,6 +131,7 @@ You can also use Data Lake Tools for Visual Studio to see what’s inside a Hive
127
131
From Server Explorer, navigate to **Azure** > **HDInsight** and select your cluster. This will be the starting point in Server Explorer for the sections to follow.
128
132
129
133
### View hivesampletable
134
+
130
135
All HDInsight clusters have a default sample Hive table called `hivesampletable`.
131
136
132
137
From your cluster, navigate to **Hive Databases** > **default** > **hivesampletable**.
@@ -144,6 +149,7 @@ Right-click **hivesampletable**, and select **View Top 100 Rows**. This is equi
144
149

145
150
146
151
### Create Hive tables
152
+
147
153
To create a Hive table, you can use the GUI or you can use Hive queries. For information about using Hive queries, see [Run Apache Hive queries](#run.queries).
148
154
149
155
1. From your cluster, navigate to **Hive Databases** > **default**.
@@ -157,6 +163,7 @@ To create a Hive table, you can use the GUI or you can use Hive queries. For inf
157
163

158
164
159
165
### <a name="run.queries"></a>Create and run Hive queries
166
+
160
167
You have two options for creating and running Hive queries:
161
168
162
169
* Create ad-hoc queries
@@ -235,7 +242,7 @@ Currently, job graphs are only shown for Hive jobs that use Tez as the execution
235
242
236
243
To view all the operators inside the vertex, double-click on the vertices of the job graph. You can also point to a specific operator to see more details about the operator.
237
244
238
-
The job graph may not appear even if Tez is specified as the execution engine if no Tez application is launched. This might happen because the job does not contain DML statements, or the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1`will not launch the Tez application.
245
+
The job graph may not appear even if Tez is specified as the execution engine if no Tez application is launched. This might happen because the job doesn't contain DML statements, or the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1` won't launch the Tez application.
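
As a rough illustration of the distinction above (a sketch, not taken from the article): assuming the default `hivesampletable` exposes a `country` column, the first query below is a plain fetch that Hive can usually answer without launching a Tez application, while the second involves an aggregation and would typically launch one. Whether a simple fetch is served directly also depends on cluster settings such as `hive.fetch.task.conversion`, so treat the comments as expectations rather than guarantees.

```hiveql
-- Assumption: Tez is selected explicitly; it may already be the cluster default.
set hive.execution.engine=tez;

-- Simple fetch: usually served directly, so no Tez application (and no job graph) is expected.
SELECT * FROM hivesampletable LIMIT 10;

-- Aggregation: typically requires a Tez application, so a job graph would be expected.
SELECT country, COUNT(*) AS query_count
FROM hivesampletable
GROUP BY country;
```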
239
246
240
247

241
248
@@ -246,8 +253,8 @@ From the job graph, you can select **Task Execution Detail** to get structured a
246
253
247
254

248
255
249
-
250
256
### View Hive jobs
257
+
251
258
You can view job queries, job output, job logs, and Yarn logs for Hive jobs.
252
259
253
260
In the most recent release of the tools, you can see what’s inside your Hive jobs by collecting and surfacing Yarn logs. A Yarn log can help you investigate performance issues. For more information about how HDInsight collects Yarn logs, see [Access HDInsight application logs programmatically](../hdinsight-hadoop-access-yarn-app-logs.md).
@@ -257,14 +264,13 @@ To view Hive jobs:
257
264
1. Right-click an HDInsight cluster, and select **View Jobs**. A list of the Hive jobs that ran on the cluster appears.
258
265
259
266
2. Select a job. In the **Hive Job Summary** window, select one of the following:
260
-
- **Job Query**
261
-
- **Job Output**
262
-
- **Job Log**
263
-
- **Yarn log**
267
+
* **Job Query**
268
+
* **Job Output**
269
+
* **Job Log**
270
+
* **Yarn log**
264
271
265
272

266
273
267
-
268
274
## Run Apache Pig scripts
269
275
270
276
1. From the menu bar, navigate to **File** > **New** > **Project...**.
@@ -276,15 +282,16 @@ To view Hive jobs:
276
282
4. In **Solution Explorer**, double-click **Script.pig** to open the script.
277
283
278
284
## Feedback and known issues
285
+
279
286
* An issue in which results that start with null values weren't shown has been fixed. If you're blocked on this issue, contact the support team.
280
287
* The HQL script that Visual Studio creates is encoded, depending on the user’s local region setting. The script doesn't execute correctly if you upload the script to a cluster as a binary file.
281
288
282
289
## Next steps
290
+
283
291
In this article, you learned how to use the Data Lake Tools for Visual Studio package to connect to HDInsight clusters from Visual Studio. You also learned how to run a Hive query. For more information, see these articles:
284
292
285
293
* [Run Apache Hive queries using the Data Lake tools for Visual Studio](apache-hadoop-use-hive-visual-studio.md)
286
294
* [Use Hadoop Hive in HDInsight](hdinsight-use-hive.md)
287
295
* [Get started using Apache Hadoop in HDInsight](apache-hadoop-linux-tutorial-get-started.md)
288
296
* [Submit Apache Hadoop jobs in HDInsight](submit-apache-hadoop-jobs-programmatically.md)
289
297
* [Analyze Twitter data with Apache Hadoop in HDInsight](../hdinsight-analyze-twitter-data.md)