
Commit 613a677

2 parents 9c07576 + d81d862 commit 613a677

File tree

4 files changed

+59
-40
lines changed

articles/hdinsight/hadoop/apache-hadoop-use-hive-ambari-view.md

Lines changed: 11 additions & 12 deletions
@@ -2,13 +2,12 @@
22
title: Use Apache Ambari Hive View with Apache Hadoop in Azure HDInsight
33
description: Learn how to use the Hive View from your web browser to submit Hive queries. The Hive View is part of the Ambari Web UI provided with your Linux-based HDInsight cluster.
44
author: hrasheed-msft
5+
ms.author: hrasheed
56
ms.reviewer: jasonh
6-
77
ms.service: hdinsight
88
ms.custom: hdinsightactive
99
ms.topic: conceptual
10-
ms.date: 03/21/2019
11-
ms.author: hrasheed
10+
ms.date: 10/24/2019
1211
---
1312

1413
# Use Apache Ambari Hive View with Apache Hadoop in HDInsight
@@ -26,17 +25,17 @@ Learn how to run Hive queries by using Apache Ambari Hive View. The Hive View al
2625

2726
1. From the [Azure portal](https://portal.azure.com/), select your cluster. See [List and show clusters](../hdinsight-administer-use-portal-linux.md#showClusters) for instructions. The cluster is opened in a new portal blade.
2827

29-
2. From **Cluster dashboards**, select **Ambari views**. When prompted to authenticate, use the cluster login (default `admin`) account name and password that you provided when you created the cluster.
28+
1. From **Cluster dashboards**, select **Ambari views**. When prompted to authenticate, use the cluster login (default `admin`) account name and password that you provided when you created the cluster. Alternatively, navigate to `https://CLUSTERNAME.azurehdinsight.net/#/main/views` in your browser where `CLUSTERNAME` is the name of your cluster.
3029

31-
3. From the list of views, select __Hive View__.
30+
1. From the list of views, select __Hive View__.
3231

3332
![Apache Ambari select Apache Hive view](./media/apache-hadoop-use-hive-ambari-view/select-apache-hive-view.png)
3433

3534
The Hive view page is similar to the following image:
3635

3736
![Image of the query worksheet for the Hive view](./media/apache-hadoop-use-hive-ambari-view/ambari-worksheet-view.png)
3837

39-
4. From the __Query__ tab, paste the following HiveQL statements into the worksheet:
38+
1. From the __Query__ tab, paste the following HiveQL statements into the worksheet:
4039

4140
```hiveql
4241
DROP TABLE log4jLogs;
@@ -50,8 +49,8 @@ Learn how to run Hive queries by using Apache Ambari Hive View. The Hive View al
5049
t7 string)
5150
ROW FORMAT DELIMITED FIELDS TERMINATED BY ' '
5251
STORED AS TEXTFILE LOCATION '/example/data/';
53-
SELECT t4 AS loglevel, COUNT(*) AS count FROM log4jLogs
54-
WHERE t4 = '[ERROR]'
52+
SELECT t4 AS loglevel, COUNT(*) AS count FROM log4jLogs
53+
WHERE t4 = '[ERROR]'
5554
GROUP BY t4;
5655
```
5756
@@ -71,9 +70,9 @@ Learn how to run Hive queries by using Apache Ambari Hive View. The Hive View al
7170
> [!IMPORTANT]
7271
> Leave the __Database__ selection at __default__. The examples in this document use the default database included with HDInsight.
7372
74-
5. To start the query, select **Execute** below the worksheet. The button turns orange and the text changes to **Stop**.
73+
1. To start the query, select **Execute** below the worksheet. The button turns orange and the text changes to **Stop**.
7574
76-
6. After the query has finished, the **Results** tab displays the results of the operation. The following text is the result of the query:
75+
1. After the query has finished, the **Results** tab displays the results of the operation. The following text is the result of the query:
7776
7877
loglevel count
7978
[ERROR] 3
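As a rough cross-check of what the HiveQL statement above computes, the same filter-and-count can be sketched in Python. This is only an illustration: the sample lines and the `count_log_level` helper are hypothetical and are not read from `/example/data/`. The sketch assumes each log line is split on single spaces, so the fourth field plays the role of column `t4`.

```python
from collections import Counter

# Hypothetical, hand-written log lines in a space-delimited layout.
# They stand in for the files under /example/data/ and are NOT real cluster data.
SAMPLE_LINES = [
    "2019-10-24 10:00:01 ClassA [INFO] started",
    "2019-10-24 10:00:02 ClassB [ERROR] bad id 1",
    "2019-10-24 10:00:03 ClassC [WARN] slow response",
    "2019-10-24 10:00:04 ClassB [ERROR] bad id 2",
    "short line",
]


def count_log_level(lines, level):
    """Count rows whose fourth space-delimited field equals `level`.

    Mirrors: SELECT t4, COUNT(*) FROM log4jLogs WHERE t4 = level GROUP BY t4
    """
    counts = Counter()
    for line in lines:
        fields = line.split(" ")
        # Rows with fewer than four fields have no t4 value and never match.
        if len(fields) >= 4 and fields[3] == level:
            counts[fields[3]] += 1
    return dict(counts)


if __name__ == "__main__":
    for loglevel, count in count_log_level(SAMPLE_LINES, "[ERROR]").items():
        print(loglevel, count)
```

With the hypothetical sample above the helper reports two `[ERROR]` rows; the count of 3 shown in the article comes from the sample data on the cluster, not from this sketch.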
@@ -129,7 +128,7 @@ Declare and save a set of UDFs by using the **UDF** tab at the top of the Hive V
129128
130129
After you've added a UDF to the Hive View, an **Insert udfs** button appears at the bottom of the **Query Editor**. Selecting this entry displays a drop-down list of the UDFs defined in the Hive View. Selecting a UDF adds HiveQL statements to your query to enable the UDF.
131130
132-
For example, if you have defined a UDF with the following properties:
131+
For example, if you've defined a UDF with the following properties:
133132
134133
* Resource name: myudfs
135134
@@ -139,7 +138,7 @@ For example, if you have defined a UDF with the following properties:
139138
140139
* UDF class name: com.myudfs.Awesome
141140
142-
Using the **Insert udfs** button displays an entry named **myudfs**, with another drop-down list for each UDF defined for that resource. In this case, it is **myawesomeudf**. Selecting this entry adds the following to the beginning of the query:
141+
Using the **Insert udfs** button displays an entry named **myudfs**, with another drop-down list for each UDF defined for that resource. In this case, it's **myawesomeudf**. Selecting this entry adds the following to the beginning of the query:
143142
144143
```hiveql
145144
add jar /myudfs.jar;

articles/hdinsight/hadoop/apache-hadoop-visual-studio-tools-get-started.md

Lines changed: 21 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -2,14 +2,13 @@
22
title: Apache Hadoop & Visual Studio Data Lake Tools - Azure HDInsight
33
description: Learn how to install and use Data Lake Tools for Visual Studio to connect to Apache Hadoop clusters in Azure HDInsight, and then run Hive queries.
44
keywords: hadoop tools,hive query,visual studio,visual studio hadoop
5-
services: hdinsight
65
author: hrasheed-msft
76
ms.author: hrasheed
87
ms.reviewer: jasonh
98
ms.service: hdinsight
109
ms.custom: hdinsightactive,hdiseo17may2017,seodec18
1110
ms.topic: conceptual
12-
ms.date: 06/03/2019
11+
ms.date: 10/29/2019
1312
---
1413

1514
# Use Data Lake Tools for Visual Studio to connect to Azure HDInsight and run Apache Hive queries
@@ -33,7 +32,8 @@ To complete this article and use Data Lake Tools for Visual Studio, you need the
3332
> [!IMPORTANT]
3433
> Data Lake Tools is no longer supported for Visual Studio 2013.
3534
36-
## Install Data Lake Tools for Visual Studio
35+
## Install Data Lake Tools for Visual Studio
36+
3737
<a name="install-or-update-data-lake-tools-for-visual-studio"></a>
3838

3939
* Visual Studio 2017 or Visual Studio 2019
@@ -53,16 +53,17 @@ To complete this article and use Data Lake Tools for Visual Studio, you need the
5353

5454
1. Open Visual Studio.
5555

56-
2. From the menu bar, navigate to **Tools** > **Extensions and Updates...**.
56+
2. From the menu bar, navigate to **Extensions** > **Manage Extensions**.
5757

58-
3. From the **Extensions and Updates** window, expand **Updates** on the left.
58+
3. From the **Manage Extensions** window, expand **Updates** on the left.
5959

6060
4. If an update is available, **Azure Data Lake and Stream Analytic Tools** will appear in the main window. Select **Update**.
6161

6262
> [!NOTE]
6363
> You can use only Data Lake Tools version 2.3.0.0 or later to connect to Interactive Query clusters and run interactive Hive queries.
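The version requirement in the note above can be checked mechanically. The following is a minimal sketch, assuming the installed version is available as a dotted string of integers such as `2.3.0.0`; the helper names `parse_version` and `supports_interactive_query` are hypothetical and are not part of Data Lake Tools or Visual Studio.

```python
# Minimum Data Lake Tools version stated in the note above.
MIN_INTERACTIVE_QUERY_VERSION = (2, 3, 0, 0)


def parse_version(text):
    """Turn a dotted version string such as '2.3.0.0' into a 4-part tuple of ints."""
    parts = tuple(int(part) for part in text.strip().split("."))
    # Pad short versions ('2.3' -> (2, 3, 0, 0)) so tuple comparison is fair.
    return parts + (0,) * (4 - len(parts))


def supports_interactive_query(installed_version):
    """Return True if the installed version meets the stated minimum."""
    return parse_version(installed_version) >= MIN_INTERACTIVE_QUERY_VERSION


if __name__ == "__main__":
    for version in ("2.2.9.9", "2.3.0.0", "2.10.0.0"):
        print(version, supports_interactive_query(version))
```

Comparing tuples of integers rather than raw strings avoids the ordering mistake where `"2.10.0.0"` sorts before `"2.3.0.0"`.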
6464
6565
## Connect to Azure subscriptions
66+
6667
You can use Data Lake Tools for Visual Studio to connect to your HDInsight clusters, perform some basic management operations, and run Hive queries.
6768

6869
> [!NOTE]
@@ -88,7 +89,7 @@ To connect to the Azure portal from Visual Studio:
8889

8990
1. From Server Explorer, navigate to **Azure** > **HDInsight** and select your cluster.
9091

91-
2. Right-click an HDInsight cluster, and select **Manage Cluster in Azure portal[sic]**.
92+
2. Right-click an HDInsight cluster, and select **Manage Cluster in Azure Portal**.
9293

9394
To ask questions and/or provide feedback from Visual Studio:
9495

@@ -97,6 +98,7 @@ To ask questions and/or provide feedback from Visual Studio:
9798
2. Right-click **HDInsight** and select either **MSDN Forum** to ask questions, or **Give Feedback** to give feedback.
9899

99100
## Link a cluster
101+
100102
You can link a cluster by right-clicking **HDInsight** and selecting **Link a HDInsight Cluster**. Enter the **Connection Url**, **user name**, and **password**, select **Next**, and then select **Finish**. If the link succeeds, the cluster is listed under the HDInsight node.
101103

102104
![Screenshot of Data Lake Tools for Visual Studio link cluster dialog](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-link-cluster-dialog.png)
@@ -106,6 +108,7 @@ Right click on the linked cluster, select **Edit**, user could update the cluste
106108
![Screenshot of Data Lake Tools for Visual Studio link cluster update](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-link-cluster-update.png)
107109

108110
## Explore linked resources
111+
109112
From Server Explorer, you can see the default storage account and any linked storage accounts. If you expand the default storage account, you can see the containers on the storage account. The default storage account and the default container are marked. Right-click any of the containers to view the container contents.
110113

111114
![Data Lake Tools for Visual Studio linked resources in Server Explorer](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-linked-resources.png "List linked resources")
@@ -115,6 +118,7 @@ After opening a container, you can use the following buttons to upload, delete,
115118
![Data Lake Tools for Visual Studio blob operations in Server Explorer](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-blob-operations.png "Upload, delete, and download blobs in Server Explorer")
116119

117120
## Run interactive Apache Hive queries
121+
118122
[Apache Hive](https://hive.apache.org) is a data warehouse infrastructure that's built on Hadoop. Hive is used for data summarization, queries, and analysis. You can use Data Lake Tools for Visual Studio to run Hive queries from Visual Studio. For more information about Hive, see [Use Apache Hive with HDInsight](hdinsight-use-hive.md).
119123

120124
[Interactive Query](../interactive-query/apache-interactive-query-get-started.md) uses [Hive on LLAP](https://cwiki.apache.org/confluence/display/Hive/LLAP) in Apache Hive 2.1. Interactive Query brings interactivity to complex data warehouse-style queries on large, stored datasets. Running Hive queries on Interactive Query is much faster compared to traditional Hive batch jobs.
@@ -127,6 +131,7 @@ You can also use Data Lake Tools for Visual Studio to see what’s inside a Hive
127131
From Server Explorer, navigate to **Azure** > **HDInsight** and select your cluster. This will be the starting point in Server Explorer for the sections to follow.
128132

129133
### View hivesampletable
134+
130135
All HDInsight clusters have a default sample Hive table called `hivesampletable`.
131136

132137
From your cluster, navigate to **Hive Databases** > **default** > **hivesampletable**.
@@ -144,6 +149,7 @@ Right-click **hivesampletable**, and select **View Top 100 Rows**. This is equi
144149
![Screenshot of an HDInsight Hive Visual Studio schema query](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-hive-schema.png "Hive query results")
145150

146151
### Create Hive tables
152+
147153
To create a Hive table, you can use the GUI or you can use Hive queries. For information about using Hive queries, see [Run Apache Hive queries](#run.queries).
148154

149155
1. From your cluster, navigate to **Hive Databases** > **default**.
@@ -157,6 +163,7 @@ To create a Hive table, you can use the GUI or you can use Hive queries. For inf
157163
![Screenshot of the HDInsight Visual Studio Tools Create Table window](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-create-hive-table.png "Create Hive table")
158164

159165
### <a name="run.queries"></a>Create and run Hive queries
166+
160167
You have two options for creating and running Hive queries:
161168

162169
* Create ad-hoc queries
@@ -235,7 +242,7 @@ Currently, job graphs are only shown for Hive jobs that use Tez as the execution
235242

236243
To view all the operators inside the vertex, double-click on the vertices of the job graph. You can also point to a specific operator to see more details about the operator.
237244

238-
The job graph may not appear even if Tez is specified as the execution engine if no Tez application is launched. This might happen because the job does not contain DML statements, or the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1` will not launch the Tez application.
245+
The job graph may not appear even if Tez is specified as the execution engine if no Tez application is launched. This might happen because the job doesn't contain DML statements, or the DML statements can return without launching a Tez application. For example, `SELECT * FROM table1` won't launch the Tez application.
239246

240247
![Visual Studio Apache Hive job graph](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-fast-path-hive-execution.png "Hive job summary")
241248

@@ -246,8 +253,8 @@ From the job graph, you can select **Task Execution Detail** to get structured a
246253

247254
![Data Lake Visual Studio Tools Task Execution View window](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-task-execution-view.png "Task Execution View")
248255

249-
250256
### View Hive jobs
257+
251258
You can view job queries, job output, job logs, and Yarn logs for Hive jobs.
252259

253260
In the most recent release of the tools, you can see what’s inside your Hive jobs by collecting and surfacing Yarn logs. A Yarn log can help you investigate performance issues. For more information about how HDInsight collects Yarn logs, see [Access HDInsight application logs programmatically](../hdinsight-hadoop-access-yarn-app-logs.md).
@@ -257,14 +264,13 @@ To view Hive jobs:
257264
1. Right-click an HDInsight cluster, and select **View Jobs**. A list of the Hive jobs that ran on the cluster appears.
258265

259266
2. Select a job. In the **Hive Job Summary** window, select one of the following:
260-
- **Job Query**
261-
- **Job Output**
262-
- **Job Log**
263-
- **Yarn log**
267+
* **Job Query**
268+
* **Job Output**
269+
* **Job Log**
270+
* **Yarn log**
264271

265272
![Screenshot of the HDInsight Visual Studio Tools View Hive Jobs window](./media/apache-hadoop-visual-studio-tools-get-started/hdinsight-visual-studio-tools-view-hive-jobs.png "View Hive jobs")
266273

267-
268274
## Run Apache Pig scripts
269275

270276
1. From the menu bar, navigate to **File** > **New** > **Project...**.
@@ -276,15 +282,16 @@ To view Hive jobs:
276282
4. In **Solution Explorer**, double-click **Script.pig** to open the script.
277283

278284
## Feedback and known issues
285+
279286
* An issue in which results that start with null values weren't shown has been fixed. If you're blocked on this issue, contact the support team.
280287
* The HQL script that Visual Studio creates is encoded, depending on the user’s local region setting. The script doesn't execute correctly if you upload the script to a cluster as a binary file.
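One possible way to work around the encoding issue in the second item is to re-encode the script before uploading it. The sketch below is an assumption-laden illustration, not a documented fix: it assumes the cluster side reads the script as UTF-8, and that you know the local encoding Visual Studio used (the `cp1252` default here is only an example). The function names and file paths are hypothetical.

```python
from pathlib import Path


def to_utf8(raw: bytes, source_encoding: str) -> bytes:
    """Decode bytes written in `source_encoding` and re-encode them as UTF-8."""
    return raw.decode(source_encoding).encode("utf-8")


def reencode_script(src: str, dst: str, source_encoding: str = "cp1252") -> None:
    """Write a UTF-8 copy of the HQL script at `src` to `dst`.

    `source_encoding` is an assumption about the local region setting;
    replace it with the encoding that was actually used to save the file.
    """
    Path(dst).write_bytes(to_utf8(Path(src).read_bytes(), source_encoding))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    reencode_script("script.hql", "script.utf8.hql")
```

Decoding with the wrong `source_encoding` either raises `UnicodeDecodeError` or silently produces wrong characters, so it is worth confirming the source encoding before converting.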
281288
282289
## Next steps
290+
283291
In this article, you learned how to use the Data Lake Tools for Visual Studio package to connect to HDInsight clusters from Visual Studio. You also learned how to run a Hive query. For more information, see these articles:
284292
285293
* [Run Apache Hive queries using the Data Lake tools for Visual Studio](apache-hadoop-use-hive-visual-studio.md)
286294
* [Use Hadoop Hive in HDInsight](hdinsight-use-hive.md)
287295
* [Get started using Apache Hadoop in HDInsight](apache-hadoop-linux-tutorial-get-started.md)
288296
* [Submit Apache Hadoop jobs in HDInsight](submit-apache-hadoop-jobs-programmatically.md)
289297
* [Analyze Twitter data with Apache Hadoop in HDInsight](../hdinsight-analyze-twitter-data.md)
290-
