Commit c5f4273

cats127 committed
1 parent b3f4726 commit c5f4273

2 files changed: +26 -30 lines changed
articles/hdinsight/hdinsight-hadoop-emulator-visual-studio.md

Lines changed: 5 additions & 6 deletions
@@ -2,13 +2,12 @@
 title: Data Lake tools for Visual Studio with Hortonworks Sandbox - Azure HDInsight
 description: Learn how to use the Azure Data Lake tools for Visual Studio with the Hortonworks sandbox running in a local VM. With these tools, you can create and run Hive and Pig jobs on the sandbox, and view job output and history.
 author: hrasheed-msft
+ms.author: hrasheed
 ms.reviewer: jasonh
-
 ms.service: hdinsight
 ms.custom: hdinsightactive
 ms.topic: conceptual
 ms.date: 05/07/2018
-ms.author: hrasheed
 ---

 # Use the Azure Data Lake tools for Visual Studio with the Hortonworks Sandbox
@@ -37,7 +36,7 @@ Make sure that the Hortonworks Sandbox is running. Then follow the steps in the

 2. From **Server Explorer**, right-click the **HDInsight** entry, and then select **Connect to HDInsight Emulator**.

-![Screenshot of Server Explorer, with Connect to HDInsight Emulator highlighted](./media/hdinsight-hadoop-emulator-visual-studio/connect-hdinsight-emulator.png)
+![Server Explorer, with Connect to HDInsight Emulator highlighted](./media/hdinsight-hadoop-emulator-visual-studio/connect-hdinsight-emulator.png)

 3. From the **Connect to HDInsight Emulator** dialog box, enter the password that you configured for Ambari.

@@ -108,7 +107,7 @@ Hive provides a SQL-like query language (HiveQL) for working with structured dat
 > [!NOTE]
 > The information is the same that is available from the **Job Log** link after a job has finished.

-![Screenshot of output log](./media/hdinsight-hadoop-emulator-visual-studio/hiveserver2-output-box.png)
+![Screenshot of HiveServer2 output](./media/hdinsight-hadoop-emulator-visual-studio/hiveserver2-output-box.png)

 ## Create a Hive project

@@ -118,7 +117,7 @@ You can also create a project that contains multiple Hive scripts. Use a project

 2. From the list of projects, expand **Templates**, expand **Azure Data Lake**, and then select **HIVE (HDInsight)**. From the list of templates, select **Hive Sample**. Enter a name and location, and then select **OK**.

-![Screenshot of New Project window, with Azure Data Lake, HIVE, Hive Sample, and OK highlighted](./media/hdinsight-hadoop-emulator-visual-studio/new-apache-hive-project.png)
+![New Project window, with Azure Data Lake, Hive Sample, and OK](./media/hdinsight-hadoop-emulator-visual-studio/new-apache-hive-project.png)

 The **Hive Sample** project contains two scripts, **WebLogAnalysis.hql** and **SensorDataAnalysis.hql**. You can submit these scripts by using the same **Submit** button at the top of the window.

@@ -175,7 +174,7 @@ Data Lake tools also allow you to easily view information about jobs that have b

 2. Expanding a table displays the columns for that table. To quickly view the data, right-click a table, and select **View Top 100 Rows**.

-![Screenshot of Server Explorer, with table expanded and View Top 100 Rows selected](./media/hdinsight-hadoop-emulator-visual-studio/hdi-view-top-100-rows.png)
+![Server Explorer, with table expanded and View Top 100 Rows selected](./media/hdinsight-hadoop-emulator-visual-studio/hdi-view-top-100-rows.png)

 ### Database and table properties

articles/hdinsight/hdinsight-hadoop-hue-linux.md

Lines changed: 21 additions & 24 deletions
@@ -3,20 +3,20 @@ title: Hue with Hadoop on HDInsight Linux-based clusters - Azure
 description: Learn how to install Hue on HDInsight clusters and use tunneling to route the requests to Hue. Use Hue to browse storage and run Hive or Pig.
 keywords: hue hadoop
 author: hrasheed-msft
+ms.date: 12/11/2017
+ms.author: hrasheed
 ms.reviewer: jasonh
-
 ms.service: hdinsight
 ms.custom: hdinsightactive,hdiseo17may2017
 ms.topic: conceptual
-ms.date: 12/11/2017
-ms.author: hrasheed
-
 ---
+
 # Install and use Hue on HDInsight Hadoop clusters

 Learn how to install Hue on HDInsight clusters and use tunneling to route the requests to Hue.

 ## What is Hue?
+
 Hue is a set of Web applications used to interact with an Apache Hadoop cluster. You can use Hue to browse the storage associated with a Hadoop cluster (WASB, in the case of HDInsight clusters), run Hive jobs and Pig scripts, and so on. The following components are available with Hue installations on an HDInsight Hadoop cluster.

 * Beeswax Hive Editor
@@ -30,8 +30,6 @@ Hue is a set of Web applications used to interact with an Apache Hadoop cluster.
 > Components provided with the HDInsight cluster are fully supported and Microsoft Support will help to isolate and resolve issues related to these components.
 >
 > Custom components receive commercially reasonable support to help you to further troubleshoot the issue. This might result in resolving the issue OR asking you to engage available channels for the open source technologies where deep expertise for that technology is found. For example, there are many community sites that can be used, like: [MSDN forum for HDInsight](https://social.msdn.microsoft.com/Forums/azure/en-US/home?forum=hdinsight), [https://stackoverflow.com](https://stackoverflow.com). Also Apache projects have project sites on [https://apache.org](https://apache.org), for example: [Hadoop](https://hadoop.apache.org/).
->
->

 ## Install Hue using Script Actions

@@ -41,16 +39,13 @@ This section provides instructions about how to use the script when provisioning

 > [!NOTE]
 > Azure PowerShell, the Azure Classic CLI, the HDInsight .NET SDK, or Azure Resource Manager templates can also be used to apply script actions. You can also apply script actions to already running clusters. For more information, see [Customize HDInsight clusters with Script Actions](hdinsight-hadoop-customize-cluster-linux.md).
->
->

 1. Start provisioning a cluster by using the steps in [Provision HDInsight clusters on Linux](hdinsight-hadoop-provision-linux-clusters.md), but do not complete provisioning.

 > [!NOTE]
 > To install Hue on HDInsight clusters, the recommended headnode size is at least A4 (8 cores, 14 GB memory).
->
->
-2. On the **Optional Configuration** blade, select **Script Actions**, and provide the information as shown below:
+
+1. On the **Optional Configuration** blade, select **Script Actions**, and provide the information as shown below:

 ![Provide script action parameters for Hue](./media/hdinsight-hadoop-hue-linux/hdi-hue-script-action.png "Provide script action parameters for Hue")

@@ -60,17 +55,17 @@ This section provides instructions about how to use the script when provisioning
 * **WORKER**: Leave this blank.
 * **ZOOKEEPER**: Leave this blank.
 * **PARAMETERS**: Leave this blank.
-3. At the bottom of the **Script Actions**, use the **Select** button to save the configuration. Finally, use the **Select** button at the bottom of the **Optional Configuration** blade to save the optional configuration information.
-4. Continue provisioning the cluster as described in [Provision HDInsight clusters on Linux](hdinsight-hadoop-provision-linux-clusters.md).
+
+1. At the bottom of the **Script Actions**, use the **Select** button to save the configuration. Finally, use the **Select** button at the bottom of the **Optional Configuration** blade to save the optional configuration information.
+
+1. Continue provisioning the cluster as described in [Provision HDInsight clusters on Linux](hdinsight-hadoop-provision-linux-clusters.md).

 ## Use Hue with HDInsight clusters

 SSH Tunneling is the only way to access Hue on the cluster once it is running. Tunneling via SSH allows the traffic to go directly to the headnode of the cluster where Hue is running. After the cluster has finished provisioning, use the following steps to use Hue on an HDInsight Linux cluster.

 > [!NOTE]
 > We recommend using Firefox web browser to follow the instructions below.
->
->

 1. Use the information in [Use SSH Tunneling to access Apache Ambari web UI, ResourceManager, JobHistory, NameNode, Oozie, and other web UI's](hdinsight-linux-ambari-ssh-tunnel.md) to create an SSH tunnel from your client system to the HDInsight cluster, and then configure your Web browser to use the SSH tunnel as a proxy.

@@ -87,38 +82,39 @@ SSH Tunneling is the only way to access Hue on the cluster once it is running. T
 hn0-myhdi-nfebtpfdv1nubcidphpap2eq2b.ex.internal.cloudapp.net

 This is the hostname of the primary headnode where the Hue website is located.
+
 4. Use the browser to open the Hue portal at http:\//HOSTNAME:8888. Replace HOSTNAME with the name you obtained in the previous step.

 > [!NOTE]
 > When you log in for the first time, you will be prompted to create an account to log in to the Hue portal. The credentials you specify here will be limited to the portal and are not related to the admin or SSH user credentials you specified while provision the cluster.
->
->

-![Login to the Hue portal](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-login.png "Specify credentials for Hue portal")
+![HDInsight hue portal login window](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-login.png "Specify credentials for Hue portal")

 ### Run a Hive query
+
 1. From the Hue portal, click **Query Editors**, and then click **Hive** to open the Hive editor.

-![Use Hive](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-use-hive.png "Use Hive")
+![HDInsight hue portal use hive editor](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-use-hive.png "Use Hive")
 2. On the **Assist** tab, under **Database**, you should see **hivesampletable**. This is a sample table that is shipped with all Hadoop clusters on HDInsight. Enter a sample query in the right pane and see the output on the **Results** tab in the pane below, as shown in the screen capture.

-![Run Hive query](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-hive-query.png "Run Hive query")
+![HDInsight hue portal hive query](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-hive-query.png "Run Hive query")

 You can also use the **Chart** tab to see a visual representation of the result.

 ### Browse the cluster storage
+
 1. From the Hue portal, click **File Browser** in the top-right corner of the menu bar.
 2. By default the file browser opens at the **/user/myuser** directory. Click the forward slash right before the user directory in the path to go to the root of the Azure storage container associated with the cluster.

-![Use file browser](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-file-browser.png "Use file browser")
+![HDInsight hue portal file browser](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-file-browser.png "Use file browser")
+
 3. Right-click on a file or folder to see the available operations. Use the **Upload** button in the right corner to upload files to the current directory. Use the **New** button to create new files or directories.

 > [!NOTE]
 > The Hue file browser can only show the contents of the default container associated with the HDInsight cluster. Any additional storage accounts/containers that you might have associated with the cluster will not be accessible using the file browser. However, the additional containers associated with the cluster will always be accessible for the Hive jobs. For example, if you enter the command `dfs -ls wasb://[email protected]` in the Hive editor, you can see the contents of additional containers as well. In this command, **newcontainer** is not the default container associated with a cluster.
->
->

 ## Important considerations
+
 1. The script used to install Hue installs it only on the primary headnode of the cluster.

 2. During installation, multiple Hadoop services (HDFS, YARN, MR2, Oozie) are restarted for updating the configuration. After the script finishes installing Hue, it might take some time for other Hadoop services to start up. This might affect Hue's performance initially. Once all services start up, Hue will be fully functional.
@@ -128,12 +124,13 @@ SSH Tunneling is the only way to access Hue on the cluster once it is running. T

 4. With Linux clusters, you can have a scenario where your services are running on the primary headnode while the Resource Manager could be running on the secondary. Such a scenario might result in errors (shown below) when using Hue to view details of RUNNING jobs on the cluster. However, you can view the job details when the job has completed.

-![Hue portal error](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-error.png "Hue portal error")
+![Hue portal error sample message](./media/hdinsight-hadoop-hue-linux/hdinsight-hue-portal-error.png "Hue portal error")

 This is due to a known issue. As a workaround, modify Ambari so that the active Resource Manager also runs on the primary headnode.
 5. Hue understands WebHDFS while HDInsight clusters use Azure Storage using `wasb://`. So, the custom script used with script action installs WebWasb, which is a WebHDFS-compatible service for talking to WASB. So, even though the Hue portal says HDFS in places (like when you move your mouse over the **File Browser**), it should be interpreted as WASB.

 ## Next steps
+
 * [Install Apache Giraph on HDInsight clusters](hdinsight-hadoop-giraph-install-linux.md). Use cluster customization to install Giraph on HDInsight Hadoop clusters. Giraph allows you to perform graph processing using Hadoop, and it can be used with Azure HDInsight.
 * [Install R on HDInsight clusters](hdinsight-hadoop-r-scripts-linux.md). Use cluster customization to install R on HDInsight Hadoop clusters. R is an open-source language and environment for statistical computing. It provides hundreds of built-in statistical functions and its own programming language that combines aspects of functional and object-oriented programming. It also provides extensive graphical capabilities.
