articles/hdinsight/hdinsight-hadoop-windows-tools.md
Lines changed: 17 additions & 11 deletions
Original file line number
Diff line number
Diff line change
@@ -4,19 +4,20 @@ description: Work from a Windows PC in Hadoop on HDInsight. Manage and query clu
4
4
author: hrasheed-msft
5
5
ms.author: hrasheed
6
6
ms.reviewer: jasonh
7
-
ms.topic: conceptual
8
7
ms.service: hdinsight
8
+
ms.topic: conceptual
9
9
ms.custom: hdinsightactive,hdiseo17may2017
10
-
ms.date: 04/24/2019
10
+
ms.date: 12/20/2019
11
11
---
12
12
13
13
# Work in the Apache Hadoop ecosystem on HDInsight from a Windows PC
14
14
15
-
Learn about development and management options on the Windows PC for working in the Apache Hadoop ecosystem on HDInsight.
15
+
Learn about development and management options on the Windows PC for working in the Apache Hadoop ecosystem on HDInsight.
16
16
17
17
HDInsight is based on Apache Hadoop and Hadoop components, open-source technologies developed on Linux. HDInsight version 3.4 and higher uses the Ubuntu Linux distribution as the underlying OS for the cluster. However, you can work with HDInsight from a Windows client or Windows development environment.
18
18
19
19
## Use PowerShell for deployment and management tasks
20
+
20
21
Azure PowerShell is a scripting environment that you can use to control and automate deployment and management tasks in HDInsight from Windows.
21
22
22
23
Examples of tasks you can do with PowerShell:
@@ -28,23 +29,26 @@ Examples of tasks you can do with PowerShell:
28
29
Follow the steps to [install and configure Azure PowerShell](https://docs.microsoft.com/powershell/azure/install-az-ps) to get the latest version.
29
30
30
31
## Utilities you can run in a browser
32
+
31
33
The following utilities have a web UI that runs in a browser:
32
34
* **[Azure Cloud Shell](https://docs.microsoft.com/azure/cloud-shell/overview)** is an interactive, command-line shell that runs in your browser and from within the Azure portal.
35
+
33
36
* **[Apache Ambari Web UI](hdinsight-hadoop-manage-ambari.md)** is a management and monitoring utility available in the Azure portal that can be used to manage different kinds of jobs, such as:
34
37
    * [Use Apache Ambari with the REST API](hdinsight-hadoop-manage-ambari-rest-api.md)
35
38
    * [Apache Hive View in Apache Ambari](hadoop/apache-hadoop-use-hive-ambari-view.md)
36
39
    * [Apache Tez View in Apache Ambari](hdinsight-debug-ambari-tez-view.md)
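
As a rough illustration of the Ambari REST API option listed above, the following Python sketch builds (but does not send) an authenticated request. It assumes the cluster's Ambari endpoint is reachable at `https://CLUSTERNAME.azurehdinsight.net/api/v1/` and that the cluster login uses HTTP basic authentication; neither detail is stated in this article, and the cluster name, user name, password, and the helper `build_ambari_request` are hypothetical placeholders.

```python
import base64
import urllib.request


def build_ambari_request(cluster_name: str, user: str, password: str,
                         path: str = "clusters") -> urllib.request.Request:
    """Build a GET request for an Ambari REST resource without sending it.

    Assumption: the endpoint pattern and basic authentication used here
    are not taken from this article; verify them in the linked REST API
    article before relying on this sketch.
    """
    url = f"https://{cluster_name}.azurehdinsight.net/api/v1/{path}"
    # HTTP basic authentication: base64-encode "user:password".
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    request.add_header("Accept", "application/json")
    return request


# Hypothetical usage (performs a network call, so it is left commented out):
# import json
# with urllib.request.urlopen(build_ambari_request("mycluster", "admin", "secret")) as resp:
#     print(json.load(resp))
```

Keeping request construction separate from sending makes it possible to inspect the URL and headers from a Windows client before any credentials leave the machine.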
37
40
38
41
## Data Lake (Hadoop) Tools for Visual Studio
42
+
39
43
Use Data Lake Tools for Visual Studio to deploy and manage Storm topologies. Data Lake Tools also installs the SCP.NET SDK, which allows you to develop C# Storm topologies with Visual Studio.
40
44
41
-
Before you go to the following examples, [install and try Data Lake Tools for Visual Studio](hadoop/apache-hadoop-visual-studio-tools-get-started.md).
45
+
Before you go to the following examples, [install and try Data Lake Tools for Visual Studio](hadoop/apache-hadoop-visual-studio-tools-get-started.md).
42
46
43
47
Examples of tasks you can do with Visual Studio and Data Lake Tools for Visual Studio:
44
48
* [Deploy and manage Storm topologies from Visual Studio](storm/apache-storm-deploy-monitor-topology-linux.md)
45
49
* [Develop C# topologies for Storm using Visual Studio](storm/apache-storm-develop-csharp-visual-studio-topology.md). The tools include example templates for Storm topologies that you can connect to databases, such as Azure Cosmos DB and SQL Database.
46
50
47
-
## Visual Studio and the .NET SDK
51
+
## Visual Studio and the .NET SDK
48
52
49
53
You can use Visual Studio with the .NET SDK to manage clusters and develop big data applications. You can use other IDEs for the following tasks, but examples are shown in Visual Studio.
50
54
@@ -54,25 +58,26 @@ Examples of tasks you can do with the .NET SDK in Visual Studio:
54
58
* [Use C# user-defined functions with Apache Hive and Apache Pig streaming on Apache Hadoop](hadoop/apache-hadoop-hive-pig-udf-dotnet-csharp.md).
55
59
56
60
## IntelliJ IDEA and Eclipse IDE for Spark clusters
61
+
57
62
Both [IntelliJ IDEA](https://www.jetbrains.com/idea/download) and the [Eclipse IDE](https://www.eclipse.org/downloads/) can be used to:
58
63
* Develop and submit a Scala Spark application on an HDInsight Spark cluster.
59
64
* Access Spark cluster resources.
60
65
* Develop and run a Scala Spark application locally.
61
66
62
-
These articles show how:
67
+
These articles show how:
63
68
* IntelliJ IDEA: [Create Apache Spark applications using the Azure Toolkit for IntelliJ plug-in and the Scala SDK.](spark/apache-spark-intellij-tool-plugin.md)
64
-
* Eclipse IDE or Scala IDE for Eclipse: [Create Apache Spark applications with the Azure Toolkit for Eclipse](spark/apache-spark-eclipse-tool-plugin.md)
69
+
* Eclipse IDE or Scala IDE for Eclipse: [Create Apache Spark applications with the Azure Toolkit for Eclipse](spark/apache-spark-eclipse-tool-plugin.md)
65
70
71
+
## Notebooks on Spark for data scientists
66
72
67
-
## Notebooks on Spark for data scientists
68
-
Apache Spark clusters in HDInsight include Apache Zeppelin notebooks and kernels that can be used with Jupyter notebooks.
73
+
Apache Spark clusters in HDInsight include Apache Zeppelin notebooks and kernels that can be used with Jupyter notebooks.
69
74
70
75
* [Learn how to use kernels on Apache Spark clusters with Jupyter notebooks to test Spark applications](spark/apache-spark-jupyter-notebook-kernels.md)
71
-
* [Learn how to use Apache Zeppelin notebooks on Apache Spark clusters to run Spark jobs](spark/apache-spark-zeppelin-notebook.md)
76
+
* [Learn how to use Apache Zeppelin notebooks on Apache Spark clusters to run Spark jobs](spark/apache-spark-zeppelin-notebook.md)
72
77
73
78
## Run Linux-based tools and technologies on Windows
74
79
75
-
If you encounter a situation where you must use a tool or technology that is only available on Linux, consider the following options:
80
+
If you come across a situation where you must use a tool or technology that is only available on Linux, consider the following options:
76
81
77
82
* **Bash on Ubuntu on Windows 10** provides a Linux subsystem on Windows. Bash allows you to directly run Linux utilities without having to maintain a dedicated Linux installation. See [Windows Subsystem for Linux Installation Guide for Windows 10](https://docs.microsoft.com/windows/wsl/install-win10) for installation steps. Other [Unix shells](https://www.gnu.org/software/bash/) will work as well.
78
83
* **Docker for Windows** provides access to many Linux-based tools, and can be run directly from Windows. For example, you can use Docker to run the Beeline client for Hive directly from Windows. You can also use Docker to run a local Jupyter notebook and remotely connect to Spark on HDInsight. [Get started with Docker for Windows](https://docs.docker.com/docker-for-windows/).
@@ -83,6 +88,7 @@ If you encounter a situation where you must use a tool or technology that is onl
83
88
The Azure command-line interface (CLI) is Microsoft's cross-platform command-line experience for managing Azure resources. For more information, see [Azure Command-Line Interface (CLI)](https://docs.microsoft.com/cli/azure/?view=azure-cli-latest).
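
As a minimal sketch of scripting the Azure CLI from a Windows client, the Python snippet below wraps the `az hdinsight list` command. It assumes the Azure CLI is installed, on `PATH`, and already signed in with `az login`; the helper names `hdinsight_list_command` and `list_cluster_names` and any resource group name are hypothetical and not part of this article.

```python
import json
import shutil
import subprocess
from typing import List, Optional


def hdinsight_list_command(resource_group: Optional[str] = None) -> List[str]:
    """Return the argument list for `az hdinsight list` with JSON output."""
    command = ["az", "hdinsight", "list", "--output", "json"]
    if resource_group:
        # Restrict the listing to one resource group when a name is given.
        command += ["--resource-group", resource_group]
    return command


def list_cluster_names(resource_group: Optional[str] = None) -> List[str]:
    """Run the CLI and return the `name` field of each cluster it reports.

    Assumption: `az` is installed and already authenticated.
    """
    az_path = shutil.which("az")  # resolves az.cmd on Windows via PATHEXT
    if az_path is None:
        raise RuntimeError("Azure CLI ('az') was not found on PATH")
    command = hdinsight_list_command(resource_group)
    command[0] = az_path
    result = subprocess.run(command, capture_output=True, text=True, check=True)
    return [cluster["name"] for cluster in json.loads(result.stdout)]
```

Resolving the executable with `shutil.which` is a deliberate choice here: on Windows the CLI is typically installed as `az.cmd`, which a bare `"az"` argument may not locate when no shell is involved.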
84
89
85
90
## Next steps
91
+
86
92
If you're new to working in Linux-based clusters, see the following articles:
87
93
* [Set up Apache Hadoop, Apache Kafka, Apache Spark, or other clusters](hdinsight-hadoop-provision-linux-clusters.md)
88
94
* [Tips for HDInsight clusters on Linux](hdinsight-hadoop-linux-information.md)