articles/hdinsight/hadoop/apache-hadoop-dotnet-csharp-mapreduce-streaming.md
13 additions & 14 deletions
@@ -7,14 +7,14 @@ ms.reviewer: jasonh
 ms.service: hdinsight
 ms.topic: conceptual
 ms.custom: hdinsightactive
-ms.date: 11/22/2019
+ms.date: 04/15/2020
 ---
 
 # Use C# with MapReduce streaming on Apache Hadoop in HDInsight
 
 Learn how to use C# to create a MapReduce solution on HDInsight.
 
-Apache Hadoop streaming is a utility that allows you to run MapReduce jobs using a script or executable. In this example, .NET is used to implement the mapper and reducer for a word count solution.
+Apache Hadoop streaming allows you to run MapReduce jobs using a script or executable. Here, .NET is used to implement the mapper and reducer for a word count solution.
 
 ## .NET on HDInsight
 
@@ -44,12 +44,9 @@ For more information on streaming, see [Hadoop Streaming](https://hadoop.apache.
44
44
45
45
* If using PowerShell, you'll need the [Az Module](https://docs.microsoft.com/powershell/azure/overview).
46
46
47
-
* An SSH client (optional). For more information, see [Connect to HDInsight (Apache Hadoop) using SSH](../hdinsight-hadoop-linux-use-ssh-unix.md).
48
-
49
47
* An Apache Hadoop cluster on HDInsight. See [Get Started with HDInsight on Linux](../hadoop/apache-hadoop-linux-tutorial-get-started.md).
50
48
51
-
* The [URI scheme](../hdinsight-hadoop-linux-information.md#URI-and-scheme) for your clusters primary storage. This would be `wasb://` for Azure Storage, `abfs://` for Azure Data Lake Storage Gen2 or `adl://` for Azure Data Lake Storage Gen1. If secure transfer is enabled for Azure Storage or Data Lake Storage Gen2, the URI would be `wasbs://` or `abfss://`, respectively See also, [secure transfer](../../storage/common/storage-require-secure-transfer.md).
52
-
49
+
* The [URI scheme](../hdinsight-hadoop-linux-information.md#URI-and-scheme) for your clusters primary storage. This scheme would be `wasb://` for Azure Storage, `abfs://` for Azure Data Lake Storage Gen2 or `adl://` for Azure Data Lake Storage Gen1. If secure transfer is enabled for Azure Storage or Data Lake Storage Gen2, the URI would be `wasbs://` or `abfss://`, respectively See also, [secure transfer](../../storage/common/storage-require-secure-transfer.md).
53
50
54
51
## Create the mapper
55
52
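The C# mapper and reducer bodies aren't included in this diff. As an illustration of the streaming contract they implement (read lines of text, emit tab-separated key/value pairs, then sum counts per key), here is a minimal word-count sketch in Python; the function names are hypothetical and stand in for the article's *mapper.exe* and *reducer.exe*:

```python
def map_words(lines):
    # Streaming mapper: emit one "word<TAB>1" record per word.
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"

def reduce_counts(pairs):
    # Streaming reducer: sum the counts per word. Hadoop delivers the
    # mapper output sorted by key; this dict-based version also accepts
    # unsorted input.
    counts = {}
    for pair in pairs:
        word, count = pair.split("\t")
        counts[word] = counts.get(word, 0) + int(count)
    return counts

if __name__ == "__main__":
    # A real streaming job reads sys.stdin and prints to sys.stdout;
    # here the map/sort/reduce pipeline is simulated in-process.
    sample = ["the quick brown fox", "the lazy dog"]
    for word, total in sorted(reduce_counts(sorted(map_words(sample))).items()):
        print(f"{word}\t{total}")
```

In the actual solution these two roles are separate executables, because Hadoop streaming launches the mapper and reducer as independent processes connected through stdin/stdout.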
@@ -147,7 +144,7 @@ Next, you need to upload the *mapper* and *reducer* applications to HDInsight st
 
 1. In Visual Studio, select **View** > **Server Explorer**.
 
-1. Right-click **Azure**, select **Connect to Microsoft Azure Subscription...**, and complete the signin process.
+1. Right-click **Azure**, select **Connect to Microsoft Azure Subscription...**, and complete the sign-in process.
 
 1. Expand the HDInsight cluster that you wish to deploy this application to. An entry with the text **(Default Storage Account)** is listed.
 
@@ -216,14 +213,16 @@ The following procedure describes how to run a MapReduce job using an SSH sessio
 
 The following list describes what each parameter and option represents:
 
-* *hadoop-streaming.jar*: Specifies the jar file that contains the streaming MapReduce functionality.
-* `-files`: Specifies the *mapper.exe* and *reducer.exe* files for this job. The `wasbs:///`, `adl:///`, or `abfs:///` protocol declaration before each file is the path to the root of default storage for the cluster.
-* `-mapper`: Specifies the file that implements the mapper.
-* `-reducer`: Specifies the file that implements the reducer.
-* `-input`: Specifies the input data.
-* `-output`: Specifies the output directory.
+|Parameter | Description |
+|---|---|
+|hadoop-streaming.jar|Specifies the jar file that contains the streaming MapReduce functionality.|
+|-files|Specifies the *mapper.exe* and *reducer.exe* files for this job. The `wasbs:///`, `adl:///`, or `abfs:///` protocol declaration before each file is the path to the root of default storage for the cluster.|
+|-mapper|Specifies the file that implements the mapper.|
+|-reducer|Specifies the file that implements the reducer.|
+|-input|Specifies the input data.|
+|-output|Specifies the output directory.|
 
-3. Once the MapReduce job completes, use the following command to view the results:
+1. Once the MapReduce job completes, use the following command to view the results: