Commit a1e0964

Merge pull request #101290 from dagiro/freshness178
freshness178
2 parents 1ecffd0 + 5d3e747 commit a1e0964

1 file changed

+42
-43
lines changed

@@ -1,44 +1,43 @@
 ---
 title: Submit MapReduce jobs using HDInsight .NET SDK - Azure
 description: Learn how to submit MapReduce jobs to Azure HDInsight Apache Hadoop using HDInsight .NET SDK.
-ms.reviewer: jasonh
 author: hrasheed-msft
-
+ms.author: hrasheed
+ms.reviewer: jasonh
 ms.service: hdinsight
-ms.custom: hdinsightactive
 ms.topic: conceptual
-ms.date: 05/16/2018
-ms.author: hrasheed
-
+ms.custom: hdinsightactive
+ms.date: 01/15/2020
 ---
+
 # Run MapReduce jobs using HDInsight .NET SDK
+
 [!INCLUDE [mapreduce-selector](../../../includes/hdinsight-selector-use-mapreduce.md)]

-Learn how to submit MapReduce jobs using HDInsight .NET SDK. HDInsight clusters come with a jar file with some MapReduce samples. The jar file is */example/jars/hadoop-mapreduce-examples.jar*. One of the samples is *wordcount*. You develop a C# console application to submit a wordcount job. The job reads the */example/data/gutenberg/davinci.txt* file, and outputs the results to */example/data/davinciwordcount*. If you want to rerun the application, you must clean up the output folder.
+Learn how to submit MapReduce jobs using HDInsight .NET SDK. HDInsight clusters come with a jar file with some MapReduce samples. The jar file is `/example/jars/hadoop-mapreduce-examples.jar`. One of the samples is **wordcount**. You develop a C# console application to submit a wordcount job. The job reads the `/example/data/gutenberg/davinci.txt` file, and outputs the results to `/example/data/davinciwordcount`. If you want to rerun the application, you must clean up the output folder.

 > [!NOTE]
 > The steps in this article must be performed from a Windows client. For information on using a Linux, OS X, or Unix client to work with Hive, use the tab selector shown on the top of the article.
->
->

 ## Prerequisites
-Before you begin this article, you must have the following items:

-* **A Hadoop cluster in HDInsight**. See [Get started using Linux-based Apache Hadoop in HDInsight](apache-hadoop-linux-tutorial-get-started.md).
-* **Visual Studio 2013/2015/2017**.
+* An Apache Hadoop cluster on HDInsight. See [Create Apache Hadoop clusters using the Azure portal](../hdinsight-hadoop-create-linux-clusters-portal.md).
+
+* [Visual Studio](https://visualstudio.microsoft.com/vs/community/).

 ## Submit MapReduce jobs using HDInsight .NET SDK
-The HDInsight .NET SDK provides .NET client libraries, which makes it easier to work with HDInsight clusters from .NET.

-**To Submit jobs**
+The HDInsight .NET SDK provides .NET client libraries, which make it easier to work with HDInsight clusters from .NET.
+
+1. Start Visual Studio and create a C# console application.

-1. Create a C# console application in Visual Studio.
-2. From the NuGet Package Manager Console, run the following command:
+1. Navigate to **Tools** > **NuGet Package Manager** > **Package Manager Console** and enter the following command:

 ```
 Install-Package Microsoft.Azure.Management.HDInsight.Job
 ```
-3. Use the following code:
+
+1. Copy the code below into **Program.cs**. Then edit the code by setting the values for: `existingClusterName`, `existingClusterPassword`, `defaultStorageAccountName`, `defaultStorageAccountKey`, and `defaultStorageContainerName`.

 ```csharp
 using System.Collections.Generic;
@@ -50,65 +49,64 @@ The HDInsight .NET SDK provides .NET client libraries, which makes it easier to
 using Hyak.Common;
 using Microsoft.WindowsAzure.Storage;
 using Microsoft.WindowsAzure.Storage.Blob;
-
+
 namespace SubmitHDInsightJobDotNet
 {
     class Program
     {
         private static HDInsightJobManagementClient _hdiJobManagementClient;
-
+
         private const string existingClusterName = "<Your HDInsight Cluster Name>";
-        private const string existingClusterUri = existingClusterName + ".azurehdinsight.net";
-        private const string existingClusterUsername = "<Cluster Username>";
         private const string existingClusterPassword = "<Cluster User Password>";
-
-        private const string defaultStorageAccountName = "<Default Storage Account Name>"; //<StorageAccountName>.blob.core.windows.net
+        private const string defaultStorageAccountName = "<Default Storage Account Name>";
         private const string defaultStorageAccountKey = "<Default Storage Account Key>";
         private const string defaultStorageContainerName = "<Default Blob Container Name>";
-
-        private const string sourceFile = "/example/data/gutenberg/davinci.txt";
+
+        private const string existingClusterUsername = "admin";
+        private const string existingClusterUri = existingClusterName + ".azurehdinsight.net";
+        private const string sourceFile = "/example/data/gutenberg/davinci.txt";
         private const string outputFolder = "/example/data/davinciwordcount";
-
+
         static void Main(string[] args)
         {
             System.Console.WriteLine("The application is running ...");
-
+
             var clusterCredentials = new BasicAuthenticationCloudCredentials { Username = existingClusterUsername, Password = existingClusterPassword };
             _hdiJobManagementClient = new HDInsightJobManagementClient(existingClusterUri, clusterCredentials);
-
+
             SubmitMRJob();
-
+
             System.Console.WriteLine("Press ENTER to continue ...");
             System.Console.ReadLine();
         }
-
+
         private static void SubmitMRJob()
         {
             List<string> args = new List<string> { { "/example/data/gutenberg/davinci.txt" }, { "/example/data/davinciwordcount" } };
-
+
             var paras = new MapReduceJobSubmissionParameters
             {
                 JarFile = @"/example/jars/hadoop-mapreduce-examples.jar",
                 JarClass = "wordcount",
                 Arguments = args
             };
-
+
             System.Console.WriteLine("Submitting the MR job to the cluster...");
             var jobResponse = _hdiJobManagementClient.JobManagement.SubmitMapReduceJob(paras);
             var jobId = jobResponse.JobSubmissionJsonResponse.Id;
             System.Console.WriteLine("Response status code is " + jobResponse.StatusCode);
             System.Console.WriteLine("JobId is " + jobId);
-
+
             System.Console.WriteLine("Waiting for the job completion ...");
-
+
             // Wait for job completion
             var jobDetail = _hdiJobManagementClient.JobManagement.GetJob(jobId).JobDetail;
             while (!jobDetail.Status.JobComplete)
             {
                 Thread.Sleep(1000);
                 jobDetail = _hdiJobManagementClient.JobManagement.GetJob(jobId).JobDetail;
             }
-
+
             // Get job output
             System.Console.WriteLine("Job output is: ");
             var storageAccess = new AzureStorageAccess(defaultStorageAccountName, defaultStorageAccountKey,
@@ -117,8 +115,8 @@ The HDInsight .NET SDK provides .NET client libraries, which makes it easier to
             if (jobDetail.ExitValue == 0)
             {
                 // Create the storage account object
-                CloudStorageAccount storageAccount = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=" +
-                    defaultStorageAccountName +
+                CloudStorageAccount storageAccount = CloudStorageAccount.Parse("DefaultEndpointsProtocol=https;AccountName=" +
+                    defaultStorageAccountName +
                     ";AccountKey=" + defaultStorageAccountKey);

                 // Create the blob client.
@@ -143,7 +141,7 @@ The HDInsight .NET SDK provides .NET client libraries, which makes it easier to
             else
             {
                 // fetch stderr output in case of failure
-                var output = _hdiJobManagementClient.JobManagement.GetJobErrorLogs(jobId, storageAccess);
+                var output = _hdiJobManagementClient.JobManagement.GetJobErrorLogs(jobId, storageAccess);

                 using (var reader = new StreamReader(output, Encoding.UTF8))
                 {
@@ -155,20 +153,21 @@ The HDInsight .NET SDK provides .NET client libraries, which makes it easier to
         }
     }
 }
+
 ```

-4. Press **F5** to run the application.
+1. Press **F5** to run the application.

-To run the job again, you must change the job output folder name, in the sample, it is "/example/data/davinciwordcount".
+To run the job again, you must change the job output folder name, in the sample it's `/example/data/davinciwordcount`.

-When the job completes successfully, the application prints the content of the output file "part-r-00000".
+When the job completes successfully, the application prints the content of the output file `part-r-00000`.

 ## Next steps
+
 In this article, you have learned several ways to create an HDInsight cluster. To learn more, see the following articles:

 * For submitting a Hive job, see [Run Apache Hive queries using HDInsight .NET SDK](apache-hadoop-use-hive-dotnet-sdk.md).
 * For creating HDInsight clusters, see [Create Linux-based Apache Hadoop clusters in HDInsight](../hdinsight-hadoop-provision-linux-clusters.md).
 * For managing HDInsight clusters, see [Manage Apache Hadoop clusters in HDInsight](../hdinsight-administer-use-portal-linux.md).
 * For learning the HDInsight .NET SDK, see [HDInsight .NET SDK reference](https://docs.microsoft.com/dotnet/api/overview/azure/hdinsight).
-* For non-interactive authenticate to Azure, see [Create non-interactive authentication .NET HDInsight applications](../hdinsight-create-non-interactive-authentication-dotnet-applications.md).
-
+* For non-interactive authenticate to Azure, see [Create non-interactive authentication .NET HDInsight applications](../hdinsight-create-non-interactive-authentication-dotnet-applications.md).
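
The "Wait for job completion" loop in the sample polls `GetJob` once per second with no upper bound, so a job that never reports completion would keep the console application waiting. A bounded wait could be sketched as follows. This is a minimal illustration, not part of the commit: `JobPolling` and `WaitUntil` are hypothetical helper names that do not exist in the HDInsight .NET SDK, and the interval and timeout values are assumptions.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper for illustration; not part of the HDInsight .NET SDK.
static class JobPolling
{
    // Calls isComplete repeatedly until it returns true or the timeout elapses.
    // Returns true if the check succeeded, false if the timeout was reached first.
    public static bool WaitUntil(Func<bool> isComplete, TimeSpan pollInterval, TimeSpan timeout)
    {
        var stopwatch = Stopwatch.StartNew();
        while (!isComplete())
        {
            if (stopwatch.Elapsed >= timeout)
            {
                return false;
            }
            Thread.Sleep(pollInterval);
        }
        return true;
    }
}

static class Demo
{
    static void Main()
    {
        // Stand-in for a real status check: reports completion on the third poll.
        int polls = 0;
        bool finished = JobPolling.WaitUntil(
            () => ++polls >= 3,
            TimeSpan.FromMilliseconds(10),
            TimeSpan.FromSeconds(5));

        Console.WriteLine(finished ? "Job completed" : "Timed out");
    }
}
```

In the sample, the predicate would be `() => _hdiJobManagementClient.JobManagement.GetJob(jobId).JobDetail.Status.JobComplete`, which uses only calls that already appear in the diff. An appropriate timeout depends on the cluster and the job size.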
