Commit 1253d93

Author: Sreekanth Iyer (Ushta Te Consultancy Services), committed
Commit message: Improved correctness score
1 parent: 785150e · commit: 1253d93

File tree

5 files changed: +17 −17 lines changed


articles/hdinsight/interactive-query/apache-hadoop-connect-hive-power-bi-directquery.md

Lines changed: 3 additions & 3 deletions
@@ -13,12 +13,12 @@ This article describes how to connect Microsoft Power BI to Azure HDInsight Inte
 
 :::image type="content" source="./media/apache-hadoop-connect-hive-power-bi-directquery/hdinsight-power-bi-visualization.png" alt-text="HDInsight Power BI the map report." border="true":::
 
-You can use the [Apache Hive ODBC driver](../hadoop/apache-hadoop-connect-hive-power-bi.md) to do import via the generic ODBC connector in Power BI Desktop. However it is not recommended for BI workloads given non-interactive nature of the Hive query engine. [HDInsight Interactive Query connector](./apache-hadoop-connect-hive-power-bi-directquery.md) and [HDInsight Apache Spark connector](/power-bi/spark-on-hdinsight-with-direct-connect) are better choices for their performance.
+You can use the [Apache Hive ODBC driver](../hadoop/apache-hadoop-connect-hive-power-bi.md) to do import via the generic ODBC connector in Power BI Desktop. However it isn't recommended for BI workloads given non-interactive nature of the Hive query engine. [HDInsight Interactive Query connector](./apache-hadoop-connect-hive-power-bi-directquery.md) and [HDInsight Apache Spark connector](/power-bi/spark-on-hdinsight-with-direct-connect) are better choices for their performance.
 
 ## Prerequisites
 Before going through this article, you must have the following items:
 
-* **HDInsight cluster**. The cluster can be either an HDInsight cluster with Apache Hive or a newly released Interactive Query cluster. For creating clusters, see [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md).
+* **HDInsight cluster**. The cluster can be either a HDInsight cluster with Apache Hive or a newly released Interactive Query cluster. For creating clusters, see [Create cluster](../hadoop/apache-hadoop-linux-tutorial-get-started.md).
 * **[Microsoft Power BI Desktop](https://powerbi.microsoft.com/desktop/)**. You can download a copy from the [Microsoft Download Center](https://www.microsoft.com/download/details.aspx?id=45331).
 
 ## Load data from HDInsight
@@ -31,7 +31,7 @@ The `hivesampletable` Hive table comes with all HDInsight clusters.
 
 :::image type="content" source="./media/apache-hadoop-connect-hive-power-bi-directquery/hdinsight-power-bi-open-odbc.png" alt-text="HDInsight Power BI Get Data More." border="true":::
 
-3. From the **Get Data** window, enter **hdinsight** in the search box.
+3. From the `Get Data` window, enter **hdinsight** in the search box.
 
 4. From the search results, select **HDInsight Interactive Query**, and then select **Connect**. If you don't see **HDInsight Interactive Query**, you need to update your Power BI Desktop to the latest version.
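For readers who want to query the same `hivesampletable` outside Power BI, here is a minimal sketch (not part of this commit) of going through the Microsoft Hive ODBC Driver from Python with pyodbc. The cluster host, credentials, and the extra connection keywords (`SSL`, `AuthMech`) are assumptions — verify them against the documentation of the driver version you have installed.

```python
# Hedged sketch: query hivesampletable on an HDInsight cluster through the
# Microsoft Hive ODBC Driver. Host, user, and password are placeholders, and
# the keyword set varies by driver version.
import pyodbc

conn_str = (
    "DRIVER={Microsoft Hive ODBC Driver};"
    "Host=mycluster.azurehdinsight.net;"  # placeholder cluster host
    "Port=443;"
    "SSL=1;"
    "AuthMech=6;"  # username/password over HTTPS in some driver versions; confirm for yours
    "UID=admin;"
    "PWD=<cluster-login-password>;"
)

with pyodbc.connect(conn_str, autocommit=True) as conn:
    cursor = conn.cursor()
    cursor.execute(
        "SELECT devicemake, COUNT(*) AS total "
        "FROM hivesampletable GROUP BY devicemake LIMIT 10"
    )
    for devicemake, total in cursor.fetchall():
        print(devicemake, total)
```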

articles/hdinsight/interactive-query/apache-interactive-query-get-started.md

Lines changed: 2 additions & 2 deletions
@@ -34,8 +34,8 @@ To execute Hive queries, you have the following options:
 |Microsoft Power BI|See [Visualize Interactive Query Apache Hive data with Power BI in Azure HDInsight](./apache-hadoop-connect-hive-power-bi-directquery.md), and [Visualize big data with Power BI in Azure HDInsight](../hadoop/apache-hadoop-connect-hive-power-bi.md).|
 |Visual Studio|See [Connect to Azure HDInsight and run Apache Hive queries using Data Lake Tools for Visual Studio](../hadoop/apache-hadoop-visual-studio-tools-get-started.md#run-interactive-apache-hive-queries).|
 |Visual Studio Code|See [Use Visual Studio Code for Apache Hive, LLAP, or pySpark](../hdinsight-for-vscode.md).|
-|Apache Ambari Hive View|See [Use Apache Hive View with Apache Hadoop in Azure HDInsight](../hadoop/apache-hadoop-use-hive-ambari-view.md). Hive View is not available for HDInsight 4.0.|
-|Apache Beeline|See [Use Apache Hive with Apache Hadoop in HDInsight with Beeline](../hadoop/apache-hadoop-use-hive-beeline.md). You can use Beeline from either the head node or from an empty edge node. We recommend using Beeline from an empty edge node. For information about creating an HDInsight cluster by using an empty edge node, see [Use empty edge nodes in HDInsight](../hdinsight-apps-use-edge-node.md).|
+|Apache Ambari Hive View|See [Use Apache Hive View with Apache Hadoop in Azure HDInsight](../hadoop/apache-hadoop-use-hive-ambari-view.md). Hive View isn't available for HDInsight 4.0.|
+|Apache Beeline|See [Use Apache Hive with Apache Hadoop in HDInsight with Beeline](../hadoop/apache-hadoop-use-hive-beeline.md). You can use Beeline from either the head node or from an empty edge node. We recommend using Beeline from an empty edge node. For information about creating a HDInsight cluster by using an empty edge node, see [Use empty edge nodes in HDInsight](../hdinsight-apps-use-edge-node.md).|
 |Hive ODBC|See [Connect Excel to Apache Hadoop with the Microsoft Hive ODBC driver](../hadoop/apache-hadoop-connect-excel-hive-odbc-driver.md).|
 
 To find the Java Database Connectivity (JDBC) connection string:
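The diff cuts off before the connection-string steps themselves. For orientation only, a hedged sketch of the JDBC URL shape HDInsight Hive commonly exposes over HTTPS, driven through the Beeline CLI from Python; the host, database, and credentials are placeholders, not values from this commit, and the exact URL should be confirmed in Ambari or the linked Beeline article.

```python
# Hedged sketch: run a query with Beeline against an assumed HDInsight Hive
# JDBC endpoint. All host and credential values are placeholders.
import subprocess

jdbc_url = (
    "jdbc:hive2://mycluster.azurehdinsight.net:443/default"
    ";ssl=true;transportMode=http;httpPath=/hive2"
)

subprocess.run(
    [
        "beeline",
        "-u", jdbc_url,
        "-n", "admin",                     # cluster login name (placeholder)
        "-p", "<cluster-login-password>",  # placeholder
        "-e", "SHOW TABLES;",
    ],
    check=True,
)
```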

articles/hdinsight/interactive-query/quickstart-bicep.md

Lines changed: 6 additions & 6 deletions
@@ -29,7 +29,7 @@ The Bicep file used in this quickstart is from [Azure Quickstart Templates](http
 Two Azure resources are defined in the Bicep file:
 
 * [Microsoft.Storage/storageAccounts](/azure/templates/microsoft.storage/storageaccounts): create an Azure Storage Account.
-* [Microsoft.HDInsight/cluster](/azure/templates/microsoft.hdinsight/clusters): create an HDInsight cluster.
+* [Microsoft.HDInsight/cluster](/azure/templates/microsoft.hdinsight/clusters): create a HDInsight cluster.
 
 ### Deploy the Bicep file
 
@@ -55,13 +55,13 @@ Two Azure resources are defined in the Bicep file:
 You need to provide values for the parameters:
 
 * Replace **\<cluster-name\>** with the name of the HDInsight cluster to create.
-* Replace **\<cluster-username\>** with the credentials used to submit jobs to the cluster and to log in to cluster dashboards.
-* Replace **\<ssh-username\>** with the credentials used to remotely access the cluster. The username can not be admin username.
+* Replace **\<cluster-username\>** with the credentials used to submit jobs to the cluster and to sign-in to cluster dashboards.
+* Replace **\<ssh-username\>** with the credentials used to remotely access the cluster. The username can’t be admin username.
 
-You are prompted to enter the following password:
+You're prompted to enter the following password:
 
-* **clusterLoginPassword**, which must be at least 10 characters long and contain one digit, one uppercase letter, one lowercase letter, and one non-alphanumeric character except single-quote, double-quote, backslash, right-bracket, full-stop. It also must not contain three consecutive characters from the cluster username or SSH username.
-* **sshPassword**, which must be 6-72 characters long and must contain at least one digit, one uppercase letter, and one lowercase letter. It must not contain any three consecutive characters from the cluster login name.
+* **clusterLoginPassword**, which must be at least 10 characters long and contain one digit, one uppercase letter, one lowercase letter, and one nonalphanumeric character except single-quote, double-quote, backslash, right-bracket, full-stop. It also must not contain three consecutive characters from the cluster username or SSH username.
+* **sshPassword**, which must be 6-72 characters long and must contain at least one digit, one uppercase letter, and one lowercase letter. It must not contain any three consecutive characters from the cluster sign in name.
 
 > [!NOTE]
 > When the deployment finishes, you should see a message indicating the deployment succeeded.
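The password rules quoted in this hunk are easy to trip over, so here is a small hedged sketch (not part of the commit or the article) that pre-checks a candidate `clusterLoginPassword` locally against that wording. It is a transcription of the bullet text, not the service-side validator; in particular, whether the listed special characters are merely not counted or disallowed outright is ambiguous, and the "three consecutive characters" check below assumes a case-insensitive comparison.

```python
# Hedged sketch: local pre-check of the clusterLoginPassword rules as worded above.
import re

EXCLUDED_SPECIALS = set("'\"\\].")  # single-quote, double-quote, backslash, right-bracket, full-stop


def has_three_consecutive_chars_from(password: str, name: str) -> bool:
    """True if any 3-character run of `name` appears in `password` (case-insensitive; an assumption)."""
    p, n = password.lower(), name.lower()
    return any(n[i:i + 3] in p for i in range(len(n) - 2))


def check_cluster_login_password(password: str, cluster_user: str, ssh_user: str) -> list[str]:
    problems = []
    if len(password) < 10:
        problems.append("must be at least 10 characters long")
    if not re.search(r"\d", password):
        problems.append("needs a digit")
    if not re.search(r"[A-Z]", password):
        problems.append("needs an uppercase letter")
    if not re.search(r"[a-z]", password):
        problems.append("needs a lowercase letter")
    specials = {c for c in password if not c.isalnum()}
    if not (specials - EXCLUDED_SPECIALS):
        problems.append("needs a nonalphanumeric character other than ' \" \\ ] .")
    for name in (cluster_user, ssh_user):
        if has_three_consecutive_chars_from(password, name):
            problems.append(f"contains 3 consecutive characters from '{name}'")
    return problems


print(check_cluster_login_password("Weak1", "admin", "sshuser"))
```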

articles/hdinsight/spark/apache-spark-run-machine-learning-automl.md

Lines changed: 4 additions & 4 deletions
@@ -8,12 +8,12 @@ ms.date: 10/17/2024
 
 # Run Azure Machine Learning workloads with automated machine learning on Apache Spark in HDInsight
 
-Azure Machine Learning simplifies and accelerates the building, training, and deployment of machine learning models. In automated machine learning (AutoML), you start with training data that has a defined target feature. Iterate through combinations of algorithms and feature selections automatically select the best model for your data based on the training scores. HDInsight allows customers to provision clusters with hundreds of nodes. AutoML running on Spark in an HDInsight cluster allows users to use compute capacity across these nodes to run training jobs in a scale-out fashion, and to run multiple training jobs in parallel. It allows users to run AutoML experiments while sharing the compute with their other big data workloads.
+Azure Machine Learning simplifies and accelerates the building, training, and deployment of machine learning models. In automated machine learning (AutoML), you start with training data that has a defined target feature. Iterate through combinations of algorithms and feature selections automatically select the best model for your data based on the training scores. HDInsight allows customers to provision clusters with hundreds of nodes. AutoML running on Spark in a HDInsight cluster allows users to use compute capacity across these nodes to run training jobs in a scale-out fashion, and to run multiple training jobs in parallel. It allows users to run AutoML experiments while sharing the compute with their other big data workloads.
 
-## Install Azure Machine Learning on an HDInsight cluster
+## Install Azure Machine Learning on a HDInsight cluster
 
 For general tutorials of automated machine learning, see [Tutorial: Use automated machine learning to build your regression model](/azure/machine-learning/tutorial-auto-train-models).
-All new HDInsight-Spark clusters come pre-installed with AzureML-AutoML SDK.
+All new HDInsight-Spark clusters come preinstalled with AzureML-AutoML SDK.
 
 > [!Note]
 > Azure Machine Learning packages are installed into Python3 conda environment. The installed Jupyter Notebook should be run using the PySpark3 kernel.
@@ -22,7 +22,7 @@ You can use Zeppelin notebooks to use AutoML as well.
 
 ## Authentication for workspace
 
-Workspace creation and experiment submission require an authentication token. This token can be generated using an [Microsoft Entra application](../../active-directory/develop/app-objects-and-service-principals.md). An [Microsoft Entra user](/azure/developer/python/sdk/authentication-overview) can also be used to generate the required authentication token, if multi-factor authentication isn't enabled on the account.
+Workspace creation and experiment submission require an authentication token. This token can be generated using an [Microsoft Entra application](../../active-directory/develop/app-objects-and-service-principals.md). An [Microsoft Entra user](/azure/developer/python/sdk/authentication-overview) can also be used to generate the required authentication token, if multifactor authentication isn't enabled on the account.
 
 The following code snippet creates an authentication token using an **Microsoft Entra application**.
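The code snippet the article refers to is not included in this diff. As a point of reference, a hedged sketch of what authenticating to a workspace with a Microsoft Entra application (service principal) looks like with the azureml-core v1 SDK that the article says is preinstalled; all IDs, names, and the secret below are placeholders, and this is not necessarily the article's own snippet.

```python
# Hedged sketch: connect to an Azure Machine Learning workspace with a
# Microsoft Entra application using azureml-core (v1). Values are placeholders.
from azureml.core import Workspace
from azureml.core.authentication import ServicePrincipalAuthentication

sp_auth = ServicePrincipalAuthentication(
    tenant_id="<tenant-id>",
    service_principal_id="<application-client-id>",
    service_principal_password="<client-secret>",
)

ws = Workspace.get(
    name="<workspace-name>",
    auth=sp_auth,
    subscription_id="<subscription-id>",
    resource_group="<resource-group>",
)

print(ws.name, ws.location)
```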

articles/hdinsight/use-pig.md

Lines changed: 2 additions & 2 deletions
@@ -11,7 +11,7 @@ ms.date: 10/17/2024
 
 Learn how to use [Apache Pig](https://pig.apache.org/) with HDInsight.
 
-Apache Pig is a platform for creating programs for Apache Hadoop by using a procedural language known as *Pig Latin*. Pig is an alternative to Java for creating *MapReduce* solutions, and it is included with Azure HDInsight. Use the following table to discover the various ways that Pig can be used with HDInsight:
+Apache Pig is a platform for creating programs for Apache Hadoop by using a procedural language known as *Pig Latin*. Pig is an alternative to Java for creating *MapReduce* solutions, and it's included with Azure HDInsight. Use the following table to discover the various ways that Pig can be used with HDInsight:
 
 ## <a id="why"></a>Why use Apache Pig
 
@@ -35,7 +35,7 @@ For more information about Pig Latin, see [Pig Latin Reference Manual 1](https:/
 
 ## <a id="data"></a>Example data
 
-HDInsight provides various example data sets, which are stored in the `/example/data` and `/HdiSamples` directories. These directories are in the default storage for your cluster. The Pig example in this document uses the *log4j* file from `/example/data/sample.log`.
+HDInsight provides various example data sets, which are stored in the `/example/data` and `/HdiSamples` directories. These directories are in the default storage for your cluster. The Pig example in this document uses the *Log4j* file from `/example/data/sample.log`.
 
 Each log inside the file consists of a line of fields that contains a `[LOG LEVEL]` field to show the type and the severity, for example:
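The article's Pig Latin script itself is cut off by this diff. As a rough equivalent, a hedged Python sketch of the same analysis on `/example/data/sample.log`: extract the log level from each Log4j line and count occurrences. The local file path is an assumption; on a cluster the file lives in default storage and would first be copied down (for example with `hdfs dfs -get /example/data/sample.log .`).

```python
# Hedged sketch: count log levels in the sample.log Log4j data, roughly what the
# article's Pig Latin example computes. Assumes sample.log is available locally.
import re
from collections import Counter

LEVEL_RE = re.compile(r"\b(TRACE|DEBUG|INFO|WARN|ERROR|FATAL)\b")

counts = Counter()
with open("sample.log", encoding="utf-8", errors="replace") as f:
    for line in f:
        match = LEVEL_RE.search(line)
        if match:
            counts[match.group(1)] += 1

for level, total in counts.most_common():
    print(level, total)
```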