
Commit 10e817b

Merge branch 'main' of https://github.com/MicrosoftDocs/azure-docs-pr into rolyon-abac-attributes-allowed-values-update-v2
2 parents: 64edc65 + cb510bb

File tree

8 files changed: +208 additions, -14 deletions


articles/azure-resource-manager/bicep/toc.yml

Lines changed: 2 additions & 0 deletions
@@ -67,6 +67,8 @@
   href: ../../hdinsight/interactive-query/quickstart-bicep.md?toc=/azure/azure-resource-manager/bicep/toc.json
 - name: HDInsight - Kafka
   href: ../../hdinsight/kafka/apache-kafka-quickstart-bicep.md?toc=/azure/azure-resource-manager/bicep/toc.json
+- name: HDInsight - Spark
+  href: ../../hdinsight/spark/apache-spark-jupyter-spark-use-bicep.md?toc=/azure/azure-resource-manager/bicep/toc.json
 - name: Compute
   items:
   - name: Batch

articles/hdinsight/TOC.yml

Lines changed: 3 additions & 0 deletions
@@ -349,6 +349,9 @@ items:
   href: ./spark/apache-spark-jupyter-spark-sql-use-powershell.md
 - name: Create Apache Spark cluster - Azure CLI
   href: ./spark/apache-spark-create-cluster-cli.md
+- name: Create Apache Spark cluster - Bicep
+  displayName: ARM, Resource Manager, Template
+  href: ./spark/apache-spark-jupyter-spark-use-bicep.md
 - name: Create Apache Spark cluster - ARM Template
   displayName: Resource Manager
   href: ./spark/apache-spark-jupyter-spark-sql.md
articles/hdinsight/spark/apache-spark-jupyter-spark-use-bicep.md

Lines changed: 170 additions & 0 deletions
@@ -0,0 +1,170 @@
---
title: 'Quickstart: Create Apache Spark cluster using Bicep - Azure HDInsight'
description: This quickstart shows how to use Bicep to create an Apache Spark cluster in Azure HDInsight, and run a Spark SQL query.
author: schaffererin
ms.author: v-eschaffer
ms.date: 05/02/2022
ms.topic: quickstart
ms.service: hdinsight
ms.custom: subject-armqs, mode-arm
#Customer intent: As a developer new to Apache Spark on Azure, I need to see how to create a Spark cluster and query some data.
---
# Quickstart: Create Apache Spark cluster in Azure HDInsight using Bicep

In this quickstart, you use Bicep to create an [Apache Spark](./apache-spark-overview.md) cluster in Azure HDInsight. You then create a Jupyter Notebook file, and use it to run Spark SQL queries against Apache Hive tables. Azure HDInsight is a managed, full-spectrum, open-source analytics service for enterprises. The Apache Spark framework for HDInsight enables fast data analytics and cluster computing using in-memory processing. Jupyter Notebook lets you interact with your data, combine code with markdown text, and do simple visualizations.

If you're using multiple clusters together, you'll want to create a virtual network. If you're using a Spark cluster, you'll also want to use the Hive Warehouse Connector. For more information, see [Plan a virtual network for Azure HDInsight](../hdinsight-plan-virtual-network-deployment.md) and [Integrate Apache Spark and Apache Hive with the Hive Warehouse Connector](../interactive-query/apache-hive-warehouse-connector.md).

[!INCLUDE [About Bicep](../../../includes/resource-manager-quickstart-bicep-introduction.md)]

## Prerequisites

If you don't have an Azure subscription, create a [free account](https://azure.microsoft.com/free/?WT.mc_id=A261C142F) before you begin.

## Review the Bicep file

The Bicep file used in this quickstart is from [Azure Quickstart Templates](https://azure.microsoft.com/resources/templates/hdinsight-spark-linux/).

:::code language="bicep" source="~/quickstart-templates/quickstarts/microsoft.hdinsight/hdinsight-spark-linux/main.bicep":::
Two Azure resources are defined in the Bicep file:

* [Microsoft.Storage/storageAccounts](/azure/templates/microsoft.storage/storageaccounts): create an Azure Storage Account.
* [Microsoft.HDInsight/cluster](/azure/templates/microsoft.hdinsight/clusters): create an HDInsight cluster.
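Stripped to a skeleton, those two declarations follow the shape below. This is a sketch only, with illustrative symbolic names and API versions, not the actual template; see the included **main.bicep** for the full property set.

```bicep
// Sketch of the two resource declarations (names and API versions illustrative).
resource storageAccount 'Microsoft.Storage/storageAccounts@2021-08-01' = {
  name: storageAccountName
  location: location
  kind: 'StorageV2'
  sku: {
    name: 'Standard_LRS'
  }
}

resource cluster 'Microsoft.HDInsight/clusters@2021-06-01' = {
  name: clusterName
  location: location
  properties: {
    clusterVersion: '4.0'
    clusterDefinition: {
      kind: 'spark'
      // gateway (cluster login) credentials go here
    }
    // storageProfile and computeProfile (head/worker nodes, SSH credentials) go here
  }
}
```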
## Deploy the Bicep file

1. Save the Bicep file as **main.bicep** to your local computer.
1. Deploy the Bicep file using either Azure CLI or Azure PowerShell.

# [CLI](#tab/CLI)

```azurecli
az group create --name exampleRG --location eastus
az deployment group create --resource-group exampleRG --template-file main.bicep --parameters clusterName=<cluster-name> clusterLoginUserName=<cluster-username> sshUserName=<ssh-username>
```

# [PowerShell](#tab/PowerShell)

```azurepowershell
New-AzResourceGroup -Name exampleRG -Location eastus
New-AzResourceGroupDeployment -ResourceGroupName exampleRG -TemplateFile ./main.bicep -clusterName "<cluster-name>" -clusterLoginUserName "<cluster-username>" -sshUserName "<ssh-username>"
```

---
You need to provide values for the parameters:

* Replace **\<cluster-name\>** with the name of the HDInsight cluster to create.
* Replace **\<cluster-username\>** with the credentials used to submit jobs to the cluster and to log in to cluster dashboards. The username has a minimum length of two characters and a maximum length of 20 characters. It must consist of digits, uppercase or lowercase letters, and/or the following special characters: (!#$%&\'()-^_`{}~).
* Replace **\<ssh-username\>** with the credentials used to remotely access the cluster. The username has a minimum length of two characters. It must consist of digits, uppercase or lowercase letters, and/or the following special characters: (%&\'^_`{}~). It can't be the same as the cluster username.

You'll be prompted to enter the following:

* **clusterLoginPassword**, which must be at least 10 characters long and must contain at least one digit, one uppercase letter, one lowercase letter, and one non-alphanumeric character except single-quote, double-quote, backslash, right-bracket, or full-stop. It also must not contain three consecutive characters from the cluster username or SSH username.
* **sshPassword**, which must be 6-72 characters long and must contain at least one digit, one uppercase letter, and one lowercase letter. It must not contain any three consecutive characters from the cluster login name.
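As a pre-flight convenience, the **clusterLoginPassword** rules above can be checked locally before deploying. The following is a rough sketch of the documented rules only; the deployment service performs the authoritative validation, and the example password is purely illustrative.

```python
import re

def check_cluster_login_password(password, usernames):
    """Rough local check of the documented clusterLoginPassword rules.

    Mirrors the rules listed above; not the service's authoritative validation.
    """
    problems = []
    if len(password) < 10:
        problems.append("must be at least 10 characters long")
    if not re.search(r"\d", password):
        problems.append("must contain a digit")
    if not re.search(r"[A-Z]", password):
        problems.append("must contain an uppercase letter")
    if not re.search(r"[a-z]", password):
        problems.append("must contain a lowercase letter")
    # Non-alphanumeric, excluding single-quote, double-quote, backslash,
    # right-bracket, and full-stop.
    if not re.search(r"[^0-9A-Za-z'\"\\\].]", password):
        problems.append("must contain an allowed special character")
    # No run of three consecutive characters shared with a username.
    lowered = password.lower()
    for name in usernames:
        name = name.lower()
        for i in range(len(name) - 2):
            if name[i:i + 3] in lowered:
                problems.append("contains 3 consecutive characters of '%s'" % name)
                break
    return problems

print(check_cluster_login_password("Contoso12345!", ["admin", "sshuser"]))  # []
```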
> [!NOTE]
> When the deployment finishes, you should see a message indicating the deployment succeeded.

If you run into an issue with creating HDInsight clusters, it could be that you don't have the right permissions to do so. For more information, see [Access control requirements](../hdinsight-hadoop-customize-cluster-linux.md#access-control).
## Review deployed resources

Use the Azure portal, Azure CLI, or Azure PowerShell to list the deployed resources in the resource group.

# [CLI](#tab/CLI)

```azurecli-interactive
az resource list --resource-group exampleRG
```

# [PowerShell](#tab/PowerShell)

```azurepowershell-interactive
Get-AzResource -ResourceGroupName exampleRG
```

---
## Create a Jupyter Notebook file

[Jupyter Notebook](https://jupyter.org/) is an interactive notebook environment that supports various programming languages. You can use a Jupyter Notebook file to interact with your data, combine code with markdown text, and perform simple visualizations.

1. Open the [Azure portal](https://portal.azure.com).

2. Select **HDInsight clusters**, and then select the cluster you created.

    :::image type="content" source="./media/apache-spark-jupyter-spark-sql/azure-portal-open-hdinsight-cluster.png" alt-text="Open HDInsight cluster in the Azure portal." border="true":::

3. From the portal, in the **Cluster dashboards** section, select **Jupyter Notebook**. If prompted, enter the cluster login credentials for the cluster.

    :::image type="content" source="./media/apache-spark-jupyter-spark-sql/hdinsight-spark-open-jupyter-interactive-spark-sql-query.png" alt-text="Open Jupyter Notebook to run interactive Spark SQL query." border="true":::

4. Select **New** > **PySpark** to create a notebook.

    :::image type="content" source="./media/apache-spark-jupyter-spark-sql/hdinsight-spark-create-jupyter-interactive-spark-sql-query.png" alt-text="Create a Jupyter Notebook file to run interactive Spark SQL query." border="true":::

    A new notebook is created and opened with the name Untitled (Untitled.ipynb).
## Run Apache Spark SQL statements

SQL (Structured Query Language) is the most common and widely used language for querying and transforming data. Spark SQL functions as an extension to Apache Spark for processing structured data, using the familiar SQL syntax.

1. Verify the kernel is ready. The kernel is ready when you see a hollow circle next to the kernel name in the notebook. A solid circle denotes that the kernel is busy.

    :::image type="content" source="./media/apache-spark-jupyter-spark-sql/jupyter-spark-kernel-status.png" alt-text="Screenshot showing that the kernel is ready." border="true":::

    When you start the notebook for the first time, the kernel performs some tasks in the background. Wait for the kernel to be ready.
1. Paste the following code in an empty cell, and then press **SHIFT + ENTER** to run the code. The command lists the Hive tables on the cluster:

    ```sql
    %%sql
    SHOW TABLES
    ```

    When you use a Jupyter Notebook file with your HDInsight cluster, you get a preset `spark` session that you can use to run Hive queries using Spark SQL. `%%sql` tells Jupyter Notebook to use the preset `spark` session to run the Hive query. The first time you submit a query, Jupyter creates a Spark application for the notebook, which takes about 30 seconds. Once the Spark application is ready, subsequent queries execute in about a second and produce the results. A later step retrieves the top 10 rows from **hivesampletable**, a sample Hive table that comes with all HDInsight clusters by default. The output looks like:

    :::image type="content" source="./media/apache-spark-jupyter-spark-sql/hdinsight-spark-get-started-hive-query.png" alt-text="Screenshot that shows an Apache Hive query in HDInsight." border="true":::

    Every time you run a query in Jupyter, your web browser window title shows a **(Busy)** status along with the notebook title. You also see a solid circle next to the **PySpark** text in the top-right corner.
1. Run another query to see the data in `hivesampletable`.

    ```sql
    %%sql
    SELECT * FROM hivesampletable LIMIT 10
    ```

    The screen should refresh to show the query output.

    :::image type="content" source="./media/apache-spark-jupyter-spark-sql/hdinsight-spark-get-started-hive-query-output.png" alt-text="Screenshot that shows Hive query output in HDInsight." border="true":::

1. From the **File** menu on the notebook, select **Close and Halt**. Shutting down the notebook releases the cluster resources, including the Spark application.
## Clean up resources

When no longer needed, use the Azure portal, Azure CLI, or Azure PowerShell to delete the resource group and its resources.

# [CLI](#tab/CLI)

```azurecli-interactive
az group delete --name exampleRG
```

# [PowerShell](#tab/PowerShell)

```azurepowershell-interactive
Remove-AzResourceGroup -Name exampleRG
```

---
## Next steps

In this quickstart, you learned how to create an Apache Spark cluster in HDInsight and run a basic Spark SQL query. Advance to the next tutorial to learn how to use an HDInsight cluster to run interactive queries on sample data.

> [!div class="nextstepaction"]
> [Run interactive queries on Apache Spark](./apache-spark-load-data-run-query.md)

articles/marketplace/azure-app-marketing.md

Lines changed: 1 addition & 1 deletion
@@ -7,7 +7,7 @@ ms.reviewer: dannyevers
 ms.service: marketplace
 ms.subservice: partnercenter-marketplace-publisher
 ms.topic: how-to
-ms.date: 06/01/2021
+ms.date: 04/29/2022
 ---
 
 # Sell an Azure Application offer

articles/network-watcher/diagnose-vm-network-traffic-filtering-problem.md

Lines changed: 2 additions & 2 deletions
@@ -55,7 +55,7 @@ If you already have a network watcher enabled in at least one region, skip to th
 
 1. In the **Home** portal, select **More services**. In the **Filter box**, enter *Network Watcher*. When **Network Watcher** appears in the results, select it.
 1. Enable a network watcher in the East US region, because that's the region the VM was deployed to in a previous step. Select **Add**, to expand it, and then select **Region** under **Subscription**, as shown in the following picture:
-   :::image type="content" source="./media/diagnose-vm-network-traffic-filtering-problem/enable-network-watcher.png" alt-text="Screenshot of how to Enable Network Watcher.":::
+   :::image type="content" source="./media/diagnose-vm-network-traffic-filtering-problem/enable-network-watcher.png" alt-text="Screenshot of how to Enable Network Watcher." lightbox="./media/diagnose-vm-network-traffic-filtering-problem/enable-network-watcher.png":::
 1. Select your region then select **Add**.
 
 ### Use IP flow verify
@@ -94,7 +94,7 @@ To determine why the rules in steps 3-5 of **Use IP flow verify** allow or deny
 1. In the search box at the top of the portal, enter *myvm*. When the **myvm Regular Network Interface** appears in the search results, select it.
 1. Select **Effective security rules** under **Support + troubleshooting**, as shown in the following picture:
 
-   :::image type="content" source="./media/diagnose-vm-network-traffic-filtering-problem/effective-security-rules.png" alt-text="Screenshot of Effective security rules.":::
+   :::image type="content" source="./media/diagnose-vm-network-traffic-filtering-problem/effective-security-rules.png" alt-text="Screenshot of Effective security rules." lightbox="./media/diagnose-vm-network-traffic-filtering-problem/effective-security-rules.png":::
 
 In step 3 of **Use IP flow verify**, you learned that the reason the communication was allowed is because of the **AllowInternetOutbound** rule. You can see in the previous picture that the **Destination** for the rule is **Internet**. It's not clear how 13.107.21.200, the address you tested in step 3 of **Use IP flow verify**, relates to **Internet** though.
 1. Select the **AllowInternetOutBound** rule, and then scroll down to **Destination**, as shown in the following picture:
(binary image file changed: 84.1 KB)

articles/search/search-language-support.md

Lines changed: 26 additions & 8 deletions
@@ -8,7 +8,7 @@ author: HeidiSteen
 ms.author: heidist
 ms.service: cognitive-search
 ms.topic: conceptual
-ms.date: 04/21/2022
+ms.date: 05/03/2022
 ---
 
 # Create an index for multiple languages in Azure Cognitive Search
@@ -19,12 +19,30 @@ A multilingual search application supports searching over and retrieving results
 
 + On the query request, set the `searchFields` parameter to scope full text search to specific fields, and then use `select` to return just those fields that have compatible content.
 
-The success of this technique hinges on the integrity of field content. By itself, Azure Cognitive Search does not translate strings or perform language detection as part of query execution. It's up to you to make sure that fields contain the strings you expect.
+The success of this technique hinges on the integrity of field content. By itself, Azure Cognitive Search doesn't translate strings or perform language detection as part of query execution. It's up to you to make sure that fields contain the strings you expect.
+
+## Need text translation?
+
+This article assumes you have translated strings in place. If that's not the case, you can attach Cognitive Services to an [enrichment pipeline](cognitive-search-concept-intro.md), invoking text translation during data ingestion. Text translation takes a dependency on the indexer feature and Cognitive Services, but all setup is done within Azure Cognitive Search.
+
+To add text translation, follow these steps:
+
+1. Verify your content is in a [supported data source](search-indexer-overview.md#supported-data-sources).
+
+1. [Create a data source](search-howto-create-indexers.md#prepare-external-data) that points to your content.
+
+1. [Create a skillset](cognitive-search-defining-skillset.md) that includes the [Text Translation skill](cognitive-search-skill-text-translation.md).
+
+   The Text Translation skill takes a single string as input. If you have multiple fields, you can create a skillset that calls Text Translation multiple times, once for each field. Alternatively, you can use the [Text Merger skill](cognitive-search-skill-textmerger.md) to consolidate the content of multiple fields into one long string.
+
+1. Create an index that includes fields for translated strings. Most of this article covers index design and field definitions for indexing and querying multi-language content.
+
+1. [Attach a multi-region Cognitive Services resource](cognitive-search-attach-cognitive-services.md) to your skillset.
+
+1. [Create and run the indexer](search-howto-create-indexers.md), and then apply the guidance in this article to query just the fields of interest.
 
 > [!TIP]
-> If text translation is a requirement, you can [create a skillset](cognitive-search-defining-skillset.md) that adds [text translation](cognitive-search-skill-text-translation.md) to the indexing pipeline. This approach requires [using an indexer](search-howto-create-indexers.md) and [attaching a Cognitive Services resource](cognitive-search-attach-cognitive-services.md).
->
-> Text translation is built into the [Import data wizard](cognitive-search-quickstart-blob.md). If you have a [supported data source](search-indexer-overview.md#supported-data-sources) with text you'd like to translate, you can step through the wizard to try out the language detection and translation functionality.
+> Text translation is built into the [Import data wizard](cognitive-search-quickstart-blob.md). If you have a [supported data source](search-indexer-overview.md#supported-data-sources) with text you'd like to translate, you can step through the wizard to try out the language detection and translation functionality before writing any code.
 
 ## Define fields for content in different languages
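For illustration, the skillset created in step 3 of the new section might look like the following minimal sketch. The skillset name, source field (`description`), and output field (`description_fr`) are hypothetical; the translated output would then be mapped to an index field.

```json
{
  "name": "translate-skillset",
  "skills": [
    {
      "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
      "context": "/document",
      "defaultToLanguageCode": "fr",
      "inputs": [
        { "name": "text", "source": "/document/description" }
      ],
      "outputs": [
        { "name": "translatedText", "targetName": "description_fr" }
      ]
    }
  ]
}
```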

@@ -54,7 +72,7 @@ The "analyzer" property on a field definition is used to set the [language analy
 
 ## Build and load an index
 
-An intermediate (and perhaps obvious) step is that you have to [build and populate the index](search-get-started-dotnet.md) before formulating a query. We mention this step here for completeness. One way to determine index availability is by checking the indexes list in the [portal](https://portal.azure.com).
+An intermediate step is [building and populating the index](search-get-started-dotnet.md) before formulating a query. We mention this step here for completeness. One way to determine index availability is by checking the indexes list in the [portal](https://portal.azure.com).
 
 ## Constrain the query and trim results

@@ -67,7 +85,7 @@ Parameters on the query are used to limit search to specific fields and then tri
 
 Given a goal of constraining search to fields containing French strings, you would use **searchFields** to target the query at fields containing strings in that language.
 
-Specifying the analyzer on a query request is not necessary. A language analyzer on the field definition will always be used during query processing. For queries that specify multiple fields invoking different language analyzers, the terms or phrases will be processed independently by the assigned analyzers for each field.
+Specifying the analyzer on a query request isn't necessary. A language analyzer on the field definition will always be used during query processing. For queries that specify multiple fields invoking different language analyzers, the terms or phrases will be processed independently by the assigned analyzers for each field.
 
 By default, a search returns all fields that are marked as retrievable. As such, you might want to exclude fields that don't conform to the language-specific search experience you want to provide. Specifically, if you limited search to a field with French strings, you probably want to exclude fields with English strings from your results. Using the **$select** query parameter gives you control over which fields are returned to the calling application.
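Putting the two parameters together, a query request body might look like this sketch (posted to the index's `docs/search` endpoint; the field names `description_fr` and `listingId` are hypothetical):

```json
{
  "search": "plage",
  "searchFields": "description_fr",
  "select": "listingId, description_fr",
  "queryType": "simple"
}
```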

@@ -111,7 +129,7 @@ private static void RunQueries(SearchClient srchclient)
 
 ## Boost language-specific fields
 
-Sometimes the language of the agent issuing a query is not known, in which case the query can be issued against all fields simultaneously. IA preference for results in a certain language can be defined using [scoring profiles](index-add-scoring-profiles.md). In the example below, matches found in the description in English will be scored higher relative to matches in other languages:
+Sometimes the language of the agent issuing a query isn't known, in which case the query can be issued against all fields simultaneously. A preference for results in a certain language can be defined using [scoring profiles](index-add-scoring-profiles.md). In the example below, matches found in the description in English will be scored higher relative to matches in other languages:
 
 ```JSON
 "scoringProfiles": [
