MicrosoftDocs
diff --git a/articles/search/.openpublishing.redirection.search.json b/articles/search/.openpublishing.redirection.search.json
Lines changed: 5 additions & 0 deletions
diff --git a/articles/search/TOC.yml b/articles/search/TOC.yml
Lines changed: 6 additions & 10 deletions
diff --git a/articles/search/media/search-synapseml-cognitive-services/create-notebook.png b/articles/search/media/search-synapseml-cognitive-services/create-notebook.png
-12.4 KB
diff --git a/articles/search/media/search-synapseml-cognitive-services/install-library-from-maven.png b/articles/search/media/search-synapseml-cognitive-services/install-library-from-maven.png
37.8 KB
diff --git a/articles/search/samples-dotnet.md b/articles/search/samples-dotnet.md
Lines changed: 0 additions & 1 deletion
diff --git a/articles/search/search-synapseml-cognitive-services.md b/articles/search/search-synapseml-cognitive-services.md
Lines changed: 29 additions & 27 deletions
@@ -1,5 +1,10 @@
 {
     "redirections": [
+        {
+            "source_path_from_root": "/articles/search/search-synonyms-tutorial-sdk.md",
+            "redirect_url": "https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/search/Azure.Search.Documents/samples/Sample02_Service.md#create-a-synonym-map",
+            "redirect_document_id": false
+        },
         {
             "source_path_from_root": "/articles/search/search-case-studies.md",
             "redirect_url": "https://azure.microsoft.com/case-studies",
 
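The redirection entry added above is what keeps links to the removed `search-synonyms-tutorial-sdk.md` article from breaking. The following Python sketch shows one way to confirm that a removed path has a redirect. It assumes the entry is copied verbatim from the hunk above, and `find_redirect` is a hypothetical helper name, not part of any publishing tool.

```python
import json

# The single entry added by this change, copied from the hunk above.
REDIRECTIONS = json.loads("""
{
    "redirections": [
        {
            "source_path_from_root": "/articles/search/search-synonyms-tutorial-sdk.md",
            "redirect_url": "https://github.com/Azure/azure-sdk-for-net/blob/main/sdk/search/Azure.Search.Documents/samples/Sample02_Service.md#create-a-synonym-map",
            "redirect_document_id": false
        }
    ]
}
""")


def find_redirect(redirections, removed_path):
    """Return the redirect URL for a removed article, or None if no entry exists.

    Hypothetical helper for illustration only.
    """
    for entry in redirections["redirections"]:
        if entry["source_path_from_root"] == removed_path:
            return entry["redirect_url"]
    return None


removed = "/articles/search/search-synonyms-tutorial-sdk.md"
print(find_redirect(REDIRECTIONS, removed))
```

A check like this could be run for every article that a change deletes from `TOC.yml`, so that a missing redirect is caught before publishing.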
@@ -346,13 +346,15 @@
       href: semantic-how-to-query-request.md
     - name: Typeahead query
       href: search-add-autocomplete-suggestions.md
-    - name: Use simple syntax (examples)
+    - name: Query examples (simple syntax)
       href: search-query-simple-examples.md
-    - name: Add spell check to queries
+    - name: Add spell check
       href: speller-how-to-add.md
-    - name: Configure a suggester for typeahead
+    - name: Add synonyms
+      href: search-synonyms.md
+    - name: Add a suggester for typeahead
       href: index-add-suggesters.md
-    - name: Design a multi-language index
+    - name: Design a multilingual index
       href: search-language-support.md
     - name: Model complex data types
       href: search-howto-complex-data-types.md
@@ -366,12 +368,6 @@
         href: index-add-language-analyzers.md
       - name: Add a custom analyzer
         href: index-add-custom-analyzers.md
-    - name: Synonyms
-      items:
-      - name: Add synonyms
-        href: search-synonyms.md
-      - name: Synonyms C# example
-        href: search-synonyms-tutorial-sdk.md
     - name: Filters
       items:
       - name: Filters in text queries
 
@@ -58,7 +58,6 @@ Code samples from the Azure AI Search team demonstrate features and workflows. A
 | [multiple-data-sources](https://github.com/Azure-Samples/azure-search-dotnet-scale/tree/main/multiple-data-sources)  | [Tutorial: Index from multiple data sources](tutorial-multiple-data-sources.md). | Merges content from two data sources into one search index.
 | [Optimize-data-indexing](https://github.com/Azure-Samples/azure-search-dotnet-scale/tree/main/optimize-data-indexing) | [Tutorial: Optimize indexing with the push API](tutorial-optimize-indexing-push-api.md).| Demonstrates optimization techniques for pushing data into a search index. |
 | [DotNetHowTo](https://github.com/Azure-Samples/search-dotnet-getting-started/tree/master/DotNetHowTo)  | [How to use the .NET client library](search-howto-dotnet-sdk.md) | Steps through the basic workflow, but in more detail and with discussion of API usage.  |
-| [DotNetHowToSynonyms](https://github.com/Azure-Samples/search-dotnet-getting-started/tree/master/DotNetHowToSynonyms)  | [Example: Add synonyms in C#](search-synonyms-tutorial-sdk.md) | Synonym lists are used for query expansion, providing matchable  terms that are external to an index. |
 | [DotNetToIndexers](https://github.com/Azure-Samples/search-dotnet-getting-started/tree/master/DotNetHowToIndexers) | [Tutorial: Index Azure SQL data](search-indexer-tutorial.md) | Shows how to configure an Azure SQL indexer that has a schedule, field mappings, and parameters.  |
 | [DotNetHowToEncryptionUsingCMK](https://github.com/Azure-Samples/search-dotnet-getting-started/tree/master/DotNetHowToEncryptionUsingCMK)  | [How to configure customer-managed keys for data encryption](search-security-manage-encryption-keys.md) | Shows how to create objects that are encrypted with a Customer Key. |
 | [DotNetVectorDemo](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet/DotNetVectorDemo)  | [readme](https://github.com/Azure/azure-search-vector-samples/tree/main/demo-dotnet/DotNetVectorDemo/readme.md) | Create, load, and query a vector index. |
 
@@ -1,7 +1,7 @@
 ---
 title: 'Tutorial: Index at scale (Spark)'
 titleSuffix: Azure AI Search
-description: Search big data from Apache Spark that's been transformed by SynapseML. You'll load invoices into data frames, apply machine learning, and then send output to a generated search index.
+description: Search big data from Apache Spark that's been transformed by SynapseML. Load invoices into data frames, apply machine learning, and then send output to a generated search index.
 
 manager: nitinme
 author: HeidiSteen
@@ -10,12 +10,12 @@ ms.service: cognitive-search
 ms.custom:
   - ignite-2023
 ms.topic: tutorial
-ms.date: 02/01/2023
+ms.date: 04/22/2024
 ---
 
 # Tutorial: Index large data from Apache Spark using SynapseML and Azure AI Search
 
-In this Azure AI Search tutorial, learn how to index and query large data loaded from a Spark cluster. You'll set up a Jupyter Notebook that performs the following actions:
+In this Azure AI Search tutorial, learn how to index and query large data loaded from a Spark cluster. Set up a Jupyter Notebook that performs the following actions:
 
 > [!div class="checklist"]
 > + Load various forms (invoices) into a data frame in an Apache Spark session
@@ -24,7 +24,7 @@ In this Azure AI Search tutorial, learn how to index and query large data loaded
 > + Write the output to a search index hosted in Azure AI Search
 > + Explore and query over the content you created
 
-This tutorial takes a dependency on [SynapseML](https://www.microsoft.com/research/blog/synapseml-a-simple-multilingual-and-massively-parallel-machine-learning-library/), an open source library that supports massively parallel machine learning over big data. In SynapseML, search indexing and machine learning are exposed through *transformers* that perform specialized tasks. Transformers tap into a wide range of AI capabilities. In this exercise, you'll use the **AzureSearchWriter** APIs for analysis and AI enrichment.
+This tutorial takes a dependency on [SynapseML](https://www.microsoft.com/research/blog/synapseml-a-simple-multilingual-and-massively-parallel-machine-learning-library/), an open source library that supports massively parallel machine learning over big data. In SynapseML, search indexing and machine learning are exposed through *transformers* that perform specialized tasks. Transformers tap into a wide range of AI capabilities. In this exercise, use the **AzureSearchWriter** APIs for analysis and AI enrichment.
 
 Although Azure AI Search has native [AI enrichment](cognitive-search-concept-intro.md), this tutorial shows you how to access AI capabilities outside of Azure AI Search. By using SynapseML instead of indexers or skills, you're not subject to data limits or other constraints associated with those objects.
 
@@ -33,7 +33,7 @@ Although Azure AI Search has native [AI enrichment](cognitive-search-concept-int
 
 ## Prerequisites
 
-You'll need the `synapseml` library and several Azure resources. If possible, use the same subscription and region for your Azure resources and put everything into one resource group for simple cleanup later. The following links are for portal installs. The sample data is imported from a public site.
+You need the `synapseml` library and several Azure resources. If possible, use the same subscription and region for your Azure resources and put everything into one resource group for simple cleanup later. The following links are for portal installs. The sample data is imported from a public site.
 
 + [SynapseML package](https://microsoft.github.io/SynapseML/docs/Get%20Started/Install%20SynapseML/#python) <sup>1</sup> 
 + [Azure AI Search](search-create-service-portal.md) (any tier) <sup>2</sup> 
@@ -42,38 +42,38 @@ You'll need the `synapseml` library and several Azure resources. If possible, us
 
 <sup>1</sup> This link resolves to a tutorial for loading the package.
 
-<sup>2</sup> You can use the free search tier to index the sample data, but [choose a higher tier](search-sku-tier.md) if your data volumes are large. For non-free tiers, you'll need to provide the [search API key](search-security-api-keys.md#find-existing-keys) in the [Set up dependencies](#2---set-up-dependencies) step further on.
+<sup>2</sup> You can use the free search tier to index the sample data, but [choose a higher tier](search-sku-tier.md) if your data volumes are large. For billable tiers, provide the [search API key](search-security-api-keys.md#find-existing-keys) in the [Set up dependencies](#step-2-set-up-dependencies) step further on.
 
-<sup>3</sup> This tutorial uses Azure AI Document Intelligence and Azure AI Translator. In the instructions that follow, you'll provide a [multi-service key](../ai-services/multi-service-resource.md?pivots=azportal#get-the-keys-for-your-resource) and the region, and it will work for both services.
+<sup>3</sup> This tutorial uses Azure AI Document Intelligence and Azure AI Translator. In the instructions that follow, provide a [multi-service key](../ai-services/multi-service-resource.md?pivots=azportal#get-the-keys-for-your-resource) and the region. The same key works for both services.
 
-<sup>4</sup> In this tutorial, Azure Databricks provides the Spark computing platform and the instructions in the link will tell you how to set up the workspace. For this tutorial, we used the portal steps in "Create a workspace".
+<sup>4</sup> In this tutorial, Azure Databricks provides the Spark computing platform. We used the [portal instructions](/azure/databricks/scenarios/quickstart-create-databricks-workspace-portal?tabs=azure-portal) to set up the workspace.
 
 > [!NOTE]
 > All of the above Azure resources support security features in the Microsoft Identity platform. For simplicity, this tutorial assumes key-based authentication, using endpoints and keys copied from the portal pages of each service. If you implement this workflow in a production environment, or share the solution with others, remember to replace hard-coded keys with integrated security or encrypted keys.
 
-## 1 - Create a Spark cluster and notebook
+## Step 1: Create a Spark cluster and notebook
 
-In this section, you'll create a cluster, install the `synapseml` library, and create a notebook to run the code.
+In this section, create a cluster, install the `synapseml` library, and create a notebook to run the code.
 
 1. In the Azure portal, find your Azure Databricks workspace and select **Launch workspace**.
 
 1. On the left menu, select **Compute**.
 
-1. Select **Create cluster**.
+1. Select **Create compute**.
 
-1. Give the cluster a name, accept the default configuration, and then create the cluster. It takes several minutes to create the cluster.
+1. Accept the default configuration. It takes several minutes to create the cluster.
 
 1. Install the `synapseml` library after the cluster is created:
 
-   1. Select **Library** from the tabs at the top of the cluster's page.
+   1. Select **Libraries** from the tabs at the top of the cluster's page.
 
    1. Select **Install new**.
 
       :::image type="content" source="media/search-synapseml-cognitive-services/install-library.png" alt-text="Screenshot of the Install New command." border="true":::
 
    1. Select **Maven**.
 
-   1. In Coordinates, enter `com.microsoft.azure:synapseml_2.12:0.10.0`
+   1. In Coordinates, enter `com.microsoft.azure:synapseml_2.12:1.0.4`
 
    1. Select **Install**.
 
@@ -85,13 +85,15 @@ In this section, you'll create a cluster, install the `synapseml` library, and c
 
 1. Give the notebook a name, select **Python** as the default language, and select the cluster that has the `synapseml` library.
 
-1. Create seven consecutive cells. You'll paste code into each one.
+1. Create seven consecutive cells. Paste code into each one.
 
    :::image type="content" source="media/search-synapseml-cognitive-services/create-seven-cells.png" alt-text="Screenshot of the notebook with placeholder cells." border="true":::
 
-## 2 - Set up dependencies
+## Step 2: Set up dependencies
 
-Paste the following code into the first cell of your notebook. Replace the placeholders with endpoints and access keys for each resource. No other modifications are required, so run the code when you're ready.
+Paste the following code into the first cell of your notebook. 
+
+Replace the placeholders with endpoints and access keys for each resource. Provide a name for a new search index. No other modifications are required, so run the code when you're ready.
 
 This code imports multiple packages and sets up access to the Azure resources used in this workflow.
 
@@ -109,11 +111,11 @@ search_key = "placeholder-search-service-api-key"
 search_index = "placeholder-search-index-name"
 ```
 
-## 3 - Load data into Spark
+## Step 3: Load data into Spark
 
 Paste the following code into the second cell. No modifications are required, so run the code when you're ready.
 
-This code loads a few external files from an Azure storage account that's used for demo purposes. The files are various invoices, and they're read into a data frame.
+This code loads a few external files from an Azure storage account. The files are various invoices, and they're read into a data frame.
 
 ```python
 def blob_to_url(blob):
@@ -135,11 +137,11 @@ df2 = (spark.read.format("binaryFile")
 display(df2)
 ```
 
-## 4 - Add document intelligence
+## Step 4: Add document intelligence
 
 Paste the following code into the third cell. No modifications are required, so run the code when you're ready.
 
-This code loads the [AnalyzeInvoices transformer](https://mmlspark.blob.core.windows.net/docs/0.11.2/pyspark/synapse.ml.cognitive.form.html#module-synapse.ml.cognitive.form.AnalyzeInvoices) and passes a reference to the data frame containing the invoices. It calls the pre-built [invoice model](../ai-services/document-intelligence/concept-invoice.md) of Azure AI Document Intelligence to extract information from the invoices.
+This code loads the [AnalyzeInvoices transformer](https://mmlspark.blob.core.windows.net/docs/0.11.2/pyspark/synapse.ml.cognitive.form.html#module-synapse.ml.cognitive.form.AnalyzeInvoices) and passes a reference to the data frame containing the invoices. It calls the prebuilt [invoice model](../ai-services/document-intelligence/concept-invoice.md) of Azure AI Document Intelligence to extract information from the invoices.
 
 ```python
 from synapse.ml.cognitive import AnalyzeInvoices
@@ -161,7 +163,7 @@ The output from this step should look similar to the next screenshot. Notice how
 
 :::image type="content" source="media/search-synapseml-cognitive-services/analyze-forms-output.png" alt-text="Screenshot of the AnalyzeInvoices output." border="true":::
 
-## 5 - Restructure document intelligence output
+## Step 5: Restructure document intelligence output
 
 Paste the following code into the fourth cell and run it. No modifications are required.
 
@@ -183,11 +185,11 @@ itemized_df = (FormOntologyLearner()
 display(itemized_df)
 ```
 
-Notice how this transformation recasts the nested fields into a table, which enables the next two transformations. This screenshot is trimmed for brevity. If you're following along in your own notebook, you'll have 19 columns and 26 rows.
+Notice how this transformation recasts the nested fields into a table, which enables the next two transformations. This screenshot is trimmed for brevity. If you're following along in your own notebook, you have 19 columns and 26 rows.
 
 :::image type="content" source="media/search-synapseml-cognitive-services/form-ontology-learner-output.png" alt-text="Screenshot of the FormOntologyLearner output." border="true":::
 
-## 6 - Add translations
+## Step 6: Add translations
 
 Paste the following code into the fifth cell. No modifications are required, so run the code when you're ready.
 
@@ -217,11 +219,11 @@ display(translated_df)
 > 
 > :::image type="content" source="media/search-synapseml-cognitive-services/translated-strings.png" alt-text="Screenshot of table output, showing the Translations column." border="true":::
 
-## 7 - Add a search index with AzureSearchWriter
+## Step 7: Add a search index with AzureSearchWriter
 
 Paste the following code in the sixth cell and then run it. No modifications are required.
 
-This code loads [AzureSearchWriter](https://microsoft.github.io/SynapseML/docs/Explore%20Algorithms/AI%20Services/Overview/#azure-cognitive-search-sample). It consumes a tabular dataset and infers a search index schema that defines one field for each column. The translations structure is an array, so it's articulated in the index as a complex collection with subfields for each language translation. The generated index will have a document key and use the default values for fields created using the [Create Index REST API](/rest/api/searchservice/create-index).
+This code loads [AzureSearchWriter](https://microsoft.github.io/SynapseML/docs/Explore%20Algorithms/AI%20Services/Overview/#azure-cognitive-search-sample). It consumes a tabular dataset and infers a search index schema that defines one field for each column. Because the translations structure is an array, it's articulated in the index as a complex collection with subfields for each language translation. The generated index has a document key and uses the default values for fields created using the [Create Index REST API](/rest/api/searchservice/create-index).
 
 ```python
 from synapse.ml.cognitive import *
@@ -242,7 +244,7 @@ You can check the search service pages in Azure portal to explore the index defi
 > [!NOTE]
 > If you can't use the default search index, you can provide an external custom definition in JSON, passing its URI as a string in the "indexJson" property. Generate the default index first so that you know which fields to specify, and then follow with customized properties if you need specific analyzers, for example.
 
-## 8 - Query the index
+## Step 8: Query the index
 
 Paste the following code into the seventh cell and then run it. No modifications are required, except that you might want to vary the syntax or try more examples to further explore your content:
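The diff context ends before the tutorial's query cell, so that code isn't shown here. As a rough illustration only, and not the tutorial's actual cell, the following Python sketch builds a request for the Search Documents REST API. It assumes the `2023-11-01` API version and a `search_service` variable holding the service name. The helper name `build_search_request` is hypothetical. The placeholder values mirror those in the setup cell.

```python
import json
import urllib.request


def build_search_request(search_service, search_index, search_key, query, top=5):
    """Build (without sending) a POST request for the Search Documents REST API.

    Hypothetical helper; the api-version value is an assumption.
    """
    url = (
        f"https://{search_service}.search.windows.net"
        f"/indexes/{search_index}/docs/search?api-version=2023-11-01"
    )
    # "search" is the query string, "top" limits the results, "count" asks for a total.
    body = json.dumps({"search": query, "top": top, "count": True}).encode("utf-8")
    headers = {"Content-Type": "application/json", "api-key": search_key}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


request = build_search_request(
    "placeholder-search-service-name",      # assumed variable, not shown in this diff
    "placeholder-search-index-name",
    "placeholder-search-service-api-key",
    "*",                                    # "*" matches every document in the index
)

# Sending is left commented out because the placeholders above are not real values:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["value"])
```

Building the request separately from sending it makes it easy to vary the `search` string and try more examples before running anything against a live service.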