
Commit ad03a73: review edits
1 parent 5427f2a

1 file changed (+28 -23 lines changed)

articles/ai-services/openai/how-to/integrate-synapseml.md

@@ -16,21 +16,26 @@ recommendations: false

 # Use Azure OpenAI with large datasets

-Azure OpenAI can be used to solve a large number of natural language tasks through prompting the completion API. To make it easier to scale your prompting workflows from a few examples to large datasets of examples, Azure OpenAI Service is integrated with the distributed machine learning library [SynapseML](https://www.microsoft.com/research/blog/synapseml-a-simple-multilingual-and-massively-parallel-machine-learning-library/). This integration makes it easy to use the [Apache Spark](https://spark.apache.org/) distributed computing framework to process millions of prompts with Azure OpenAI Service. This tutorial shows how to apply large language models at a distributed scale by using Azure OpenAI and Azure Synapse Analytics.
+Azure OpenAI can be used to solve a large number of natural language tasks through prompting the completion API. To make it easier to scale your prompting workflows from a few examples to large datasets of examples, Azure OpenAI Service is integrated with the distributed machine learning library [SynapseML](https://www.microsoft.com/research/blog/synapseml-a-simple-multilingual-and-massively-parallel-machine-learning-library/). This integration makes it easy to use the [Apache Spark](https://spark.apache.org/) distributed computing framework to process millions of prompts with Azure OpenAI Service.
+
+This tutorial shows how to apply large language models at a distributed scale by using Azure OpenAI and Azure Synapse Analytics.

 ## Prerequisites

 - An Azure subscription. <a href="https://azure.microsoft.com/free/cognitive-services" target="_blank">Create one for free</a>.
-- Access granted to Azure OpenAI in the desired Azure subscription.
+
+- Access granted to Azure OpenAI in your Azure subscription.
+
 - An Azure OpenAI resource. [Create a resource](create-resource.md?pivots=web-portal#create-a-resource).
+
 - An Apache Spark cluster with SynapseML installed.
   - Create a [serverless Apache Spark pool](../../../synapse-analytics/get-started-analyze-spark.md#create-a-serverless-apache-spark-pool).
   - To install SynapseML for your Apache Spark cluster, see [Install SynapseML](#step-3-install-synapseml).

 > [!NOTE]
 > Currently, you must submit an application to access Azure OpenAI Service. To apply for access, complete <a href="https://aka.ms/oai/access" target="_blank">this form</a>. If you need assistance, open an issue on this repo to contact Microsoft.

-Microsoft recommends that you [create an Azure Synapse workspace](../../../synapse-analytics/get-started-create-workspace.md). However, you can also use Azure Databricks, Azure HDInsight, Spark on Kubernetes, or the Python environment with the `pyspark` package.
+We recommend that you [create an Azure Synapse workspace](../../../synapse-analytics/get-started-create-workspace.md). However, you can also use Azure Databricks, Azure HDInsight, Spark on Kubernetes, or the Python environment with the `pyspark` package.

 ## Use example code as a notebook

@@ -41,25 +46,23 @@ To use the example code in this article with your Apache Spark cluster, complete
 1. Install SynapseML for your Apache Spark cluster in your notebook.
 1. Configure the notebook to work with your Azure OpenAI service resource.

-### Step 1: Prepare your notebook
-
-You can create a new notebook in your Apache Spark platform, or you can download an existing notebook and import it into Azure Synapse. You can add each snippet of example code in this article as a new cell in your notebook.
+### Prepare your notebook

-#### (Optional) Download demonstration notebook
+You can create a new notebook in your Apache Spark platform, or you can import an existing notebook. After you have a notebook in place, you can add each snippet of example code in this article as a new cell in your notebook.

-As an option, you can download a demonstration notebook and connect it with your workspace.
+- To use a notebook in Azure Synapse Analytics, see [Create, develop, and maintain Synapse notebooks in Azure Synapse Analytics](../../../synapse-analytics/spark/apache-spark-development-using-notebooks.md).

-1. Download [this demonstration notebook](https://github.com/microsoft/SynapseML/blob/master/docs/Explore%20Algorithms/OpenAI/OpenAI.ipynb). During the download process, select **Raw**, and then save the file.
+- To use a notebook in Azure Databricks, see [Manage notebooks for Azure Databricks](/azure/databricks/notebooks/notebooks-manage.md).

-1. Import the notebook [into the Synapse Workspace](../../../synapse-analytics/spark/apache-spark-development-using-notebooks.md#create-a-notebook), or if you're using Azure Databricks, import the notebook [into the Azure Databricks Workspace](/azure/databricks/notebooks/notebooks-manage#create-a-notebook).
+- (Optional) Download [this demonstration notebook](https://github.com/microsoft/SynapseML/blob/master/docs/Explore%20Algorithms/OpenAI/OpenAI.ipynb) and connect it with your workspace. During the download process, select **Raw**, and then save the file.

-### Step 2: Connect your cluster
+### Connect your cluster

 When you have a notebook ready, connect or _attach_ your notebook to an Apache Spark cluster.

-### Step 3: Install SynapseML
+### Install SynapseML

-To run the exercises, you need to install SynapseML on your Apache Spark cluster. For more information about the installation process, see the link for Azure Synapse at the bottom of [the SynapseML website](https://microsoft.github.io/SynapseML/).
+To run the exercises, you need to install SynapseML on your Apache Spark cluster. For more information, see [Install SynapseML](https://microsoft.github.io/SynapseML/docs/Get%20Started/Install%20SynapseML/) on the [SynapseML website](https://microsoft.github.io/SynapseML/).

 To install SynapseML, create a new cell at the top of your notebook and run the following code.
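The installation cell itself is elided from this hunk. For Azure Synapse, the SynapseML site documents a `%%configure` cell along these lines; the Maven version shown here is an assumption, so check the SynapseML site for the current release coordinates:

```
%%configure -f
{
  "name": "synapseml",
  "conf": {
      "spark.jars.packages": "com.microsoft.azure:synapseml_2.12:0.11.4",
      "spark.jars.repositories": "https://mmlspark.azureedge.net/maven",
      "spark.jars.excludes": "org.scala-lang:scala-reflect,org.apache.spark:spark-tags_2.12,org.scalactic:scalactic_2.12,org.scalatest:scalatest_2.12,com.fasterxml.jackson.core:jackson-databind",
      "spark.yarn.user.classpath.first": "true"
  }
}
```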

@@ -98,7 +101,7 @@ To install SynapseML, create a new cell at the top of your notebook and run the

 The connection process can take several minutes.

-### Step 4: Configure the notebook
+### Configure the notebook

 Create a new code cell and run the following code to configure the notebook for your service. Set the `resource_name`, `deployment_name`, `location`, and `key` variables to the corresponding values for your Azure OpenAI resource.
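The configuration cell is elided from this hunk. A minimal sketch of the four variables the paragraph names, with placeholder values (all hypothetical), might look like:

```python
# Placeholder values -- all hypothetical; replace with your own resource details.
resource_name = "my-openai-resource"   # Azure OpenAI resource name
deployment_name = "my-gpt-deployment"  # model deployment name
location = "eastus"                    # Azure region of the resource
key = "<your-resource-key>"            # key from the Azure portal

# Azure OpenAI endpoints follow this format, built from the resource name.
service_url = f"https://{resource_name}.openai.azure.com/"
print(service_url)
```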

@@ -138,7 +141,9 @@ df = spark.createDataFrame(

 ## Create the OpenAICompletion Apache Spark client

-To apply the Azure OpenAI Completion service to the dataframe, create an `OpenAICompletion` object that serves as a distributed client. Parameters of the service can be set either with a single value, or by a column of the dataframe with the appropriate setters on the `OpenAICompletion` object. In this example, you set the `maxTokens` parameter to 200. A token is around four characters, and this limit applies to the sum of the prompt and the result. You also set the `promptCol` parameter with the name of the prompt column in the dataframe.
+To apply the Azure OpenAI Completion service to the dataframe, create an `OpenAICompletion` object that serves as a distributed client. Parameters of the service can be set either with a single value, or by a column of the dataframe with the appropriate setters on the `OpenAICompletion` object.
+
+In this example, you set the `maxTokens` parameter to 200. A token is around four characters, and this limit applies to the sum of the prompt and the result. You also set the `promptCol` parameter with the name of the prompt column in the dataframe.
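The four-characters-per-token rule of thumb above can be sketched in plain Python (the helper name is ours, not part of the article):

```python
def estimate_tokens(text: str) -> int:
    """Rough token count using the ~4 characters per token rule of thumb."""
    return max(1, len(text) // 4)

max_tokens = 200  # the maxTokens value used in the example
prompt = "Hello my name is"
# The limit covers prompt plus completion, so this is roughly the budget
# left for the generated text.
budget_for_completion = max_tokens - estimate_tokens(prompt)
print(budget_for_completion)  # 196
```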

 ```python
 from synapse.ml.cognitive import OpenAICompletion
@@ -157,7 +162,7 @@ completion = (

 ## Transform the dataframe with the OpenAICompletion client

-After you have the dataframe and completion client, you can transform your input dataset and add a column called `completions` with all of the information the service adds. In this example, you select only the text for simplicity.
+After you have the dataframe and completion client, you can transform your input dataset and add a column called `completions` with all of the information the service adds. In this example, select only the text for simplicity.
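As a plain-Python picture of what "select only the text" means here (the sample rows are hypothetical, but the nesting mirrors the `completions.choices.text` column the transform adds):

```python
# Hypothetical transformed rows: each `completions` struct holds a list of
# choices, and we keep only the first choice's text -- the plain-Python
# analogue of col("completions.choices.text").getItem(0).
rows = [
    {"prompt": "Hello my name is",
     "completions": {"choices": [{"text": " ...(generated text)..."}]}},
    {"prompt": "The best code is code that's",
     "completions": {"choices": [{"text": " readable"}]}},
]
texts = [r["completions"]["choices"][0]["text"] for r in rows]
print(texts)
```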

 ```python
 from pyspark.sql.functions import col
@@ -167,22 +172,22 @@ display(completed_df.select(
     col("prompt"), col("error"), col("completions.choices.text").getItem(0).alias("text")))
 ```

-The following image shows example output with completions in Azure Synapse Analytics Studio. Keep in mind that completions text can vary so your output might look different.
+The following image shows example output with completions in Azure Synapse Analytics Studio. Keep in mind that completions text can vary. Your output might look different.

 :::image type="content" source="../media/how-to/synapse-studio-transform-dataframe-output.png" alt-text="Screenshot that shows sample completions in Azure Synapse Analytics Studio." border="false":::

 ## Explore other usage scenarios

-Let's review some other use case scenarios for working with Azure OpenAI Service and large datasets.
+Here are some other use cases for working with Azure OpenAI Service and large datasets.

 ### Improve throughput with request batching

 You can use Azure OpenAI Service with large datasets to improve throughput with request batching. In the previous example, you make several requests to the service, one for each prompt. To complete multiple prompts in a single request, you can use batch mode.

-In the `OpenAICompletion` object, instead of setting the **Prompt** column to "prompt," you can specify "batchPrompt" to create the **BatchPrompt** column. To support this method, you create a dataframe with a list of prompts per row.
+In the `OpenAICompletion` object, instead of setting the **Prompt** column to `"prompt"`, you can specify `"batchPrompt"` to create the **batchPrompt** column. To support this method, create a dataframe with a list of prompts per row.

 > [!NOTE]
-> There's currently a limit of 20 prompts in a single request and a limit of 2048 "tokens," or approximately 1500 words.
+> There's currently a limit of 20 prompts in a single request and a limit of 2048 tokens, or approximately 1500 words.

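The 20-prompt cap means a long prompt list has to be chunked into rows of at most 20 before building the batch dataframe. A plain-Python sketch of that chunking (the helper name is ours):

```python
def chunk_prompts(prompts, batch_size=20):
    """Split a flat prompt list into batches of at most batch_size,
    matching the current 20-prompts-per-request limit."""
    return [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]

batches = chunk_prompts([f"prompt {i}" for i in range(45)])
print([len(b) for b in batches])  # [20, 20, 5]
```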
 ```python
 batch_df = spark.createDataFrame(
@@ -193,7 +198,7 @@ batch_df = spark.createDataFrame(
 ).toDF("batchPrompt")
 ```

-Next, you create the `OpenAICompletion` object. Rather than setting the "prompt" column, you set the "batchPrompt" column if your column is of type `Array[String]`.
+Next, create the `OpenAICompletion` object. Rather than setting the `"prompt"` column, set the `"batchPrompt"` column if your column is of type `Array[String]`.

 ```python
 batch_completion = (
@@ -220,7 +225,7 @@ The following image shows example output with completions for multiple prompts i
 :::image type="content" source="../media/how-to/synapse-studio-request-batch-output.png" alt-text="Screenshot that shows completions for multiple prompts in a single request in Azure Synapse Analytics Studio." border="false":::

 > [!NOTE]
-> There's currently a limit of 20 prompts in a single request and a limit of 2048 "tokens," or approximately 1500 words.
+> There's currently a limit of 20 prompts in a single request and a limit of 2048 tokens, or approximately 1500 words.

 ### Use an automatic mini-batcher
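The body of this section is truncated in the hunk. Conceptually, an automatic mini-batcher groups a stream of single-prompt rows into fixed-size batches before they reach the completion client. A plain-Python sketch of that grouping (not SynapseML's actual transformer API):

```python
from typing import Iterable, Iterator, List

def mini_batches(rows: Iterable[str], batch_size: int) -> Iterator[List[str]]:
    """Yield fixed-size mini-batches from a row stream, flushing the remainder."""
    batch: List[str] = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch  # final partial batch
```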

@@ -246,7 +251,7 @@ The following image shows example output for an automatic mini-batcher that tran

 ### Prompt engineering for translation

-Azure OpenAI can solve many different natural language tasks through [prompt engineering](completions.md). In this example, you can prompt for language translation:
+Azure OpenAI can solve many different natural language tasks through _prompt engineering_. For more information, see [Learn how to generate or manipulate text](completions.md). In this example, you can prompt for language translation:
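The prompts in the translation dataframe are elided here. The pattern is to embed the instruction in the prompt text itself; a sketch of one way to build such prompts (the template wording is an assumption, not from the article):

```python
def translation_prompt(text: str, target_language: str) -> str:
    """Build a completion-style prompt that asks for a translation."""
    return f"Translate the following text to {target_language}: {text}"

prompt = translation_prompt("Hello, how are you?", "French")
print(prompt)
```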

 ```python
 translate_df = spark.createDataFrame(
