Commit 8ad1ce9 ("edits")

1 parent ad03a73 commit 8ad1ce9
1 file changed: 9 additions, 6 deletions

articles/ai-services/openai/how-to/integrate-synapseml.md

@@ -8,7 +8,7 @@ ms.service: cognitive-services
 ms.subservice: openai
 ms.custom: build-2023, build-2023-dataai
 ms.topic: how-to
-ms.date: 08/29/2023
+ms.date: 08/30/2023
 author: ChrisHMSFT
 ms.author: chrhoder
 recommendations: false
@@ -30,7 +30,7 @@ This tutorial shows how to apply large language models at a distributed scale by

 - An Apache Spark cluster with SynapseML installed.
 - Create a [serverless Apache Spark pool](../../../synapse-analytics/get-started-analyze-spark.md#create-a-serverless-apache-spark-pool).
-- To install SynapseML for your Apache Spark cluster, see [Install SynapseML](#step-3-install-synapseml).
+- To install SynapseML for your Apache Spark cluster, see [Install SynapseML](#install-synapseml).

 > [!NOTE]
 > Currently, you must submit an application to access Azure OpenAI Service. To apply for access, complete <a href="https://aka.ms/oai/access" target="_blank">this form</a>. If you need assistance, open an issue on this repo to contact Microsoft.
@@ -42,8 +42,11 @@ We recommend that you [create an Azure Synapse workspace](../../../synapse-analy
 To use the example code in this article with your Apache Spark cluster, complete the following steps:

 1. Prepare a new or existing notebook.
+
 1. Connect your Apache Spark cluster with your notebook.
+
 1. Install SynapseML for your Apache Spark cluster in your notebook.
+
 1. Configure the notebook to work with your Azure OpenAI service resource.

 ### Prepare your notebook
@@ -52,7 +55,7 @@ You can create a new notebook in your Apache Spark platform, or you can import a

 - To use a notebook in Azure Synapse Analytics, see [Create, develop, and maintain Synapse notebooks in Azure Synapse Analytics](../../../synapse-analytics/spark/apache-spark-development-using-notebooks.md).

-- To use a notebook in Azure Databricks, see [Manage notebooks for Azure Databricks](/azure/databricks/notebooks/notebooks-manage.md).
+- To use a notebook in Azure Databricks, see [Manage notebooks for Azure Databricks](/azure/databricks/notebooks/notebooks-manage).

 - (Optional) Download [this demonstration notebook](https://github.com/microsoft/SynapseML/blob/master/docs/Explore%20Algorithms/OpenAI/OpenAI.ipynb) and connect it with your workspace. During the download process, select **Raw**, and then save the file.

@@ -143,7 +146,7 @@ df = spark.createDataFrame(

 To apply the Azure OpenAI Completion service to the dataframe, create an `OpenAICompletion` object that serves as a distributed client. Parameters of the service can be set either with a single value, or by a column of the dataframe with the appropriate setters on the `OpenAICompletion` object.

-In this example, you set the `maxTokens` parameter to 200. A token is around four characters, and this limit applies to the sum of the prompt and the result. You also set the `promptCol` parameter with the name of the prompt column in the dataframe.
+In this example, you set the `maxTokens` parameter to 200. A token is around four characters, and this limit applies to the sum of the prompt and the result. You also set the `promptCol` parameter with the name of the prompt column in the dataframe, such as **prompt**.

 ```python
 from synapse.ml.cognitive import OpenAICompletion
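The revised paragraph above keeps the rule of thumb that a token is around four characters and that `maxTokens` caps the prompt and the result together. A minimal sketch of that budgeting arithmetic, in plain Python (the `estimate_tokens` helper is illustrative, not part of SynapseML, and real tokenizers will differ from the 4-characters-per-token heuristic):

```python
def estimate_tokens(text: str, chars_per_token: int = 4) -> int:
    """Rough token count using the ~4 characters/token rule of thumb."""
    return max(1, round(len(text) / chars_per_token))

# With maxTokens set to 200, the prompt and the completion share one budget.
max_tokens = 200
prompt = "Once upon a time"
tokens_left_for_completion = max_tokens - estimate_tokens(prompt)
```

Because the limit applies to the sum, a longer prompt leaves fewer tokens for the generated completion.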
@@ -184,7 +187,7 @@ Here are some other use cases for working with Azure OpenAI Service and large da

 You can use Azure OpenAI Service with large datasets to improve throughput with request batching. In the previous example, you make several requests to the service, one for each prompt. To complete multiple prompts in a single request, you can use batch mode.

-In the `OpenAICompletion` object, instead of setting the **Prompt** column to `"prompt"`, you can specify `"batchPrompt"` to create the **batchPrompt** column. To support this method, create a dataframe with a list of prompts per row.
+In the `OpenAICompletion` object definition, you specify the `"batchPrompt"` value to configure the dataframe to use a **batchPrompt** column. Create the dataframe with a list of prompts for each row.

 > [!NOTE]
 > There's currently a limit of 20 prompts in a single request and a limit of 2048 tokens, or approximately 1500 words.
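The note above caps a single batch request at 20 prompts, so a flat list of prompts has to be grouped into rows before it becomes the **batchPrompt** column, where each row holds an `Array[String]`. A hypothetical pure-Python helper (`to_batch_rows` is illustrative, not a SynapseML API) sketches that grouping:

```python
def to_batch_rows(prompts, batch_size=20):
    """Split a flat list of prompts into rows of at most batch_size prompts each."""
    return [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]

rows = to_batch_rows([f"prompt {i}" for i in range(45)])
# 45 prompts become rows of 20, 20, and 5; each row would be one
# Array[String] cell in the batchPrompt column of the dataframe.
```

Each resulting row then maps to one request against the service, rather than one request per prompt.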
@@ -198,7 +201,7 @@ batch_df = spark.createDataFrame(
 ).toDF("batchPrompt")
 ```

-Next, create the `OpenAICompletion` object. Rather than setting the `"prompt"` column, set the `"batchPrompt"` column if your column is of type `Array[String]`.
+Next, create the `OpenAICompletion` object. If your column is of type `Array[String]`, set the `batchPromptCol` value for the column heading, rather than the `promptCol` value.

 ```python
 batch_completion = (