articles/ai-services/openai/how-to/integrate-synapseml.md (+9 −6: 9 additions, 6 deletions)
@@ -8,7 +8,7 @@ ms.service: cognitive-services
 ms.subservice: openai
 ms.custom: build-2023, build-2023-dataai
 ms.topic: how-to
-ms.date: 08/29/2023
+ms.date: 08/30/2023
 author: ChrisHMSFT
 ms.author: chrhoder
 recommendations: false
@@ -30,7 +30,7 @@ This tutorial shows how to apply large language models at a distributed scale by
 
 - An Apache Spark cluster with SynapseML installed.
 - Create a [serverless Apache Spark pool](../../../synapse-analytics/get-started-analyze-spark.md#create-a-serverless-apache-spark-pool).
-- To install SynapseML for your Apache Spark cluster, see [Install SynapseML](#step-3-install-synapseml).
+- To install SynapseML for your Apache Spark cluster, see [Install SynapseML](#install-synapseml).
 
 > [!NOTE]
 > Currently, you must submit an application to access Azure OpenAI Service. To apply for access, complete <a href="https://aka.ms/oai/access" target="_blank">this form</a>. If you need assistance, open an issue on this repo to contact Microsoft.
@@ -42,8 +42,11 @@ We recommend that you [create an Azure Synapse workspace](../../../synapse-analy
 To use the example code in this article with your Apache Spark cluster, complete the following steps:
 
 1. Prepare a new or existing notebook.
+
 1. Connect your Apache Spark cluster with your notebook.
+
 1. Install SynapseML for your Apache Spark cluster in your notebook.
+
 1. Configure the notebook to work with your Azure OpenAI service resource.
 
 ### Prepare your notebook
@@ -52,7 +55,7 @@ You can create a new notebook in your Apache Spark platform, or you can import a
 
 - To use a notebook in Azure Synapse Analytics, see [Create, develop, and maintain Synapse notebooks in Azure Synapse Analytics](../../../synapse-analytics/spark/apache-spark-development-using-notebooks.md).
 
-- To use a notebook in Azure Databricks, see [Manage notebooks for Azure Databricks](/azure/databricks/notebooks/notebooks-manage.md).
+- To use a notebook in Azure Databricks, see [Manage notebooks for Azure Databricks](/azure/databricks/notebooks/notebooks-manage).
 
 - (Optional) Download [this demonstration notebook](https://github.com/microsoft/SynapseML/blob/master/docs/Explore%20Algorithms/OpenAI/OpenAI.ipynb) and connect it with your workspace. During the download process, select **Raw**, and then save the file.
 
@@ -143,7 +146,7 @@ df = spark.createDataFrame(
 
 To apply the Azure OpenAI Completion service to the dataframe, create an `OpenAICompletion` object that serves as a distributed client. Parameters of the service can be set either with a single value, or by a column of the dataframe with the appropriate setters on the `OpenAICompletion` object.
 
-In this example, you set the `maxTokens` parameter to 200. A token is around four characters, and this limit applies to the sum of the prompt and the result. You also set the `promptCol` parameter with the name of the prompt column in the dataframe.
+In this example, you set the `maxTokens` parameter to 200. A token is around four characters, and this limit applies to the sum of the prompt and the result. You also set the `promptCol` parameter with the name of the prompt column in the dataframe, such as **prompt**.
 
 ```python
 from synapse.ml.cognitive import OpenAICompletion
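The revised paragraph above leans on a rule of thumb: a token is around four characters, and the `maxTokens` limit covers the prompt and the result together. A minimal, hypothetical helper (not part of SynapseML; the function names and the 4-characters-per-token heuristic are illustration only) sketches how you might budget-check prompts before submitting them:

```python
# Hypothetical helpers, not part of the SynapseML API. They apply the article's
# heuristic that a token is roughly four characters.

def estimate_tokens(text: str) -> int:
    """Approximate token count: ~4 characters per token (rough heuristic)."""
    return max(1, round(len(text) / 4))

def fits_budget(prompt: str, expected_result_chars: int, max_tokens: int = 200) -> bool:
    """The maxTokens limit applies to the sum of prompt and result,
    so budget both sides before sending the request."""
    expected_result_tokens = max(1, round(expected_result_chars / 4))
    return estimate_tokens(prompt) + expected_result_tokens <= max_tokens

print(estimate_tokens("Hello world, how are you?"))  # → 6
```

Real tokenizers count tokens differently per model, so treat this only as a coarse sanity check before a distributed run.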
@@ -184,7 +187,7 @@ Here are some other use cases for working with Azure OpenAI Service and large da
 
 You can use Azure OpenAI Service with large datasets to improve throughput with request batching. In the previous example, you make several requests to the service, one for each prompt. To complete multiple prompts in a single request, you can use batch mode.
 
-In the `OpenAICompletion` object, instead of setting the **Prompt** column to `"prompt"`, you can specify `"batchPrompt"` to create the **batchPrompt** column. To support this method, create a dataframe with a list of prompts per row.
+In the `OpenAICompletion` object definition, you specify the `"batchPrompt"` value to configure the dataframe to use a **batchPrompt** column. Create the dataframe with a list of prompts for each row.
 
 > [!NOTE]
 > There's currently a limit of 20 prompts in a single request and a limit of 2048 tokens, or approximately 1500 words.

-Next, create the `OpenAICompletion` object. Rather than setting the `"prompt"` column, set the `"batchPrompt"` column if your column is of type `Array[String]`.
+Next, create the `OpenAICompletion` object. If your column is of type `Array[String]`, set the `batchPromptCol` value for the column heading, rather than the `promptCol` value.
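The revised text describes batch mode: each row of the **batchPrompt** column holds a list of prompts, and the note caps a single request at 20 prompts. A hypothetical pure-Python sketch (the helper name is illustration only, not a SynapseML API) shows how a flat prompt list could be grouped into rows that respect that limit before building the dataframe:

```python
# Hypothetical helper, not from the article or SynapseML: groups a flat prompt
# list into rows of at most 20 prompts each (the current per-request limit),
# ready to load into a batchPrompt column of type Array[String].

def to_batch_rows(prompts: list[str], batch_size: int = 20) -> list[list[str]]:
    """Split prompts into lists of at most batch_size, one list per row."""
    if not 1 <= batch_size <= 20:
        raise ValueError("the service currently allows at most 20 prompts per request")
    return [prompts[i:i + batch_size] for i in range(0, len(prompts), batch_size)]

rows = to_batch_rows([f"prompt {n}" for n in range(45)])
print([len(r) for r in rows])  # → [20, 20, 5]
```

Each inner list then becomes one row, so a single `OpenAICompletion` request (with `batchPromptCol` set) completes up to 20 prompts at once; the 2048-token per-request limit still applies to each batch.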