-
#CustomerIntent: As a data scientist, I want to connect to Azure Cosmos DB for NoSQL using Spark and a service principal, so that I can avoid using connection strings.
+
#CustomerIntent: As a data scientist, I want to connect to Azure Cosmos DB for NoSQL by using Spark and a service principal so that I can avoid using connection strings.
---
# Use a service principal with the Spark 3 connector for Azure Cosmos DB for NoSQL
-
In this article, you learn how to create a Microsoft Entra application and service principal that can be used with the role-based access control. You can then use this service principal to connect to an Azure Cosmos DB for NoSQL account from Spark 3.
+
In this article, you learn how to create a Microsoft Entra application and service principal that can be used with role-based access control. You can then use this service principal to connect to an Azure Cosmos DB for NoSQL account from Spark 3.
## Prerequisites
- An existing Azure Cosmos DB for NoSQL account.
- If you have an existing Azure subscription, [create a new account](how-to-create-account.md?tabs=azure-portal).
- No Azure subscription? You can [try Azure Cosmos DB free](../try-free.md) with no credit card required.
- An existing Azure Databricks workspace.
-
- Registered Microsoft Entra application and service principal
-
- If you don't have a service principal and application, [register an application using the Azure portal](/entra/identity-platform/howto-create-service-principal-portal).
+
- Registered Microsoft Entra application and service principal.
+
- If you don't have a service principal and application, [register an application by using the Azure portal](/entra/identity-platform/howto-create-service-principal-portal).
-
## Create secret and record credentials
+
## Create a secret and record credentials
-
In this section we will create a client secret and record the value for use later.
+
In this section, you create a client secret and record the value for use later.
-
1. Open the Azure portal(<https://portal.azure.com>).
+
1. Open the [Azure portal](<https://portal.azure.com>).
-
1. Navigate to your existing Microsoft Entra application.
+
1. Go to your existing Microsoft Entra application.
-
1. Navigate to the **Certificates & secrets** page. Then, create a new secret. Save the **Client Secret** value to use later in this guide.
+
1. Go to the **Certificates & secrets** page. Then, create a new secret. Save the **Client Secret** value to use later in this article.
-
1. Navigate to the **Overview** page. Locate and record the values for **Application (client) ID**, **Object ID**, and **Directory (tenant) ID**. You also use these values later in this guide.
+
1. Go to the **Overview** page. Locate and record the values for **Application (client) ID**, **Object ID**, and **Directory (tenant) ID**. You also use these values later in this article.
-
1. Navigate to your existing Azure Cosmos DB for NoSQL account.
+
1. Go to your existing Azure Cosmos DB for NoSQL account.
-
1. Record the **URI** value in the **Overview** page. Also record the **Subscription ID** and **Resource Group** values. You' use these values too later in this guide.
+
1. Record the **URI** value on the **Overview** page. Also record the **Subscription ID** and **Resource Group** values. You use these values later in this article.
-
## Create definition and assignment
+
## Create a definition and an assignment
-
In this section we will create a Microsoft Entra ID role definition and assign that role with permissions to read and write items in the containers.
+
In this section, you create a Microsoft Entra ID role definition. Then you assign that role with permissions to read and write items in the containers.
-
1. Create a role using the `az role definition create` command. Pass in the Azure Cosmos DB for NoSQL account name and resource group, followed by a body of JSON that defines the custom role. The role is also scoped to the account level using `/`. Ensure that you provide a unique name for your role using the `RoleName` property of the request body.
+
1. Create a role by using the `az role definition create` command. Pass in the Azure Cosmos DB for NoSQL account name and resource group, followed by a body of JSON that defines the custom role. The role is also scoped to the account level by using `/`. Ensure that you provide a unique name for your role by using the `RoleName` property of the request body.
```azurecli
az cosmosdb sql role definition create \
@@ -94,7 +94,7 @@ In this section we will create a Microsoft Entra ID role definition and assign t
]
```
-
1. Use `az cosmosdb sql role assignment create` to create a role assignment. Replace the`<aad-principal-id>` with the **Object ID** you recorded earlier in this guide. Also, replace `<role-definition-id>` with the `id` value fetched from running the `az cosmosdb sql role definition list` command in a previous step.
+
1. Use `az cosmosdb sql role assignment create` to create a role assignment. Replace `<aad-principal-id>` with the **Object ID** you recorded earlier in this article. Also, replace `<role-definition-id>` with the `id` value fetched from running the `az cosmosdb sql role definition list` command in a previous step.
```azurecli
az cosmosdb sql role assignment create \
@@ -105,26 +105,26 @@ In this section we will create a Microsoft Entra ID role definition and assign t
--role-definition-id "<role-definition-id>"
```
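The role definition step above passes a JSON body to the CLI. As a rough illustration only, that body could be assembled and checked in Python before it is handed to the `--body` parameter. The helper name `build_role_definition` is hypothetical, and the data actions listed are an assumed read/write set that is not shown in this excerpt; adjust them to the permissions your workload actually needs.

```python
import json


def build_role_definition(role_name: str) -> dict:
    """Return a custom role body scoped to the account level ("/").

    The data actions below are an assumed read/write set, not taken
    from the excerpt; replace them with the ones you actually need.
    """
    if not role_name.strip():
        raise ValueError("role_name must be a non-empty, unique name")
    return {
        "RoleName": role_name,
        "Type": "CustomRole",
        "AssignableScopes": ["/"],
        "Permissions": [
            {
                "DataActions": [
                    "Microsoft.DocumentDB/databaseAccounts/readMetadata",
                    "Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/*",
                    "Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/*",
                ]
            }
        ],
    }


# Serialize the body so it can be saved to a file or pasted into the command.
role_body_json = json.dumps(build_role_definition("<role-definition-name>"), indent=2)
print(role_body_json)
```

One possible use is to save the printed JSON to a file and reference that file from `--body`, which keeps shell quoting out of the picture.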
-
## Use service principal
+
## Use a service principal
-
Now that you created a Microsoft Entra application and service principal, created a custom role, and assigned that role permissions to your Azure Cosmos DB for NoSQL account, you should be able to run a notebook.
+
Now that you've created a Microsoft Entra application and service principal, created a custom role, and assigned that role permissions to your Azure Cosmos DB for NoSQL account, you should be able to run a notebook.
1. Open your Azure Databricks workspace.
1. In the workspace interface, create a new **cluster**. Configure the cluster with these settings, at a minimum:
-
1. Use the workspace interface to search for **Maven** packages from **Maven Central** with a **Group Id** of `com.azure.cosmos.spark`. Install the package specific for Spark 3.4 with an **Artifact Id** prefixed with `azure-cosmos-spark_3-4` to the cluster.
+
1. Use the workspace interface to search for **Maven** packages from **Maven Central** with a **Group ID** of `com.azure.cosmos.spark`. Install the package specifically for Spark 3.4 with an **Artifact ID** prefixed with `azure-cosmos-spark_3-4` to the cluster.
1. Finally, create a new **notebook**.
> [!TIP]
-
> By default, the notebook will be attached to the recently created cluster.
+
> By default, the notebook is attached to the recently created cluster.
-
1. Within the notebook, set Cosmos DB Spark Connector configuration settings for NoSQL account endpoint, database name, and container name. Use the **Subscription ID**, **Resource Group**, **Application (client) ID**, **Directory (tenant) ID**, and **Client Secret** values recorded earlier in this guide.
+
1. Within the notebook, set Azure Cosmos DB Spark connector configuration settings for the NoSQL account endpoint, database name, and container name. Use the **Subscription ID**, **Resource Group**, **Application (client) ID**, **Directory (tenant) ID**, and **Client Secret** values recorded earlier in this article.
::: zone pivot="programming-language-python"
@@ -164,7 +164,7 @@ Now that you created a Microsoft Entra application and service principal, create
::: zone-end
-
1. Configure the Catalog API to manage API for NoSQL resources using Spark.
+
1. Configure the Catalog API to manage API for NoSQL resources by using Spark.
::: zone pivot="programming-language-python"
@@ -198,7 +198,7 @@ Now that you created a Microsoft Entra application and service principal, create
::: zone-end
-
1. Create a new database using `CREATE DATABASE IF NOT EXISTS`. Ensure that you provide your database name.
+
1. Create a new database by using `CREATE DATABASE IF NOT EXISTS`. Ensure that you provide your database name.
::: zone pivot="programming-language-python"
@@ -218,7 +218,7 @@ Now that you created a Microsoft Entra application and service principal, create
::: zone-end
-
1. Create a new container using database name, container name, partition key path, and throughput values that you specify.
+
1. Create a new container by using the database name, container name, partition key path, and throughput values that you specify.
::: zone pivot="programming-language-python"
@@ -238,7 +238,7 @@ Now that you created a Microsoft Entra application and service principal, create
::: zone-end
-
1. Create a sample data set.
+
1. Create a sample dataset.
::: zone pivot="programming-language-python"
@@ -264,7 +264,7 @@ Now that you created a Microsoft Entra application and service principal, create
::: zone-end
-
1. Use `spark.createDataFrame` and the previously saved OLTP configuration to add sample data to the target container.
+
1. Use `spark.createDataFrame` and the previously saved online transaction processing (OLTP) configuration to add sample data to the target container.
::: zone pivot="programming-language-python"
@@ -297,7 +297,7 @@ Now that you created a Microsoft Entra application and service principal, create
::: zone-end
> [!TIP]
-
> In this quickstart example credentials are assigned to variables in clear-text, but for security we recommend the usage of secrets. For more information on configuring secrets, see [add secrets to your Spark configuration](/azure/databricks/security/secrets/secrets#read-a-secret).
+
> In this quickstart example, credentials are assigned to variables in cleartext. For security, we recommend that you use secrets. For more information on how to configure secrets, see [Add secrets to your Spark configuration](/azure/databricks/security/secrets/secrets#read-a-secret).
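Following the tip above, one way to keep the client secret out of the notebook is to resolve it through a lookup function at run time. The sketch below is illustrative only: `build_cosmos_config`, the `get_secret` parameter, and the scope and key names are hypothetical, and the `spark.cosmos.*` setting names are assumed from the connector's service principal configuration rather than shown in this excerpt. In an Azure Databricks notebook, `dbutils.secrets.get` could be passed as `get_secret`.

```python
from typing import Callable, Dict


def build_cosmos_config(
    endpoint: str,
    subscription_id: str,
    resource_group: str,
    tenant_id: str,
    client_id: str,
    database: str,
    container: str,
    get_secret: Callable[[str, str], str],
    secret_scope: str = "cosmos-scope",       # hypothetical scope name
    secret_key: str = "entra-client-secret",  # hypothetical key name
) -> Dict[str, str]:
    """Build connector settings without hard-coding the client secret.

    The setting names are assumptions for this sketch; verify them
    against the connector version installed on your cluster.
    """
    return {
        "spark.cosmos.accountEndpoint": endpoint,
        "spark.cosmos.auth.type": "ServicePrincipal",
        "spark.cosmos.account.subscriptionId": subscription_id,
        "spark.cosmos.account.resourceGroupName": resource_group,
        "spark.cosmos.account.tenantId": tenant_id,
        "spark.cosmos.auth.aad.clientId": client_id,
        # The secret is looked up at call time, never stored in source.
        "spark.cosmos.auth.aad.clientSecret": get_secret(secret_scope, secret_key),
        "spark.cosmos.database": database,
        "spark.cosmos.container": container,
    }


def fake_secret_store(scope: str, key: str) -> str:
    """Stand-in lookup for local experimentation outside Databricks."""
    return f"secret-from-{scope}/{key}"


config = build_cosmos_config(
    endpoint="<nosql-account-endpoint>",
    subscription_id="<subscription-id>",
    resource_group="<resource-group-name>",
    tenant_id="<entra-tenant-id>",
    client_id="<entra-app-client-id>",
    database="<database-name>",
    container="<container-name>",
    get_secret=fake_secret_store,
)
```

Injecting the lookup as a parameter keeps the function usable outside a notebook, where `dbutils` is not defined.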