> [!NOTE]
> All `options` in the commands below are case-sensitive.
## Authentication
Spark 3.x customers can now authenticate to the Azure Cosmos DB analytical store by using access tokens, in addition to database account keys. Access tokens are more secure because they are short-lived, and they can only be generated by trusted identities that have been granted the required permissions through Azure Cosmos DB role-based access control (RBAC).
The connector now supports two authentication types, `MasterKey` and `AccessToken`, configured through the property `spark.cosmos.auth.type`.
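For example, a read against the analytical store might set the authentication type as follows. This is a sketch: `cosmos.olap` and `spark.synapse.linkedService` follow the Synapse connector's documented options, the linked service and container names are placeholders, and the option name used to pass the token is an assumption that should be checked against the connector documentation.

```scala
val df = spark.read
  .format("cosmos.olap")
  .option("spark.synapse.linkedService", "<linked-service-name>")
  .option("spark.cosmos.container", "<container-name>")
  .option("spark.cosmos.auth.type", "AccessToken")
  // Assumed option key for supplying the token; verify against the
  // connector documentation before use.
  .option("spark.cosmos.auth.aad.accessToken", accessToken)
  .load()
```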
#### Access token authentication requires role assignment
To use the access token approach, you need to generate access tokens. Because access tokens are associated with Azure identities, the correct role-based access control (RBAC) roles must be assigned to the identity. This role assignment is at the data plane level, and you need minimum control plane permissions to perform it. For more information, see [how to grant data plane role-based access](https://learn.microsoft.com/azure/cosmos-db/nosql/security/how-to-grant-data-plane-role-based-access).
The Access control (IAM) role assignments in the Azure portal are at the control plane level and don't affect role assignments at the data plane. Data plane role assignments are only available through the Azure CLI. The `readAnalytics` action is required to read data from the analytical store in Azure Cosmos DB, and it isn't part of any predefined role, so you must create a custom role definition. In addition to the `readAnalytics` action, add the actions required for the Data Reader role; these are the minimum actions required for reading data from the analytical store. Create a JSON file with the following content and name it `role_definition.json`.
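A role definition that combines `readAnalytics` with the built-in Data Reader actions might look like the following sketch. The role name is a placeholder, and the action strings should be verified against the Azure Cosmos DB RBAC documentation.

```json
{
  "RoleName": "CosmosDBAnalyticalStoreReader",
  "Type": "CustomRole",
  "AssignableScopes": ["/"],
  "Permissions": [
    {
      "DataActions": [
        "Microsoft.DocumentDB/databaseAccounts/readMetadata",
        "Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/readAnalytics",
        "Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/items/read",
        "Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/executeQuery",
        "Microsoft.DocumentDB/databaseAccounts/sqlDatabases/containers/readChangeFeed"
      ]
    }
  ]
}
```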
- Set the default subscription that has your Azure Cosmos DB account: `az account set --subscription <name or id>`
- Create the role definition in the desired Cosmos DB account: `az cosmosdb sql role definition create --account-name <cosmos-account-name> --resource-group <resource-group-name> --body @role_definition.json`
- Copy the role definition ID returned by the preceding command: `/subscriptions/<subscription-id>/resourceGroups/<resource-group-name>/providers/Microsoft.DocumentDB/databaseAccounts/<cosmos-account-name>/sqlRoleDefinitions/<a-random-generated-guid>`
- Get the principal ID of the identity that you want to assign the role to. The identity could be an Azure app registration, a virtual machine, or any other supported Azure resource. Assign the role to the principal by using: `az cosmosdb sql role assignment create --account-name "<cosmos-account-name>" --resource-group "<resource-group>" --scope "/" --principal-id "<principal-id-of-identity>" --role-definition-id "<role-definition-id-from-previous-step>"`
> [!NOTE]
> When using an Azure app registration, use the Object ID as the service principal ID in the preceding step. Also, the principal ID and the Azure Cosmos DB account must be in the same tenant.
#### Generating the access token - Synapse Notebooks
The recommended method for Synapse Notebooks is to use a service principal with a certificate to generate access tokens. For more information, see [secure credentials with TokenLibrary](https://learn.microsoft.com/azure/synapse-analytics/spark/apache-spark-secure-credentials-with-tokenlibrary).
The following sketch shows one way to acquire such a token with the MSAL4J library (`com.microsoft.azure:msal4j`); the client ID, tenant ID, certificate material, and account name are placeholders, and the package must be available on your Spark pool.

```scala
import java.security.PrivateKey
import java.security.cert.X509Certificate
import java.util.Collections
import com.microsoft.aad.msal4j.{ClientCredentialFactory, ClientCredentialParameters, ConfidentialClientApplication}

// Placeholder certificate material for the service principal.
val privateKey: PrivateKey = ???       // load from your certificate store
val certificate: X509Certificate = ??? // public certificate of the app registration

val app = ConfidentialClientApplication
  .builder("<application-client-id>",
    ClientCredentialFactory.createFromCertificate(privateKey, certificate))
  .authority("https://login.microsoftonline.com/<tenant-id>/")
  .build()

// Request a token scoped to the Azure Cosmos DB account.
val parameters = ClientCredentialParameters
  .builder(Collections.singleton("https://<cosmos-account-name>.documents.azure.com/.default"))
  .build()

val accessToken: String = app.acquireToken(parameters).get().accessToken()
```

You can then use the access token generated in this step to read data from the analytical store when the authentication type is set to `AccessToken`.
> [!NOTE]
> When using an Azure app registration, use the application (client) ID in the preceding step.
> [!NOTE]
> Currently, Synapse doesn't support generating access tokens by using the azure-identity package in notebooks. Furthermore, Synapse VHDs don't include the azure-identity package and its dependencies. For more information, see [Synapse service identity](https://learn.microsoft.com/azure/synapse-analytics/synapse-service-identity).
### Load to Spark DataFrame
In this example, you'll create a Spark DataFrame that points to the Azure Cosmos DB analytical store. You can then perform additional analysis by invoking Spark actions against the DataFrame. This operation doesn't impact the transactional store.
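A minimal load might be sketched as follows; the linked service and container names are placeholders for your workspace, and the default `MasterKey` authentication is assumed.

```scala
// Create a DataFrame over the analytical store; this doesn't consume
// request units from the transactional store.
val df = spark.read
  .format("cosmos.olap")
  .option("spark.synapse.linkedService", "<linked-service-name>")
  .option("spark.cosmos.container", "<container-name>")
  .load()

df.printSchema()
```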