Commit ff2ecde

Merge pull request #209980 from WilliamDAssafMSFT/20220901-troubleshoot-synapse-serverless
20220901 troubleshoot synapse serverless
2 parents 4c09d06 + 1d4e6c8 commit ff2ecde

File tree

4 files changed: +222 −191 lines changed
Lines changed: 62 additions & 24 deletions
@@ -1,45 +1,44 @@
 ---
-title: Grant permissions to managed identity in Synapse workspace
-description: An article that explains how to configure permissions for managed identity in Azure Synapse workspace.
+title: Grant permissions to managed identity in Synapse workspace
+description: An article that explains how to configure permissions for managed identity in Azure Synapse workspace.
 author: meenalsri
-ms.service: synapse-analytics
-ms.topic: how-to
-ms.subservice: security
-ms.date: 04/15/2020
 ms.author: mesrivas
 ms.reviewer: sngun
+ms.date: 09/01/2022
+ms.service: synapse-analytics
+ms.subservice: security
+ms.topic: how-to
 ms.custom: subject-rbac-steps
 ---
-
 # Grant permissions to workspace managed identity
 
 This article teaches you how to grant permissions to the managed identity in an Azure Synapse workspace. Permissions, in turn, allow access to dedicated SQL pools in the workspace and the ADLS Gen2 storage account through the Azure portal.
 
->[!NOTE]
->This workspace managed identity will be referred to as managed identity through the rest of this document.
+> [!NOTE]
+> This workspace managed identity will be referred to as managed identity through the rest of this document.
 
 ## Grant the managed identity permissions to ADLS Gen2 storage account
 
-An ADLS Gen2 storage account is required to create an Azure Synapse workspace. To successfully launch Spark pools in Azure Synapse workspace, the Azure Synapse managed identity needs the *Storage Blob Data Contributor* role on this storage account . Pipeline orchestration in Azure Synapse also benefits from this role.
+An ADLS Gen2 storage account is required to create an Azure Synapse workspace. To successfully launch Spark pools in an Azure Synapse workspace, the Azure Synapse managed identity needs the *Storage Blob Data Contributor* role on this storage account. Pipeline orchestration in Azure Synapse also benefits from this role.
 
 ### Grant permissions to managed identity during workspace creation
 
 Azure Synapse will attempt to grant the Storage Blob Data Contributor role to the managed identity after you create the Azure Synapse workspace using the Azure portal. You provide the ADLS Gen2 storage account details in the **Basics** tab.
 
-![Basics tab in workspace creation flow](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-1.png)
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-1.png" alt-text="Screenshot of the Basics tab in workspace creation flow.":::
 
 Choose the ADLS Gen2 storage account and filesystem in **Account name** and **File system name**.
 
-![Providing an ADLS Gen2 storage account details](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-2.png)
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-2.png" alt-text="Screenshot of providing the ADLS Gen2 storage account details.":::
 
 If the workspace creator is also **Owner** of the ADLS Gen2 storage account, then Azure Synapse will assign the *Storage Blob Data Contributor* role to the managed identity. You'll see the following message below the storage account details that you entered.
 
-![Successful Storage Blob Data Contributor assignment](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-3.png)
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-3.png" alt-text="Screenshot of the successful storage blob data contributor assignment.":::
 
 If the workspace creator isn't the owner of the ADLS Gen2 storage account, then Azure Synapse doesn't assign the *Storage Blob Data Contributor* role to the managed identity. The message appearing below the storage account details notifies the workspace creator that they don't have sufficient permissions to grant the *Storage Blob Data Contributor* role to the managed identity.
 
-![Unsuccessful Storage Blob Data Contributor assignment](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-4.png)
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-4.png" alt-text="Screenshot of an unsuccessful storage blob data contributor assignment, with the error box highlighted.":::
 
 As the message states, you can't create Spark pools unless the *Storage Blob Data Contributor* role is assigned to the managed identity.
@@ -49,17 +48,19 @@ During workspace creation, if you don't assign the *Storage Blob Data contributo
 
 #### Step 1: Navigate to the ADLS Gen2 storage account in Azure portal
 
-In Azure portal, open the ADLS Gen2 storage account and select **Overview** from the left navigation. You'll only need to assign The *Storage Blob Data Contributor* role at the container or filesystem level. Select **Containers**.
-![ADLS Gen2 storage account overview](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-5.png)
+In the Azure portal, open the ADLS Gen2 storage account and select **Overview** from the left navigation. You'll only need to assign the *Storage Blob Data Contributor* role at the container or filesystem level. Select **Containers**.
+
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-5.png" alt-text="Screenshot of the Azure portal, of the Overview of the ADLS Gen2 storage account.":::
 
 #### Step 2: Select the container
 
 The managed identity should have data access to the container (file system) that was provided when the workspace was created. You can find this container or file system in the Azure portal. Open the Azure Synapse workspace in the Azure portal and select the **Overview** tab from the left navigation.
-![ADLS Gen2 storage account container](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-7.png)
 
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-7.png" alt-text="Screenshot of the Azure portal showing the name of the ADLS Gen2 storage file 'contosocontainer'.":::
 
 Select that same container or file system to grant the *Storage Blob Data Contributor* role to the managed identity.
-![Screenshot that shows the container or file system that you should select.](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-6.png)
+
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-6.png" alt-text="Screenshot that shows the container or file system that you should select.":::
 
 #### Step 3: Open Access control and add role assignment
 
@@ -68,29 +69,66 @@ Select that same container or file system to grant the *Storage Blob Data Contri
 1. Select **Add** > **Add role assignment** to open the Add role assignment page.
 
 1. Assign the following role. For detailed steps, see [Assign Azure roles using the Azure portal](../../role-based-access-control/role-assignments-portal.md).
-
+
    | Setting | Value |
    | --- | --- |
    | Role | Storage Blob Data Contributor |
   | Assign access to | MANAGEDIDENTITY |
   | Members | managed identity name |
 
-> [!NOTE]
+> [!NOTE]
 > The managed identity name is also the workspace name.
 
-![Add role assignment page in Azure portal.](../../../includes/role-based-access-control/media/add-role-assignment-page.png)
+:::image type="content" source="../../../includes/role-based-access-control/media/add-role-assignment-page.png" alt-text="Screenshot of the add role assignment page in the Azure portal.":::
 
 1. Select **Save** to add the role assignment.
 
 #### Step 4: Verify that the Storage Blob Data Contributor role is assigned to the managed identity
 
 Select **Access Control (IAM)** and then select **Role assignments**.
 
-![Verify role assignment](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-14.png)
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-14.png" alt-text="Screenshot of the Role Assignments button in the Azure portal, used to verify role assignment.":::
+
+You should see your managed identity listed under the **Storage Blob Data Contributor** section with the *Storage Blob Data Contributor* role assigned to it.
+
+:::image type="content" source="./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-15.png" alt-text="Screenshot of the Azure portal, showing ADLS Gen2 storage account container selection.":::
+
+#### Alternative to Storage Blob Data Contributor role
+
+Instead of granting yourself the Storage Blob Data Contributor role, you can also grant more granular permissions on a subset of files.
+
+All users who need access to some data in this container also must have EXECUTE permission on all parent folders up to the root (the container).
+
+Learn more about how to [set ACLs in Azure Data Lake Storage Gen2](../../storage/blobs/data-lake-storage-explorer-acl.md).
 
-You should see your managed identity listed under the **Storage Blob Data Contributor** section with the *Storage Blob Data Contributor* role assigned to it.
-![ADLS Gen2 storage account container selection](./media/how-to-grant-workspace-managed-identity-permissions/configure-workspace-managed-identity-15.png)
+> [!NOTE]
+> Execute permission on the container level must be set within Data Lake Storage Gen2.
+> Permissions on the folder can be set within Azure Synapse.
+
+If you want to query data2.csv in this example, the following permissions are needed:
+
+- Execute permission on container
+- Execute permission on folder1
+- Read permission on data2.csv
+
+:::image type="content" source="../sql/media/resources-self-help-sql-on-demand/folder-structure-data-lake.png" alt-text="Diagram that shows permission structure on data lake.":::
+
+1. Sign in to Azure Synapse with an admin user that has full permissions on the data you want to access.
+1. In the data pane, right-click the file and select **Manage access**.
+
+   :::image type="content" source="../sql/media/resources-self-help-sql-on-demand/manage-access.png" alt-text="Screenshot that shows the manage access option.":::
+
+1. Select at least **Read** permission. Enter the user's UPN or object ID, for example, [email protected]. Select **Add**.
+1. Grant read permission for this user.
+
+   :::image type="content" source="../sql/media/resources-self-help-sql-on-demand/grant-permission.png" alt-text="Screenshot that shows granting read permissions.":::
+
+> [!NOTE]
+> For guest users, this step needs to be done directly with Azure Data Lake because it can't be done through Azure Synapse.
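With the ACLs above in place, a user holding only those permissions can read the file from serverless SQL pool. A minimal sketch, assuming the file has a header row; the storage account and container names are hypothetical:

```sql
-- Read data2.csv with only Execute ACLs on the parents and Read on the file.
SELECT TOP 10 *
FROM OPENROWSET(
    BULK 'https://contosostorage.dfs.core.windows.net/contosocontainer/folder1/data2.csv',
    FORMAT = 'CSV',
    PARSER_VERSION = '2.0',  -- parser 2.0 is required for HEADER_ROW
    HEADER_ROW = TRUE
) AS [result];
```

If the query fails with an access-denied error, the usual cause is a missing Execute ACL on one of the parent folders or on the container itself.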
 
 ## Next steps
 
 Learn more about [Workspace managed identity](../../data-factory/data-factory-service-identity.md?context=/azure/synapse-analytics/context/context&tabs=synapse-analytics)
+
+- [Best practices for dedicated SQL pools](../sql/best-practices-dedicated-sql-pool.md)
+- [Troubleshoot serverless SQL pool in Azure Synapse Analytics](../sql/resources-self-help-sql-on-demand.md)
+- [Azure Synapse Analytics frequently asked questions](../overview-faq.yml)

articles/synapse-analytics/sql/best-practices-serverless-sql-pool.md

Lines changed: 64 additions & 32 deletions
@@ -1,14 +1,14 @@
 ---
 title: Best practices for serverless SQL pool
-description: Recommendations and best practices for working with serverless SQL pool.
+description: Recommendations and best practices for working with serverless SQL pool.
 author: filippopovic
+ms.author: fipopovi
 manager: craigg
+ms.reviewer: sngun, wiassaf
+ms.date: 09/01/2022
 ms.service: synapse-analytics
-ms.topic: conceptual
 ms.subservice: sql
-ms.date: 05/01/2020
-ms.author: fipopovi
-ms.reviewer: sngun
+ms.topic: conceptual
 ---
 
 # Best practices for serverless SQL pool in Azure Synapse Analytics
@@ -51,7 +51,7 @@ Multiple applications and services might access your storage account. Storage th
 
 When throttling is detected, serverless SQL pool has built-in handling to resolve it. Serverless SQL pool makes requests to storage at a slower pace until throttling is resolved.
 
-> [!TIP]
+> [!TIP]
 > For optimal query execution, don't stress the storage account with other workloads during query execution.
 
 ### Prepare files for querying
@@ -66,11 +66,7 @@ If possible, you can prepare files for better performance:
 
 ### Colocate your Azure Cosmos DB analytical storage and serverless SQL pool
 
-Make sure your Azure Cosmos DB analytical storage is placed in the same region as an Azure Synapse workspace. Cross-region queries might cause huge latencies. Use the region property in the connection string to explicitly specify the region where the analytical store is placed (see [Query Azure Cosmos DB by using serverless SQL pool](query-cosmos-db-analytical-store.md#overview)):
-
-```
-'account=<database account name>;database=<database name>;region=<region name>'
-```
+Make sure your Azure Cosmos DB analytical storage is placed in the same region as the Azure Synapse workspace. Cross-region queries might cause huge latencies. Use the region property in the connection string to explicitly specify the region where the analytical store is placed (see [Query Azure Cosmos DB by using serverless SQL pool](query-cosmos-db-analytical-store.md#overview)): `'account=<database account name>;database=<database name>;region=<region name>'`
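For example, a region-pinned connection string is passed as the second argument of OPENROWSET when querying the analytical store; the account, database, region, key, and container names below are placeholders:

```sql
-- Query an Azure Cosmos DB analytical store from the region where it is placed.
SELECT TOP 10 *
FROM OPENROWSET(
    'CosmosDB',
    'account=contosoaccount;database=SampleDB;region=westus2;key=<account master key>',
    SampleContainer  -- container holding the analytical store data
) AS documents;
```
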
 
 ## CSV optimizations
 
@@ -90,7 +86,7 @@ Here are best practices for using data types in serverless SQL pool.
 
 ### Use appropriate data types
 
-The data types you use in your query affect performance and concurrency. You can get better performance if you follow these guidelines:
+The data types you use in your query affect performance and concurrency. You can get better performance if you follow these guidelines:
 
 - Use the smallest data size that can accommodate the largest possible value.
 - If the maximum character value length is 30 characters, use a character data type of length 30.
@@ -111,15 +107,15 @@ You can use [sp_describe_first_results_set](/sql/relational-databases/system-sto
 
 The following example shows how you can optimize inferred data types. This procedure is used to show the inferred data types:
 
-```sql
+```sql
 EXEC sp_describe_first_result_set N'
-    SELECT
+    SELECT
         vendor_id, pickup_datetime, passenger_count
-    FROM
-        OPENROWSET(
-            BULK ''https://sqlondemandstorage.blob.core.windows.net/parquet/taxi/*/*/*'',
-            FORMAT=''PARQUET''
-        ) AS nyc';
+    FROM
+        OPENROWSET(
+            BULK ''https://sqlondemandstorage.blob.core.windows.net/parquet/taxi/*/*/*'',
+            FORMAT=''PARQUET''
+        ) AS nyc';
 ```
 
 Here's the result set:
@@ -132,19 +128,19 @@ Here's the result set:
 
 After you know the inferred data types for the query, you can specify appropriate data types:
 
-```sql
+```sql
 SELECT
     vendorID, tpepPickupDateTime, passengerCount
-FROM
-    OPENROWSET(
-        BULK 'https://azureopendatastorage.blob.core.windows.net/nyctlc/yellow/puYear=2018/puMonth=*/*.snappy.parquet',
-        FORMAT='PARQUET'
-    )
-    WITH (
-        vendorID varchar(4), -- we used length of 4 instead of the inferred 8000
-        tpepPickupDateTime datetime2,
-        passengerCount int
-    ) AS nyc;
+FROM
+    OPENROWSET(
+        BULK 'https://azureopendatastorage.blob.core.windows.net/nyctlc/yellow/puYear=2018/puMonth=*/*.snappy.parquet',
+        FORMAT='PARQUET'
+    )
+    WITH (
+        vendorID varchar(4), -- we used length of 4 instead of the inferred 8000
+        tpepPickupDateTime datetime2,
+        passengerCount int
+    ) AS nyc;
 ```
 
 ## Filter optimization
@@ -161,7 +157,7 @@ Data is often organized in partitions. You can instruct serverless SQL pool to q
 
 For more information, read about the [filename](query-data-storage.md#filename-function) and [filepath](query-data-storage.md#filepath-function) functions and see the examples for [querying specific files](query-specific-files.md).
 
-> [!TIP]
+> [!TIP]
 > Always cast the results of the filepath and filename functions to appropriate data types. If you use character data types, be sure to use the appropriate length.
 
 Functions used for partition elimination, filepath and filename, aren't currently supported for external tables, other than those created automatically for each table created in Apache Spark for Azure Synapse Analytics.
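As a sketch of partition elimination with OPENROWSET, the filepath function can both project a partition value and prune partitions; this reuses the public NYC taxi URL from the data-types example, and the month values are illustrative:

```sql
-- filepath(1) returns the value matched by the first wildcard (puMonth=*).
SELECT
    nyc.filepath(1) AS [month],
    COUNT(*) AS trip_count
FROM OPENROWSET(
    BULK 'https://azureopendatastorage.blob.core.windows.net/nyctlc/yellow/puYear=2018/puMonth=*/*.snappy.parquet',
    FORMAT = 'PARQUET'
) AS nyc
WHERE nyc.filepath(1) IN ('10', '11', '12')  -- only these partition folders are read
GROUP BY nyc.filepath(1);
```

Because the filter targets the wildcard itself rather than a column, serverless SQL pool skips the files in non-matching folders entirely.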
@@ -186,6 +182,42 @@ You can use CETAS to materialize frequently used parts of queries, like joined r
 
 As CETAS generates Parquet files, statistics are automatically created when the first query targets this external table. The result is improved performance for subsequent queries targeting the table generated with CETAS.
 
+## Query Azure data
+
+Serverless SQL pools enable you to query data in Azure Storage or Azure Cosmos DB by using [external tables and the OPENROWSET function](develop-storage-files-overview.md). Make sure that you have the proper [permissions set up](develop-storage-files-overview.md#permissions) on your storage.
+
+### Query CSV data
+
+Learn how to [query a single CSV file](query-single-csv-file.md) or [folders and multiple CSV files](query-folders-multiple-csv-files.md). You can also [query partitioned files](query-specific-files.md).
+
+### Query Parquet data
+
+Learn how to [query Parquet files](query-parquet-files.md) with [nested types](query-parquet-nested-types.md). You can also [query partitioned files](query-specific-files.md).
+
+### Query Delta Lake
+
+Learn how to [query Delta Lake files](query-delta-lake-format.md) with [nested types](query-parquet-nested-types.md).
+
+### Query Azure Cosmos DB data
+
+Learn how to [query Azure Cosmos DB analytical store](query-cosmos-db-analytical-store.md). You can use an [online generator](https://htmlpreview.github.io/?https://github.com/Azure-Samples/Synapse/blob/main/SQL/tools/cosmosdb/generate-openrowset.html) to generate the WITH clause based on a sample Azure Cosmos DB document. You can [create views](create-use-views.md#cosmosdb-view) on top of Azure Cosmos DB containers.
+
+### Query JSON data
+
+Learn how to [query JSON files](query-json-files.md). You can also [query partitioned files](query-specific-files.md).
+
+### Create views, tables, and other database objects
+
+Learn how to create and use [views](create-use-views.md) and [external tables](create-use-external-tables.md) or set up [row-level security](https://techcommunity.microsoft.com/t5/azure-synapse-analytics-blog/how-to-implement-row-level-security-in-serverless-sql-pools/ba-p/2354759).
+If you have [partitioned files](query-specific-files.md), make sure you use [partitioned views](create-use-views.md#partitioned-views).
+
+### Copy and transform data (CETAS)
+
+Learn how to [store query results to storage](create-external-table-as-select.md) by using the CETAS command.
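A minimal CETAS sketch, under the assumption that an external data source and a Parquet file format already exist in the database (`my_output_storage` and `parquet_file_format` are hypothetical names); the source URL is the public NYC taxi dataset used earlier in this article:

```sql
-- Materialize an aggregation as Parquet files in the target storage.
CREATE EXTERNAL TABLE aggregated_trips
WITH (
    LOCATION = 'aggregated-trips/',       -- folder created under the data source
    DATA_SOURCE = my_output_storage,      -- hypothetical external data source
    FILE_FORMAT = parquet_file_format     -- hypothetical Parquet file format
)
AS
SELECT passengerCount, COUNT(*) AS trip_count
FROM OPENROWSET(
    BULK 'https://azureopendatastorage.blob.core.windows.net/nyctlc/yellow/puYear=2018/puMonth=*/*.snappy.parquet',
    FORMAT = 'PARQUET'
) AS nyc
GROUP BY passengerCount;
```

Subsequent queries can read `aggregated_trips` directly instead of rescanning the source files.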
 
 ## Next steps
 
-Review the [troubleshooting](resources-self-help-sql-on-demand.md) article for solutions to common problems. If you're working with a dedicated SQL pool rather than serverless SQL pool, see [Best practices for dedicated SQL pools](best-practices-dedicated-sql-pool.md) for specific guidance.
+- Review the [troubleshooting serverless SQL pools](resources-self-help-sql-on-demand.md) article for solutions to common problems.
+- If you're working with a dedicated SQL pool rather than serverless SQL pool, see [Best practices for dedicated SQL pools](best-practices-dedicated-sql-pool.md) for specific guidance.
+- [Azure Synapse Analytics frequently asked questions](../overview-faq.yml)
+- [Grant permissions to workspace managed identity](../security/how-to-grant-workspace-managed-identity-permissions.md)
