Commit e146134

Merge pull request #183707 from jovanpop-msft/patch-228
Improved slow query duration self-help
2 parents b70fc69 + b7b0b3b

1 file changed: +30 −36 lines changed

articles/synapse-analytics/sql/resources-self-help-sql-on-demand.md

@@ -25,7 +25,7 @@ If Synapse Studio can't establish connection to serverless SQL pool, you'll noti
 1) Your network prevents communication to the Azure Synapse backend. The most frequent cause is that port 1443 is blocked. To get the serverless SQL pool to work, unblock this port. Other problems could prevent serverless SQL pool from working as well; [visit the full troubleshooting guide for more information](../troubleshoot/troubleshoot-synapse-studio.md).
 2) You don't have permissions to log into serverless SQL pool. To gain access, one of the Azure Synapse workspace administrators should add you to the workspace administrator or SQL administrator role. [Visit the full guide on access control for more information](../security/synapse-workspace-access-control-overview.md).

-### Websocket connection was closed unexpectedly
+### Query fails with error: Websocket connection was closed unexpectedly.

 If your query fails with the error message 'Websocket connection was closed unexpectedly', it means that your browser connection to Synapse Studio was interrupted, for example because of a network issue.

@@ -35,13 +35,13 @@ If the issue still continues, create a [support ticket](../../azure-portal/suppo

 ## Query execution

-### File cannot be opened
+### Query fails because file cannot be opened

 If your query fails with the error 'File cannot be opened because it does not exist or it is used by another process' and you're sure the file exists and isn't used by another process, it means serverless SQL pool can't access the file. This problem usually happens because your Azure Active Directory identity doesn't have rights to access the file or because a firewall is blocking access to the file. By default, serverless SQL pool tries to access the file using your Azure Active Directory identity. To resolve this issue, you need proper rights to access the file. The easiest way is to grant yourself the 'Storage Blob Data Contributor' role on the storage account you're trying to query.
 - [Visit the full guide on Azure Active Directory access control for storage for more information](../../storage/blobs/assign-azure-role-data-access.md).
 - [Visit Control storage account access for serverless SQL pool in Azure Synapse Analytics](develop-storage-files-storage-access-control.md)

-**Alternative to Storage Blob Data Contributor role**
+#### Alternative to Storage Blob Data Contributor role

 Instead of granting Storage Blob Data Contributor, you can also grant more granular permissions on a subset of files.

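As a quick check for the scenario above, a minimal `OPENROWSET` probe can show whether serverless SQL pool can reach a file at all; the storage URL below is a placeholder for one of your own files:

```sql
-- If even this probe fails with 'File cannot be opened', the cause is the
-- identity's storage rights or a firewall, not the shape of your real query.
SELECT TOP 10 *
FROM OPENROWSET(
        BULK 'https://<storage-account>.dfs.core.windows.net/<container>/data2.csv',
        FORMAT = 'CSV',
        PARSER_VERSION = '2.0',
        HEADER_ROW = TRUE
     ) AS rows;
```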
@@ -74,7 +74,7 @@ If you would like to query data2.csv in this example, the following permissions 
 > [!NOTE]
 > For guest users, this needs to be done directly with the Azure Data Lake Service as it can not be done directly through Azure Synapse.

-### Query cannot be executed due to current resource constraints
+### Query fails because it cannot be executed due to current resource constraints

 If your query fails with the error message 'This query can't be executed due to current resource constraints', it means that serverless SQL pool isn't able to execute it at this moment due to resource constraints:

@@ -98,7 +98,7 @@ The easiest way is to resolve this issue is grant yourself `Storage Blob DataCon
 - [Visit full guide on Azure Active Directory access control for storage for more information](../../storage/blobs/assign-azure-role-data-access.md).
 - [Visit Control storage account access for serverless SQL pool in Azure Synapse Analytics](develop-storage-files-storage-access-control.md)

-#### DataVerse table is not accessible - content of directory cannot be listed
+#### Content of DataVerse table cannot be listed

 If you are using the Synapse link for DataVerse to read the linked DataVerse tables, you need to use an Azure AD account to access the linked data using the serverless SQL pool.
 If you try to use a SQL login to read an external table that is referencing the DataVerse table, you will get the following error:
@@ -133,12 +133,12 @@ This error indicates that you are using an object (table or view) that doesn't e
 - List the tables/views and check whether the object exists. Use SSMS or ADS because Synapse studio might show some tables that are not available in the serverless SQL pool.
 - If you see the object, check whether you are using a case-sensitive/binary database collation. Maybe the object name does not match the name that you used in the query. With a binary database collation, `Employee` and `employee` are two different objects.
 - If you don't see the object, maybe you are trying to query a table from a Lake/Spark database. There are a few reasons why the table might not be available in the serverless pool:
-  - The table has some column types that cannot be represented in serverless SQL.
-  - The table has a format that is not supported in serverless SQL pool (Delta, ORC, etc.)
+  - The table has some column types that cannot be represented in serverless SQL.
+  - The table has a format that is not supported in serverless SQL pool (Delta, ORC, etc.)

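The first check above (listing the objects) can be sketched with standard catalog views; this is an illustration, not part of the original article:

```sql
-- List user tables and views visible to the serverless SQL pool.
SELECT schema_name(schema_id) AS schema_name, name, type_desc
FROM sys.objects
WHERE type IN ('U', 'V')
ORDER BY schema_name, name;
```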
 ### Could not allocate tempdb space while transferring data from one distribution to another

-This error is special case of the generic [query fails because it cannot be executed due to current resource constraints](#query-cannot-be-executed-due-to-current-resource-constraints) error. This error is returned when the resources allocated to the `tempdb` database are insufficient to run the query.
+This error is a special case of the generic [query fails because it cannot be executed due to current resource constraints](#query-fails-because-it-cannot-be-executed-due-to-current-resource-constraints) error. This error is returned when the resources allocated to the `tempdb` database are insufficient to run the query.

 Apply the same mitigation and the best practices before you file a support ticket.

@@ -508,9 +508,7 @@ spark.conf.set("spark.sql.legacy.parquet.int96RebaseModeInWrite", "CORRECTED")

 ## Configuration

-You might get an error while you try to create objects or configure security rules. Some of the most common errors re listed in this section.
-
-### Please create a master key in the database or open the master key in the session before performing this operation.
+### Query fails with: Please create a master key in the database or open the master key in the session before performing this operation.

 If your query fails with the error message 'Please create a master key in the database or open the master key in the session before performing this operation.', it means that your user database has no access to a master key at the moment.

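The statement that the error message asks for can be sketched as follows; the password is a placeholder and must satisfy the server's password policy:

```sql
-- Run once in the user database (not in master).
CREATE MASTER KEY ENCRYPTION BY PASSWORD = '<Strong_Passw0rd!>';
```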
@@ -563,24 +561,18 @@ Create a separate database and reference the synchronized [tables](../metadata/t

 ## Cosmos DB

-The items in the Cosmos DB transactional store are eventually moved to the analytical schema where they are accessible for querying using the serverless SQL pools. The most common errors are listed in this section.
-
-### Cannot execute the OPENROWSET function on Cosmos DB container
-
-There are multiple issues that might cause this error.
+Possible errors and troubleshooting actions are listed in the following table.

 | Error | Root cause |
 | --- | --- |
 | Syntax errors:<br/> - Incorrect syntax near `Openrowset`<br/> - `...` is not a recognized `BULK OPENROWSET` provider option.<br/> - Incorrect syntax near `...` | Possible root causes:<br/> - Not using CosmosDB as the first parameter.<br/> - Using a string literal instead of an identifier in the third parameter.<br/> - Not specifying the third parameter (container name). |
 | There was an error in the CosmosDB connection string. | - The account, database, or key isn't specified.<br/> - There's some option in a connection string that isn't recognized.<br/> - A semicolon (`;`) is placed at the end of a connection string. |
 | Resolving CosmosDB path has failed with the error "Incorrect account name" or "Incorrect database name." | The specified account name, database name, or container can't be found, or analytical storage hasn't been enabled for the specified collection. |
 | Resolving CosmosDB path has failed with the error "Incorrect secret value" or "Secret is null or empty." | The account key isn't valid or is missing. |
+| Column `column name` of the type `type name` isn't compatible with the external data type `type name`. | The specified column type in the `WITH` clause doesn't match the type in the Azure Cosmos DB container. Try to change the column type as it's described in the section [Azure Cosmos DB to SQL type mappings](query-cosmos-db-analytical-store.md#azure-cosmos-db-to-sql-type-mappings), or use the `VARCHAR` type. |
+| Column contains `NULL` values in all cells. | Possibly a wrong column name or path expression in the `WITH` clause. The column name (or path expression after the column type) in the `WITH` clause must match some property name in the Azure Cosmos DB collection. Comparison is *case-sensitive*. For example, `productCode` and `ProductCode` are different properties. |

-### Column isn't compatible with the external data type
-
-The specified column type in the `WITH` clause doesn't match the type in the Azure Cosmos DB container. Try to change the column type as it's described in the section [Azure Cosmos DB to SQL type mappings](query-cosmos-db-analytical-store.md#azure-cosmos-db-to-sql-type-mappings), or use the `VARCHAR` type.
-
-Try to generate the `WITH` clause using a [sample document](https://htmlpreview.github.io/?https://github.com/Azure-Samples/Synapse/blob/main/SQL/tools/cosmosdb/generate-openrowset.html).
+You can report suggestions and issues on the [Azure Synapse Analytics feedback page](https://feedback.azure.com/d365community/forum/9b9ba8e4-0825-ec11-b6e6-000d3a4f07b8).

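To illustrate the syntax rules in the table above, here is a minimal sketch of a valid `OPENROWSET` call over a Cosmos DB container; the account, database, container, and property names are placeholders:

```sql
-- 'CosmosDB' must be the first parameter, the connection string must not end
-- with a semicolon, and the third parameter is the container name as an
-- identifier, not a string literal.
SELECT TOP 10 *
FROM OPENROWSET(
        'CosmosDB',
        'Account=<account>;Database=<database>;Key=<account-key>',
        MyContainer
     ) WITH (
        productCode VARCHAR(50) '$.productCode'  -- path is case-sensitive
     ) AS rows;
```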
 ### UTF-8 collation warning is returned while reading CosmosDB string types

@@ -597,9 +589,7 @@ A serverless SQL pool will return a compile-time warning if the `OPENROWSET` col

 Azure Synapse SQL will return `NULL` instead of the values that you see in the transaction store in the following cases:
 - There is a synchronization delay between the transactional and analytical store. The value that you entered in the Cosmos DB transactional store might appear in the analytical store after 2-3 minutes.
-- Possibly wrong column name or path expression in the `WITH` clause. The column name (or path expression after the column type) in the `WITH` clause must match the property names in Cosmos DB collection. The comparison is case-sensitive (for example, `productCode` and `ProductCode` are different properties). Make sure that your column names exactly match the Cosmos DB property names.
-- If you are querying **complex documents** with the nested objects and sub-arrays, maybe your query incorrectly references these objects.
-Try to generate the `WITH` clause using a [sample document](https://htmlpreview.github.io/?https://github.com/Azure-Samples/Synapse/blob/main/SQL/tools/cosmosdb/generate-openrowset.html).
+- Possibly a wrong column name or path expression in the `WITH` clause. The column name (or path expression after the column type) in the `WITH` clause must match the property names in the Cosmos DB collection. Comparison is case-sensitive (for example, `productCode` and `ProductCode` are different properties). Make sure that your column names exactly match the Cosmos DB property names.
 - The property might not be moved to the analytical storage because it violates some [schema constraints](../../cosmos-db/analytical-store-introduction.md#schema-constraints), such as more than 1000 properties or more than 127 nesting levels.
 - If you are using well-defined [schema representation](../../cosmos-db/analytical-store-introduction.md#schema-representation), the value in the transactional store might have a wrong type. Well-defined schema locks the types for each property by sampling the documents. Any value added in the transactional store that doesn't match the type is treated as a wrong value and not migrated to the analytical store.
 - If you are using full-fidelity [schema representation](../../cosmos-db/analytical-store-introduction.md#schema-representation), make sure that you are adding a type suffix after the property name, like `$.price.int64`. If you don't see a value for the referenced path, maybe it is stored under a different type path, for example `$.price.float64`. See [how to query Cosmos DB collections in the full-fidelity schema](query-cosmos-db-analytical-store.md#query-items-with-full-fidelity-schema).
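The full-fidelity case in the last bullet can be sketched as follows; the account, database, container, and property names are placeholders:

```sql
-- With full-fidelity schema, each property is addressed with a type suffix.
-- If '$.price.int64' returns NULL, the value may live under '$.price.float64'.
SELECT TOP 10 *
FROM OPENROWSET(
        'CosmosDB',
        'Account=<account>;Database=<database>;Key=<account-key>',
        MyContainer
     ) WITH (
        price_int   BIGINT '$.price.int64',
        price_float FLOAT  '$.price.float64'
     ) AS rows;
```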
@@ -620,8 +610,6 @@ If you are experiencing some unexpected performance issues, make sure that you a
 - Make sure that you are using [Latin1_General_100_BIN2_UTF8 collation](best-practices-serverless-sql-pool.md#use-proper-collation-to-utilize-predicate-pushdown-for-character-columns) when you filter your data using string predicates.
 - If you have repeating queries that might be cached, try to use [CETAS to store query results in Azure Data Lake Storage](best-practices-serverless-sql-pool.md#use-cetas-to-enhance-query-performance-and-joins).

-See the [best practices for serverless sql pools](best-practices-serverless-sql-pool.md) for more details.
-
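The CETAS suggestion in the bullet above can be sketched as follows; the external data source, file format, and table/column names are placeholders that must already exist in your database:

```sql
-- Materialize a repeating query once; later queries read the stored result.
CREATE EXTERNAL TABLE cached_sales
WITH (
    LOCATION = 'cache/sales/',          -- folder in the data lake
    DATA_SOURCE = my_data_source,       -- existing external data source
    FILE_FORMAT = my_parquet_format     -- existing Parquet file format
)
AS
SELECT customer_id, SUM(amount) AS total_amount
FROM sales
GROUP BY customer_id;
```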
 ## Delta Lake

 There are some limitations and known issues that you might see in Delta Lake support in serverless SQL pools.
@@ -691,24 +679,30 @@ Now you can continue using Delta Lake folder with Spark pool. You will provide c

 The serverless SQL pool assigns the resources to the queries based on the size of the data set and query complexity. You cannot impact or limit the resources that are provided to the queries. There are some cases where you might experience unexpected query performance degradation and need to identify the root causes.

-### Query duration is very long
+### Query duration is very long
+
+If you have queries with a query duration longer than 30 minutes, this indicates that returning results to the client is slow. Serverless SQL pool has a 30-minute limit for execution, and any additional time is spent on result streaming. Try the following:
+- If you are using [Synapse studio](#query-is-slow-when-executed-using-synapse-studio), try to reproduce the issue with some other application like SQL Server Management Studio or Azure Data Studio.
+- If your query is slow when executed using [SSMS, ADS, Power BI, or some other application](#query-is-slow-when-executed-using-application), check networking issues and best practices.
+
+#### Query is slow when executed using Synapse studio

 If you are using Synapse Studio, try using some desktop client such as SQL Server Management Studio or Azure Data Studio. Synapse Studio is a web client that connects to the serverless pool using the HTTP protocol, which is generally slower than the native SQL connections used in SQL Server Management Studio or Azure Data Studio.

-If you have queries with the query duration longer than 30min, this indicates that returning results to the client is slow. Serverless SQL pool has 30min limit for execution, and any additional time is spent on result streaming.
+#### Query is slow when executed using application

 Check the following issues if you are experiencing slow query execution:
 - Make sure that the client applications are collocated with the serverless SQL pool endpoint. Executing a query across regions can cause additional latency and slow streaming of the result set.
 - Make sure that you don't have networking issues that can cause slow streaming of the result set.
 - Make sure that the client application has enough resources (for example, not using 100% CPU).
-- Make sure that the storage account or cosmosDB analytical storage is placed in the same region as your serverless SQL endpoint.
+- Make sure that the storage account or Cosmos DB analytical storage is placed in the same region as your serverless SQL endpoint.

 See the best practices for [collocating the resources](best-practices-serverless-sql-pool.md#client-applications-and-network-connections).

 ### High variations in query durations

 If you are executing the same query and observing variations in the query durations, there might be several reasons that can cause this behavior:
-- Check is this a first execution of a query. The first execution of a query collects the statistics required to create a plan. The statistics are collected by scanning the underlying files and might increase the query duration. In synapse studio you will see additional “global statistics creation” queries in the SQL request list, that are executed before your query.
+- Check whether this is the first execution of the query. The first execution of a query collects the statistics required to create a plan. The statistics are collected by scanning the underlying files and might increase the query duration. In Synapse studio you will see additional “global statistics creation” queries in the SQL request list, executed before your query.
 - Statistics might expire after some time, so periodically you might observe an impact on performance because the serverless pool must scan and rebuild the statistics. You might notice additional “global statistics creation” queries in the SQL request list, executed before your query.
 - Check whether there is some additional workload running on the same endpoint when you executed the query with the longer duration. The serverless SQL endpoint will equally allocate the resources to all queries that are executed in parallel, and the query might be delayed.
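To check for the concurrent workload mentioned in the last bullet, one sketch is to query the standard dynamic management views, assuming they are available on your endpoint and you have `VIEW SERVER STATE` permission:

```sql
-- Requests currently executing on the endpoint alongside your query.
SELECT session_id, status, command, total_elapsed_time
FROM sys.dm_exec_requests
WHERE status IN ('running', 'suspended');
```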

@@ -726,17 +720,17 @@ See the [Synapse Studio section](#synapse-studio).

 ## Security

-### AAD service principal login failures when SPI is creating a role assignment
-If you want to create role assignment for Service Principal Identifier/AAD app using another SPI, or have already created one and it fails to login, you're probably receiving following error:
+### Azure AD service principal login failures when SPI is creating a role assignment
+If you want to create a role assignment for a Service Principal Identifier/Azure AD app using another SPI, or have already created one and it fails to log in, you're probably receiving the following error:
 ```
 Login error: Login failed for user '<token-identified principal>'.
 ```
 For service principals, the login should be created with the Application ID as SID (not with the Object ID). There is a known limitation for service principals which prevents the Azure Synapse service from fetching the Application ID from Microsoft Graph when creating a role assignment for another SPI/app.

-**Solution #1**
+#### Solution #1
 Navigate to Azure portal > Synapse Studio > Manage > Access control and manually add Synapse Administrator or Synapse SQL Administrator for the desired Service Principal.

-**Solution #2**
+#### Solution #2
 You need to manually create a proper login through SQL code:
 ```sql
 use master
@@ -747,7 +741,7 @@ ALTER SERVER ROLE sysadmin ADD MEMBER [<service_principal_name>];
 go
 ```

-**Solution #3**
+#### Solution #3
 You can also set up the service principal as Synapse Admin using PowerShell. You need to have the [Az.Synapse module](/powershell/module/az.synapse) installed.
 The solution is to use the cmdlet New-AzSynapseRoleAssignment with `-ObjectId "parameter"` and, in that parameter field, to provide the Application ID (instead of the Object ID) using workspace admin Azure service principal credentials. PowerShell script:
 ```azurepowershell
@@ -762,7 +756,7 @@ Connect-AzAccount -ServicePrincipal -Credential $cred -Tenant $tenantId
 New-AzSynapseRoleAssignment -WorkspaceName "<workspaceName>" -RoleDefinitionName "Synapse Administrator" -ObjectId "<app_id_to_add_as_admin>" [-Debug]
 ```

-**Validation**
+#### Validation
 Connect to the serverless SQL endpoint and verify that the external login with SID `app_id_to_add_as_admin` is created:
 ```sql
 select name, convert(uniqueidentifier, sid) as sid, create_date
