articles/synapse-analytics/sql-data-warehouse/sql-data-warehouse-develop-ctas.md
Lines changed: 7 additions & 7 deletions
@@ -4,10 +4,10 @@ description: Explanation and examples of the CREATE TABLE AS SELECT (CTAS) state
4
4
author: joannapea
5
5
ms.author: joanpo
6
6
ms.reviewer: wiassaf
7
-
ms.date: 06/09/2022
7
+
ms.date: 01/21/2025
8
8
ms.service: azure-synapse-analytics
9
9
ms.subservice: sql-dw
10
-
ms.topic: conceptual
10
+
ms.topic: concept-article
11
11
ms.custom:
12
12
- azure-synapse
13
13
---
@@ -90,7 +90,7 @@ WITH(
90
90
);
91
91
```
92
92
93
-
Now you want to create a new copy of this table, with a `Clustered Columnstore Index`, so you can take advantage of the performance of Clustered Columnstore tables. You also want to distribute this table on `ProductKey`, because you're anticipating joins on this column and want to avoid data movement during joins on `ProductKey`. Lastly, you also want to add partitioning on `OrderDateKey`, so you can quickly delete old data by dropping old partitions. Here is the CTAS statement, which copies your old table into a new table.
93
+
Now you want to create a new copy of this table, with a `Clustered Columnstore Index`, so you can take advantage of the performance of Clustered Columnstore tables. You also want to distribute this table on `ProductKey`, because you're anticipating joins on this column and want to avoid data movement during joins on `ProductKey`. Lastly, you also want to add partitioning on `OrderDateKey`, so you can quickly delete old data by dropping old partitions. Here's the CTAS statement, which copies your old table into a new table.
94
94
95
95
```sql
96
96
CREATE TABLE FactInternetSales_new
@@ -169,9 +169,9 @@ The value stored for result is different. As the persisted value in the result c
169
169
170
170
This is important for data migrations. Even though the second query is arguably more accurate, there's a problem. The data would be different compared to the source system, and that leads to questions of integrity in the migration. This is one of those rare cases where the "wrong" answer is actually the right one!
171
171
172
-
The reason we see a disparity between the two results is due to implicit type casting. In the first example, the table defines the column definition. When the row is inserted, an implicit type conversion occurs. In the second example, there is no implicit type conversion as the expression defines the data type of the column.
172
+
The reason we see a disparity between the two results is due to implicit type casting. In the first example, the table defines the column definition. When the row is inserted, an implicit type conversion occurs. In the second example, there's no implicit type conversion as the expression defines the data type of the column.
173
173
174
-
Notice also that the column in the second example has been defined as a NULLable column, whereas in the first example it has not. When the table was created in the first example, column nullability was explicitly defined. In the second example, it was left to the expression, and by default would result in a NULL definition.
174
+
Notice also that the column in the second example has been defined as a NULLable column, whereas in the first example it hasn't. When the table was created in the first example, column nullability was explicitly defined. In the second example, it was left to the expression, and by default would result in a NULL definition.
175
175
176
176
To resolve these issues, you must explicitly set the type conversion and nullability in the SELECT portion of the CTAS statement. You can't set these properties in 'CREATE TABLE'.
177
177
The following example demonstrates how to fix the code:
@@ -194,7 +194,7 @@ Note the following:
194
194
* The second part of the ISNULL is a constant, 0.
195
195
196
196
> [!NOTE]
197
-
> For the nullability to be correctly set, it's vital to use ISNULL and not COALESCE. COALESCE is not a deterministic function, and so the result of the expression will always be NULLable. ISNULL is different. It's deterministic. Therefore, when the second part of the ISNULL function is a constant or a literal, the resulting value will be NOT NULL.
197
+
> For the nullability to be correctly set, it's vital to use ISNULL and not COALESCE. COALESCE isn't a deterministic function, and so the result of the expression will always be NULLable. ISNULL is different. It's deterministic. Therefore, when the second part of the ISNULL function is a constant or a literal, the resulting value will be NOT NULL.
198
198
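As a minimal sketch of the ISNULL note above, the following CTAS builds the same expression once with ISNULL and once with COALESCE, then reads the nullability of each resulting column from `sys.columns`. The table name `dbo.ctas_nullability_demo`, the column aliases, and the variable values are placeholders chosen for illustration and are assumptions, not part of the article's own examples.

```sql
-- Hypothetical inputs; any DECIMAL/FLOAT pair would do for this sketch.
DECLARE @d DECIMAL(7,2) = 85.455;
DECLARE @f FLOAT(24)    = 85.455;

-- Hypothetical demo table: one column per function so they can be compared.
CREATE TABLE dbo.ctas_nullability_demo
WITH (DISTRIBUTION = ROUND_ROBIN)
AS
SELECT
    ISNULL(CAST(@d * @f AS DECIMAL(7,2)), 0)   AS result_isnull,
    COALESCE(CAST(@d * @f AS DECIMAL(7,2)), 0) AS result_coalesce;

-- Inspect the nullability that CTAS inferred for each column.
SELECT c.name, c.is_nullable
FROM sys.columns AS c
WHERE c.object_id = OBJECT_ID('dbo.ctas_nullability_demo');
```

If the behavior matches the note, `result_isnull` should be reported as NOT NULL and `result_coalesce` as NULLable.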
199
199
Ensuring the integrity of your calculations is also important for table partition switching. Imagine you have this table defined as a fact table:
200
200
@@ -270,6 +270,6 @@ You can see that type consistency and maintaining nullability properties on a CT
270
270
271
271
CTAS is one of the most important statements in Synapse SQL. Make sure you thoroughly understand it. See the [CTAS documentation](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true).
272
272
273
-
## Next steps
273
+
## Related content
274
274
275
275
For more development tips, see the [development overview](sql-data-warehouse-overview-develop.md).
articles/synapse-analytics/sql-data-warehouse/sql-data-warehouse-manage-monitor.md
Lines changed: 14 additions & 14 deletions
@@ -4,10 +4,10 @@ description: Learn how to monitor your Azure Synapse Analytics dedicated SQL poo
4
4
author: WilliamDAssafMSFT
5
5
ms.author: wiassaf
6
6
ms.reviewer: kecona
7
-
ms.date: 11/09/2022
7
+
ms.date: 01/22/2025
8
8
ms.service: azure-synapse-analytics
9
9
ms.subservice: sql-dw
10
-
ms.topic: conceptual
10
+
ms.topic: concept-article
11
11
ms.custom: synapse-analytics
12
12
---
13
13
@@ -25,7 +25,7 @@ GRANT VIEW DATABASE STATE TO myuser;
25
25
26
26
## Monitor connections
27
27
28
-
All logins to your data warehouse are logged to [sys.dm_pdw_exec_sessions](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-exec-sessions-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true). This DMV contains the last 10,000 logins. The `session_id` is the primary key and is assigned sequentially for each new login.
28
+
All logins to your data warehouse are logged to [sys.dm_pdw_exec_sessions](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-exec-sessions-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true). This DMV contains the last 10,000 logins. The `session_id` is the primary key and is assigned sequentially for each new login.
29
29
30
30
```sql
31
31
-- Other Active Connections
@@ -34,10 +34,10 @@ SELECT * FROM sys.dm_pdw_exec_sessions where status <> 'Closed' and session_id <
34
34
35
35
## Monitor query execution
36
36
37
-
All queries executed on SQL pool are logged to [sys.dm_pdw_exec_requests](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-exec-requests-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true). This DMV contains the last 10,000 queries executed. The `request_id` uniquely identifies each query and is the primary key for this DMV. The `request_id` is assigned sequentially for each new query and is prefixed with QID, which stands for query ID. Querying this DMV for a given `session_id` shows all queries for a given login.
37
+
All queries executed on SQL pool are logged to [sys.dm_pdw_exec_requests](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-exec-requests-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true). This DMV contains the last 10,000 queries executed. The `request_id` uniquely identifies each query and is the primary key for this DMV. The `request_id` is assigned sequentially for each new query and is prefixed with QID, which stands for query ID. Querying this DMV for a given `session_id` shows all queries for a given login.
38
38
39
39
> [!NOTE]
40
-
> Stored procedures use multiple Request IDs. Request IDs are assigned in sequential order.
40
+
> Stored procedures use multiple Request IDs. Request IDs are assigned in sequential order.
41
41
42
42
Here are steps to follow to investigate query execution plans and times for a particular query.
43
43
@@ -59,7 +59,7 @@ ORDER BY total_elapsed_time DESC;
59
59
60
60
From the preceding query results, **note the Request ID** of the query that you would like to investigate.
61
61
62
-
Queries in the **Suspended** state can be queued due to a large number of active running queries. These queries also appear in the [sys.dm_pdw_waits](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-waits-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true). In that case, look for waits such as UserConcurrencyResourceType. For information on concurrency limits, see [Memory and concurrency limits](memory-concurrency-limits.md) or [Resource classes for workload management](resource-classes-for-workload-management.md). Queries can also wait for other reasons such as for object locks. If your query is waiting for a resource, see [Investigating queries waiting for resources](#monitor-waiting-queries) further down in this article.
62
+
Queries in the **Suspended** state can be queued due to a large number of active running queries. These queries also appear in the [sys.dm_pdw_waits](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-waits-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true). In that case, look for waits such as UserConcurrencyResourceType. For information on concurrency limits, see [Memory and concurrency limits](memory-concurrency-limits.md) or [Resource classes for workload management](resource-classes-for-workload-management.md). Queries can also wait for other reasons such as for object locks. If your query is waiting for a resource, see [Investigating queries waiting for resources](#monitor-waiting-queries) further down in this article.
63
63
64
64
To simplify the lookup of a query in the [sys.dm_pdw_exec_requests](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-exec-requests-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) table, use [LABEL](/sql/t-sql/queries/option-clause-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) to assign a comment to your query, which can be looked up in the `sys.dm_pdw_exec_requests` view.
65
65
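A minimal sketch of the LABEL lookup described above: the first statement tags a query, and the second finds it again in `sys.dm_pdw_exec_requests`. The label text `'My Query'` and the choice of `sys.tables` as the queried object are arbitrary placeholders.

```sql
-- Tag the query with a label (the label text is a placeholder).
SELECT *
FROM sys.tables
OPTION (LABEL = 'My Query');

-- Find the labeled query in the requests DMV.
SELECT *
FROM sys.dm_pdw_exec_requests
WHERE [label] = 'My Query';
```

A label that is unique per workload or per job step tends to make the lookup easier than searching by query text.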
@@ -90,7 +90,7 @@ WHERE request_id = 'QID####'
90
90
ORDER BY step_index;
91
91
```
92
92
93
-
When a DSQL plan is taking longer than expected, the cause can be a complex plan with many DSQL steps or just one step taking a long time. If the plan is many steps with several move operations, consider optimizing your table distributions to reduce data movement. The [Table distribution](sql-data-warehouse-tables-distribute.md) article explains why data must be moved to solve a query. The article also explains some distribution strategies to minimize data movement.
93
+
When a DSQL plan is taking longer than expected, the cause can be a complex plan with many DSQL steps or just one step taking a long time. If the plan is many steps with several move operations, consider optimizing your table distributions to reduce data movement. The [Table distribution](sql-data-warehouse-tables-distribute.md) article explains why data must be moved to solve a query. The article also explains some distribution strategies to minimize data movement.
94
94
95
95
To investigate further details about a single step, inspect the `operation_type` column of the long-running query step and note the **Step Index**:
149
-
If you discover that your query is not making progress because it is waiting for a resource, here is a query that shows all the resources a query is waiting for.
149
+
If you discover that your query isn't making progress because it's waiting for a resource, here's a query that shows all the resources a query is waiting for.
150
150
151
151
```sql
152
152
-- Find queries
@@ -168,11 +168,11 @@ WHERE waits.request_id = 'QID####'
168
168
ORDER BY waits.object_name, waits.object_type, waits.state;
169
169
```
170
170
171
-
If the query is actively waiting on resources from another query, then the state will be **AcquireResources**. If the query has all the required resources, then the state will be **Granted**.
171
+
If the query is actively waiting on resources from another query, then the state will be **AcquireResources**. If the query has all the required resources, then the state will be **Granted**.
172
172
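As a hedged sketch of how the two states described above might be used, the following query lists only the waits that are still acquiring resources. The state string is taken from the paragraph above; the column list and ordering are one possible choice, not the article's own query.

```sql
-- List waits that have not yet been granted their resources.
SELECT waits.session_id,
       waits.request_id,
       waits.type,
       waits.object_type,
       waits.object_name,
       waits.state,
       waits.request_time
FROM sys.dm_pdw_waits AS waits
WHERE waits.state = 'AcquireResources'  -- state name as given in the text above
ORDER BY waits.request_time;
```

Changing the filter to `'Granted'` would instead show the waits whose resources have already been acquired.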
173
173
## Monitor tempdb
174
174
175
-
The `tempdb` database is used to hold intermediate results during query execution. High utilization of the `tempdb` database can lead to slow query performance. For every DW100c configured, 399 GB of `tempdb` space is allocated (DW1000c would have 3.99 TB of total `tempdb` space). Below are tips for monitoring `tempdb` usage and for decreasing `tempdb` usage in your queries.
175
+
The `tempdb` database is used to hold intermediate results during query execution. High utilization of the `tempdb` database can lead to slow query performance. For every DW100c configured, 399 GB of `tempdb` space is allocated (DW1000c would have 3.99 TB of total `tempdb` space). Below are tips for monitoring `tempdb` usage and for decreasing `tempdb` usage in your queries.
176
176
177
177
### Monitor tempdb with views
178
178
@@ -215,11 +215,11 @@ ORDER BY sr.request_id;
215
215
> Use [Azure Synapse SQL Distribution Advisor](../sql/distribution-advisor.md) to get recommendations on the distribution method suited for your workloads.
216
216
> Use the [Azure Synapse Toolkit](https://github.com/microsoft/Azure_Synapse_Toolbox/tree/master/TSQL_Queries/TempDB) to monitor `tempdb` using T-SQL queries.
217
217
218
-
If you have a query that is consuming a large amount of memory or have received an error message related to the allocation of `tempdb`, it could be due to a very large [CREATE TABLE AS SELECT (CTAS)](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse) or [INSERT SELECT](/sql/t-sql/statements/insert-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) statement running that is failing in the final data movement operation. This can usually be identified as a ShuffleMove operation in the distributed query plan right before the final INSERT SELECT. Use [sys.dm_pdw_request_steps](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-request-steps-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) to monitor ShuffleMove operations.
218
+
If you have a query that is consuming a large amount of memory or have received an error message related to the allocation of `tempdb`, it could be due to a very large [CREATE TABLE AS SELECT (CTAS)](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse) or [INSERT SELECT](/sql/t-sql/statements/insert-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) statement running that is failing in the final data movement operation. This can usually be identified as a ShuffleMove operation in the distributed query plan right before the final INSERT SELECT. Use [sys.dm_pdw_request_steps](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-request-steps-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) to monitor ShuffleMove operations.
219
219
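A minimal sketch of checking for ShuffleMove steps in a single request, assuming the `QID####` placeholder is replaced with a real request ID as in the earlier queries. The column selection is illustrative.

```sql
-- Show the ShuffleMove steps of one request, in plan order.
SELECT request_id,
       step_index,
       operation_type,
       status,
       row_count,
       total_elapsed_time
FROM sys.dm_pdw_request_steps
WHERE request_id = 'QID####'                      -- placeholder request ID
  AND operation_type = 'ShuffleMoveOperation'
ORDER BY step_index;
```

A ShuffleMove step with a very large `row_count` just before the final insert step is a candidate for the `tempdb` pressure described above.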
220
-
The most common mitigation is to break your CTAS or INSERT SELECT statement into multiple load statements so that the data volume will not exceed the 399 GB per 100DWUc `tempdb` limit. You can also scale your cluster to a larger size to increase how much `tempdb` space you have.
220
+
The most common mitigation is to break your CTAS or INSERT SELECT statement into multiple load statements so that the data volume won't exceed the 399 GB per 100DWUc `tempdb` limit. You can also scale your cluster to a larger size to increase how much `tempdb` space you have.
221
221
222
-
In addition to CTAS and INSERT SELECT statements, large, complex queries running with insufficient memory can spill into `tempdb` causing queries to fail. Consider running with a larger [resource class](resource-classes-for-workload-management.md) to avoid spilling into `tempdb`.
222
+
In addition to CTAS and INSERT SELECT statements, large, complex queries running with insufficient memory can spill into `tempdb` causing queries to fail. Consider running with a larger [resource class](resource-classes-for-workload-management.md) to avoid spilling into `tempdb`.
223
223
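One way to run with a larger resource class is to add the user to the matching database role. This is a sketch only: `largerc` is one of the built-in dynamic resource classes, and `myuser` is the placeholder user name from the permissions example earlier in this article.

```sql
-- Move the user into a larger resource class.
EXEC sp_addrolemember 'largerc', 'myuser';

-- Remove the user from that resource class again once the heavy workload is done.
EXEC sp_droprolemember 'largerc', 'myuser';
```

A larger resource class grants more memory per query at the cost of concurrency, so it's usually reserved for the specific loads or queries that spill.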
224
224
## Monitor memory
225
225
@@ -363,6 +363,6 @@ WHERE waiting.state = 'Queued'
363
363
ORDER BY Lock_Request_Time DESC;
364
364
```
365
365
366
-
## Next steps
366
+
## Related content
367
367
368
368
- For more information about DMVs, see [System views](../sql/reference-tsql-system-views.md).