articles/synapse-analytics/sql-data-warehouse/sql-data-warehouse-tables-overview.md
Lines changed: 18 additions & 18 deletions
@@ -3,10 +3,10 @@ title: Designing tables
3
3
description: Introduction to designing tables using dedicated SQL pool.
4
4
author: WilliamDAssafMSFT
5
5
ms.author: wiassaf
6
-
ms.date: 07/05/2023
6
+
ms.date: 01/22/2025
7
7
ms.service: azure-synapse-analytics
8
8
ms.subservice: sql-dw
9
-
ms.topic: conceptual
9
+
ms.topic: concept-article
10
10
ms.custom: azure-synapse
11
11
---
12
12
@@ -26,15 +26,15 @@ A [star schema](https://en.wikipedia.org/wiki/Star_schema) organizes data into f
26
26
27
27
## Schema and table names
28
28
29
-
Schemas are a good way to group tables, used in a similar fashion, together. If you're migrating multiple databases from an on-prem solution to a dedicated SQL pool, it works best to migrate all of the fact, dimension, and integration tables to one schema in a dedicated SQL pool.
29
+
Schemas are a good way to group tables, used in a similar fashion, together. If you're migrating multiple databases from an on-premises solution to a dedicated SQL pool, it works best to migrate all of the fact, dimension, and integration tables to one schema in a dedicated SQL pool.
30
30
31
31
For example, you could store all the tables in the [WideWorldImportersDW](/sql/sample/world-wide-importers/database-catalog-wwi-olap?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) sample dedicated SQL pool within one schema called `wwi`. The following code creates a [user-defined schema](/sql/t-sql/statements/create-schema-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) called `wwi`.
32
32
33
33
```sql
34
34
CREATE SCHEMA wwi;
35
35
```
36
36
37
-
To show the organization of the tables in dedicated SQL pool, you could use fact, dim, and int as prefixes to the table names. The following table shows some of the schema and table names for `WideWorldImportersDW`.
37
+
To show the organization of the tables in dedicated SQL pool, you could use fact, dim, and int as prefixes to the table names. The following table shows some of the schema and table names for `WideWorldImportersDW`.
@@ -47,15 +47,15 @@ Tables store data either permanently in Azure Storage, temporarily in Azure Stor
47
47
48
48
### Regular table
49
49
50
-
A regular table stores data in Azure Storage as part of dedicated SQL pool. The table and the data persist regardless of whether a session is open. The following example creates a regular table with two columns.
50
+
A regular table stores data in Azure Storage as part of dedicated SQL pool. The table and the data persist regardless of whether a session is open. The following example creates a regular table with two columns.
51
51
52
52
```sql
53
53
CREATE TABLE MyTable (col1 int, col2 int );
54
54
```
55
55
56
56
### Temporary table
57
57
58
-
A temporary table only exists for the duration of the session. You can use a temporary table to prevent other users from seeing temporary results and also to reduce the need for cleanup.
58
+
A temporary table only exists for the duration of the session. You can use a temporary table to prevent other users from seeing temporary results and also to reduce the need for cleanup.
59
59
60
60
Temporary tables utilize local storage to offer fast performance. For more information, see [Temporary tables](sql-data-warehouse-tables-temporary.md).
61
61
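As a minimal sketch of the temporary table described above: the `#` prefix makes the table session-scoped. The table name, columns, and `WITH` options here are hypothetical and chosen only for illustration.

```sql
-- Hypothetical temporary table; visible only to the current session.
CREATE TABLE #RecentOrders
(
    OrderId    int            NOT NULL,
    OrderTotal decimal(18, 2) NOT NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);

-- Optional explicit cleanup; the table is also removed when the session ends.
DROP TABLE #RecentOrders;
```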
@@ -71,7 +71,7 @@ Dedicated SQL pool supports the most commonly used data types. For a list of the
71
71
72
72
## Distributed tables
73
73
74
-
A fundamental feature of dedicated SQL pool is the way it can store and operate on tables across [distributions](massively-parallel-processing-mpp-architecture.md#distributions). Dedicated SQL pool supports three methods for distributing data: round-robin (default), hash and replicated.
74
+
A fundamental feature of dedicated SQL pool is the way it can store and operate on tables across [distributions](massively-parallel-processing-mpp-architecture.md#distributions). Dedicated SQL pool supports three methods for distributing data: round-robin (default), hash and replicated.
75
75
76
76
### Hash-distributed tables
77
77
@@ -87,7 +87,7 @@ For more information, see [Design guidance for replicated tables](design-guidanc
87
87
88
88
### Round-robin tables
89
89
90
-
A round-robin table distributes table rows evenly across all distributions. The rows are distributed randomly. Loading data into a round-robin table is fast. Keep in mind that queries can require more data movement than the other distribution methods.
90
+
A round-robin table distributes table rows evenly across all distributions. The rows are distributed randomly. Loading data into a round-robin table is fast. Keep in mind that queries can require more data movement than the other distribution methods.
91
91
92
92
For more information, see [Design guidance for distributed tables](sql-data-warehouse-tables-distribute.md).
93
93
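The three distribution methods can be sketched side by side. All table and column names below are hypothetical, and which method suits a given table depends on its size and how it is joined.

```sql
-- Hash-distributed: rows are assigned to distributions by hashing CustomerKey.
CREATE TABLE dbo.FactSale_Example
(
    SaleKey     bigint NOT NULL,
    CustomerKey int    NOT NULL,
    Quantity    int    NOT NULL
)
WITH (DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX);

-- Replicated: a full copy of the table is made available on each compute node.
CREATE TABLE dbo.DimCity_Example
(
    CityKey  int           NOT NULL,
    CityName nvarchar(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE, CLUSTERED COLUMNSTORE INDEX);

-- Round-robin (the default): rows are spread evenly with no distribution column.
CREATE TABLE dbo.StageSale_Example
(
    SaleKey     bigint NOT NULL,
    CustomerKey int    NOT NULL,
    Quantity    int    NOT NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN);
```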
@@ -106,17 +106,17 @@ The table category often determines which option to choose for distributing the
106
106
107
107
## Table partitions
108
108
109
-
A partitioned table stores and performs operations on the table rows according to data ranges. For example, a table could be partitioned by day, month, or year. You can improve query performance through partition elimination, which limits a query scan to data within a partition. You can also maintain the data through partition switching. Since the data in SQL pool is already distributed, too many partitions can slow query performance. For more information, see [Partitioning guidance](sql-data-warehouse-tables-partition.md). When partition switching into table partitions that are not empty, consider using the TRUNCATE_TARGET option in your [ALTER TABLE](/sql/t-sql/statements/alter-table-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) statement if the existing data is to be truncated. The below code switches in the transformed daily data into the SalesFact overwriting any existing data.
109
+
A partitioned table stores and performs operations on the table rows according to data ranges. For example, a table could be partitioned by day, month, or year. You can improve query performance through partition elimination, which limits a query scan to data within a partition. You can also maintain the data through partition switching. Since the data in SQL pool is already distributed, too many partitions can slow query performance. For more information, see [Partitioning guidance](sql-data-warehouse-tables-partition.md). When partition switching into table partitions that are not empty, consider using the TRUNCATE_TARGET option in your [ALTER TABLE](/sql/t-sql/statements/alter-table-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true) statement if the existing data is to be truncated. The below code switches in the transformed daily data into the SalesFact overwriting any existing data.
110
110
111
111
```sql
112
112
ALTER TABLE SalesFact_DailyFinalLoad SWITCH PARTITION 256 TO SalesFact PARTITION 256 WITH (TRUNCATE_TARGET = ON);
113
113
```
114
114
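For context, a partitioned table could be declared along the following lines. This is only a sketch: the table name, columns, and boundary values are assumptions, not taken from the sample database.

```sql
-- Hypothetical fact table partitioned by month on an integer date key.
CREATE TABLE dbo.SalesFact_Example
(
    OrderDateKey int            NOT NULL,
    CustomerKey  int            NOT NULL,
    SalesAmount  decimal(18, 2) NOT NULL
)
WITH
(
    DISTRIBUTION = HASH(CustomerKey),
    CLUSTERED COLUMNSTORE INDEX,
    -- RANGE RIGHT: each boundary value belongs to the partition on its right.
    PARTITION (OrderDateKey RANGE RIGHT FOR VALUES (20240101, 20240201, 20240301))
);
```

A query that filters on `OrderDateKey` can then skip partitions outside the requested range.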
115
115
## Columnstore indexes
116
116
117
-
By default, dedicated SQL pool stores a table as a clustered columnstore index. This form of data storage achieves high data compression and query performance on large tables.
117
+
By default, dedicated SQL pool stores a table as a clustered columnstore index. This form of data storage achieves high data compression and query performance on large tables.
118
118
119
-
The clustered columnstore index is usually the best choice, but in some cases a clustered index or a heap is the appropriate storage structure.
119
+
The clustered columnstore index is usually the best choice, but in some cases a clustered index or a heap is the appropriate storage structure.
120
120
121
121
> [!TIP]
122
122
> A heap table can be especially useful for loading transient data, such as a staging table which is transformed into a final table.
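
As an illustration of the tip above, a staging table might be declared as a heap instead of taking the default clustered columnstore index. The table and column names are hypothetical.

```sql
-- Hypothetical staging table stored as a heap for fast, transient loads.
CREATE TABLE dbo.StageInternetSales_Example
(
    SalesOrderNumber nvarchar(20)   NOT NULL,
    OrderDateKey     int            NOT NULL,
    SalesAmount      decimal(18, 2) NOT NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN, HEAP);
```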
@@ -127,13 +127,13 @@ For a list of columnstore features, see [What's new for columnstore indexes](/sq
127
127
128
128
The query optimizer uses column-level statistics when it creates the plan for executing a query.
129
129
130
-
To improve query performance, it's important to have statistics on individual columns, especially columns used in query joins. [Creating statistics](sql-data-warehouse-tables-statistics.md#automatic-creation-of-statistic) happens automatically.
130
+
To improve query performance, it's important to have statistics on individual columns, especially columns used in query joins. [Creating statistics](sql-data-warehouse-tables-statistics.md#automatic-creation-of-statistic) happens automatically.
131
131
132
132
Updating statistics doesn't happen automatically. Update statistics after a significant number of rows are added or changed. For example, update statistics after a load. For more information, see [Statistics guidance](sql-data-warehouse-tables-statistics.md).
133
133
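A minimal sketch of creating and refreshing statistics by hand, assuming a hypothetical table `dbo.FactOrders_Example` and a hypothetical statistics name:

```sql
-- Hypothetical table used only for this example.
CREATE TABLE dbo.FactOrders_Example
(
    OrderKey    bigint NOT NULL,
    CustomerKey int    NOT NULL
)
WITH (DISTRIBUTION = HASH(CustomerKey));

-- Create single-column statistics on a column commonly used in joins.
CREATE STATISTICS stats_FactOrders_CustomerKey
    ON dbo.FactOrders_Example (CustomerKey);

-- After a load, refresh one statistics object...
UPDATE STATISTICS dbo.FactOrders_Example (stats_FactOrders_CustomerKey);

-- ...or refresh all statistics on the table.
UPDATE STATISTICS dbo.FactOrders_Example;
```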
134
134
## Primary key and unique key
135
135
136
-
PRIMARY KEY is only supported when NONCLUSTERED and NOT ENFORCED are both used. UNIQUE constraint is only supported with NOT ENFORCED is used. Check [Dedicated SQL pool table constraints](sql-data-warehouse-table-constraints.md).
136
+
PRIMARY KEY is only supported when NONCLUSTERED and NOT ENFORCED are both used. UNIQUE constraint is only supported with NOT ENFORCED is used. Check [Dedicated SQL pool table constraints](sql-data-warehouse-table-constraints.md).
137
137
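The constraint rules above could look like the following in practice. The table, column, and constraint names are hypothetical. Because the constraints are not enforced, keeping the values unique remains the loader's responsibility.

```sql
-- Hypothetical dimension table with an unenforced primary key.
CREATE TABLE dbo.DimCustomer_Example
(
    CustomerKey  int           NOT NULL PRIMARY KEY NONCLUSTERED NOT ENFORCED,
    CustomerCode nvarchar(20)  NOT NULL,
    CustomerName nvarchar(100) NOT NULL
)
WITH (DISTRIBUTION = REPLICATE);

-- A UNIQUE constraint must also be declared NOT ENFORCED.
ALTER TABLE dbo.DimCustomer_Example
    ADD CONSTRAINT UQ_DimCustomer_Example_Code UNIQUE (CustomerCode) NOT ENFORCED;
```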
138
138
## Commands for creating tables
139
139
@@ -144,7 +144,7 @@ You can create a table as a new empty table. You can also create and populate a
144
144
|[CREATE TABLE](/sql/t-sql/statements/create-table-azure-sql-data-warehouse?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true)| Creates an empty table by defining all the table columns and options. |
145
145
|[CREATE EXTERNAL TABLE](/sql/t-sql/statements/create-external-table-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true)| Creates an external table. The definition of the table is stored in dedicated SQL pool. The table data is stored in Azure Blob storage or Azure Data Lake Store. |
146
146
|[CREATE TABLE AS SELECT](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true)| Populates a new table with the results of a select statement. The table columns and data types are based on the select statement results. To import data, this statement can select from an external table. |
147
-
|[CREATE EXTERNAL TABLE AS SELECT](/sql/t-sql/statements/create-external-table-as-select-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true)| Creates a new external table by exporting the results of a select statement to an external location. The location is either Azure Blob storage or Azure Data Lake Store. |
147
+
|[CREATE EXTERNAL TABLE AS SELECT](/sql/t-sql/statements/create-external-table-as-select-transact-sql?toc=/azure/synapse-analytics/sql-data-warehouse/toc.json&bc=/azure/synapse-analytics/sql-data-warehouse/breadcrumb/toc.json&view=azure-sqldw-latest&preserve-view=true)| Creates a new external table by exporting the results of a select statement to an external location. The location is either Azure Blob storage or Azure Data Lake Store. |
148
148
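As a sketch of CREATE TABLE AS SELECT from the table above, the following creates a source table and then builds a new, differently distributed table from a query over it. All object names are hypothetical.

```sql
-- Hypothetical source table.
CREATE TABLE dbo.SaleLines_Example
(
    CustomerKey int NOT NULL,
    Quantity    int NOT NULL
)
WITH (DISTRIBUTION = ROUND_ROBIN);

-- CTAS: column names and data types come from the SELECT list.
CREATE TABLE dbo.SaleTotals_Example
WITH (DISTRIBUTION = HASH(CustomerKey), CLUSTERED COLUMNSTORE INDEX)
AS
SELECT CustomerKey,
       SUM(Quantity) AS TotalQuantity
FROM dbo.SaleLines_Example
GROUP BY CustomerKey;
```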
149
149
## Aligning source data with dedicated SQL pool
150
150
@@ -154,7 +154,7 @@ If data is coming from multiple data stores, you load the data into the dedicate
154
154
155
155
## Unsupported table features
156
156
157
-
Dedicated SQL pool supports many, but not all, of the table features offered by other databases. The following list shows some of the table features that aren't supported in dedicated SQL pool:
157
+
Dedicated SQL pool supports many, but not all, of the table features offered by other databases. The following list shows some of the table features that aren't supported in dedicated SQL pool:
@@ -178,7 +178,7 @@ One simple way to identify space and rows consumed by a table in each of the 60
178
178
DBCC PDW_SHOWSPACEUSED('dbo.FactInternetSales');
179
179
```
180
180
181
-
However, using DBCC commands can be quite limiting. Dynamic management views (DMVs) show more detail than DBCC commands. Start by creating this view:
181
+
However, using DBCC commands can be quite limiting. Dynamic management views (DMVs) show more detail than DBCC commands. Start by creating this view:
182
182
183
183
```sql
184
184
CREATE VIEW dbo.vTableSizes
@@ -296,7 +296,7 @@ FROM size
296
296
297
297
### Table space summary
298
298
299
-
This query returns the rows and space by table. It allows you to see which tables are your largest tables and whether they are round-robin, replicated, or hash-distributed. For hash-distributed tables, the query shows the distribution column.
299
+
This query returns the rows and space by table. It allows you to see which tables are your largest tables and whether they are round-robin, replicated, or hash-distributed. For hash-distributed tables, the query shows the distribution column.
300
300
301
301
```sql
302
302
SELECT
@@ -372,7 +372,7 @@ ORDER BY distribution_id
372
372
;
373
373
```
374
374
375
-
## Next steps
375
+
## Related content
376
376
377
377
After creating the tables for your dedicated SQL pool, the next step is to load data into the table. For a loading tutorial, see [Loading data to dedicated SQL pool](load-data-wideworldimportersdw.md) and review [Data loading strategies for dedicated SQL pool in Azure Synapse Analytics](design-elt-data-loading.md).