articles/sql-data-warehouse/design-guidance-for-replicated-tables.md
Lines changed: 14 additions & 13 deletions
Original file line number
Diff line number
Diff line change
@@ -1,6 +1,6 @@
1
1
---
2
2
title: Design guidance for replicated tables
3
-
description: Recommendations for designing replicated tables in your Azure SQL Data Warehouse schema.
3
+
description: Recommendations for designing replicated tables in SQL Analytics
4
4
services: sql-data-warehouse
5
5
author: XiaoyuMSFT
6
6
manager: craigg
@@ -11,26 +11,27 @@ ms.date: 03/19/2019
11
11
ms.author: xiaoyul
12
12
ms.reviewer: igorstan
13
13
ms.custom: seo-lt-2019
14
+
ms.custom: azure-synapse
14
15
---
15
16
16
-
# Design guidance for using replicated tables in Azure SQL Data Warehouse
17
-
This article gives recommendations for designing replicated tables in your SQL Data Warehouse schema. Use these recommendations to improve query performance by reducing data movement and query complexity.
17
+
# Design guidance for using replicated tables in SQL Analytics
18
+
This article gives recommendations for designing replicated tables in your SQL Analytics schema. Use these recommendations to improve query performance by reducing data movement and query complexity.
This article assumes you are familiar with data distribution and data movement concepts in SQL Data Warehouse. For more information, see the [architecture](massively-parallel-processing-mpp-architecture.md) article.
23
+
This article assumes you are familiar with data distribution and data movement concepts in SQL Analytics. For more information, see the [architecture](massively-parallel-processing-mpp-architecture.md) article.
23
24
24
25
As part of table design, understand as much as possible about your data and how the data is queried. For example, consider these questions:
25
26
26
27
- How large is the table?
27
28
- How often is the table refreshed?
28
-
- Do I have fact and dimension tables in a data warehouse?
29
+
- Do I have fact and dimension tables in a SQL Analytics database?
29
30
30
31
## What is a replicated table?
31
32
A replicated table has a full copy of the table accessible on each Compute node. Replicating a table removes the need to transfer data among Compute nodes before a join or aggregation. Since the table has multiple copies, replicated tables work best when the table size is less than 2 GB compressed. 2 GB is not a hard limit. If the data is static and does not change, you can replicate larger tables.
32
33
33
-
The following diagram shows a replicated table that is accessible on each Compute node. In SQL Data Warehouse, the replicated table is fully copied to a distribution database on each Compute node.
34
+
The following diagram shows a replicated table that is accessible on each Compute node. In SQL Analytics, the replicated table is fully copied to a distribution database on each Compute node.
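As a minimal sketch of the definition above, the following T-SQL creates a small dimension table as a replicated table. The table and column names are hypothetical and chosen only for illustration; the relevant part is the `DISTRIBUTION = REPLICATE` option.

```sql
-- Hypothetical dimension table; names and columns are illustrative only.
CREATE TABLE dbo.DimRegion_Example
(
    RegionKey   INT          NOT NULL,
    RegionName  NVARCHAR(50) NOT NULL
)
WITH
(
    DISTRIBUTION = REPLICATE,     -- a full copy is made accessible on each Compute node
    CLUSTERED INDEX (RegionKey)   -- one index only, since each index is rebuilt with the table
);
```

A small, rarely refreshed table like this one is the kind of candidate the size guidance above has in mind.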
@@ -44,8 +45,8 @@ Consider using a replicated table when:
44
45
Replicated tables may not yield the best query performance when:
45
46
46
47
- The table has frequent insert, update, and delete operations. These data manipulation language (DML) operations require a rebuild of the replicated table. Rebuilding frequently can cause slower performance.
47
-
- The data warehouse is scaled frequently. Scaling a data warehouse changes the number of Compute nodes, which incurs rebuilding the replicated table.
48
-
- The table has a large number of columns, but data operations typically access only a small number of columns. In this scenario, instead of replicating the entire table, it might be more effective to distribute the table, and then create an index on the frequently accessed columns. When a query requires data movement, SQL Data Warehouse only moves data for the requested columns.
48
+
- The SQL Analytics database is scaled frequently. Scaling a SQL Analytics database changes the number of Compute nodes, which incurs rebuilding the replicated table.
49
+
- The table has a large number of columns, but data operations typically access only a small number of columns. In this scenario, instead of replicating the entire table, it might be more effective to distribute the table, and then create an index on the frequently accessed columns. When a query requires data movement, SQL Analytics only moves data for the requested columns.
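The last scenario above, a wide table where queries touch only a few columns, might be sketched as follows. This is an illustration under assumptions, not a recommendation for a specific schema: the table, columns, distribution column, and index name are all hypothetical.

```sql
-- Hypothetical wide table, distributed instead of replicated.
CREATE TABLE dbo.CustomerProfile_Example
(
    CustomerKey INT          NOT NULL,
    Region      NVARCHAR(50) NOT NULL,
    Segment     NVARCHAR(50) NOT NULL
    -- a real wide table would have many more columns here
)
WITH
(
    DISTRIBUTION = HASH(CustomerKey),   -- assumed distribution column
    CLUSTERED INDEX (CustomerKey)
);

-- Index only the columns that queries access most often (assumed here).
CREATE INDEX IX_CustomerProfile_Example_Region_Segment
    ON dbo.CustomerProfile_Example (Region, Segment);
```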
49
50
50
51
## Use replicated tables with simple query predicates
51
52
Before you choose to distribute or replicate a table, think about the types of queries you plan to run against the table. Whenever possible,
@@ -113,11 +114,11 @@ We re-created `DimDate` and `DimSalesTerritory` as replicated tables, and ran th
113
114
114
115
115
116
## Performance considerations for modifying replicated tables
116
-
SQL Data Warehouse implements a replicated table by maintaining a master version of the table. It copies the master version to one distribution database on each Compute node. When there is a change, SQL Data Warehouse first updates the master table. Then it rebuilds the tables on each Compute node. A rebuild of a replicated table includes copying the table to each Compute node and then building the indexes. For example, a replicated table on a DW400 has 5 copies of the data. A master copy and a full copy on each Compute node. All data is stored in distribution databases. SQL Data Warehouse uses this model to support faster data modification statements and flexible scaling operations.
117
+
SQL Analytics implements a replicated table by maintaining a master version of the table. It copies the master version to one distribution database on each Compute node. When there is a change, SQL Analytics first updates the master table. Then it rebuilds the tables on each Compute node. A rebuild of a replicated table includes copying the table to each Compute node and then building the indexes. For example, a replicated table on a DW400 has 5 copies of the data. A master copy and a full copy on each Compute node. All data is stored in distribution databases. SQL Analytics uses this model to support faster data modification statements and flexible scaling operations.
117
118
118
119
Rebuilds are required after:
119
120
- Data is loaded or modified
120
-
- The data warehouse is scaled to a different level
121
+
- The SQL Analytics instance is scaled to a different level
121
122
- Table definition is updated
122
123
123
124
Rebuilds are not required after:
@@ -127,7 +128,7 @@ Rebuilds are not required after:
127
128
The rebuild does not happen immediately after data is modified. Instead, the rebuild is triggered the first time a query selects from the table. The query that triggered the rebuild reads immediately from the master version of the table while the data is asynchronously copied to each Compute node. Until the data copy is complete, subsequent queries will continue to use the master version of the table. If any activity happens against the replicated table that forces another rebuild, the data copy is invalidated and the next select statement will trigger data to be copied again.
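Because the rebuild is deferred until the first select, it can be useful to see which replicated tables have been modified but not yet rebuilt. The query below is a sketch that assumes the `sys.pdw_replicated_table_cache_state` and `sys.pdw_table_distribution_properties` catalog views are available in your database; verify the view and column names against your environment before relying on it.

```sql
-- List replicated tables whose per-node copies are not yet rebuilt.
SELECT t.[name] AS ReplicatedTable
FROM sys.tables AS t
JOIN sys.pdw_replicated_table_cache_state AS c
    ON c.object_id = t.object_id
JOIN sys.pdw_table_distribution_properties AS p
    ON p.object_id = t.object_id
WHERE c.[state] = 'NotReady'                        -- copy pending on the Compute nodes
  AND p.[distribution_policy_desc] = 'REPLICATE';   -- replicated tables only
```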
128
129
129
130
### Use indexes conservatively
130
-
Standard indexing practices apply to replicated tables. SQL Data Warehouse rebuilds each replicated table index as part of the rebuild. Only use indexes when the performance gain outweighs the cost of rebuilding the indexes.
131
+
Standard indexing practices apply to replicated tables. SQL Analytics rebuilds each replicated table index as part of the rebuild. Only use indexes when the performance gain outweighs the cost of rebuilding the indexes.
131
132
132
133
### Batch data loads
133
134
When loading data into replicated tables, try to minimize rebuilds by batching loads together. Perform all the batched loads before running select statements.
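The batching advice above can be sketched as follows. The target table and the staging tables are hypothetical and assumed to exist already; the point is only the ordering, with every load completed before the first select.

```sql
-- Hypothetical replicated table and staging sources; run all loads first.
INSERT INTO dbo.DimProduct_Example
SELECT ProductKey, ProductName FROM stg.Product_Batch1;

INSERT INTO dbo.DimProduct_Example
SELECT ProductKey, ProductName FROM stg.Product_Batch2;

-- Only now select from the table, so the loads above share a single rebuild.
SELECT TOP 1 * FROM dbo.DimProduct_Example;
```

Selecting between the two inserts would instead start a data copy that the second insert then invalidates.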
@@ -177,8 +178,8 @@ SELECT TOP 1 * FROM [ReplicatedTable]
177
178
## Next steps
178
179
To create a replicated table, use one of these statements:
179
180
180
-
-[CREATE TABLE (Azure SQL Data Warehouse)](/sql/t-sql/statements/create-table-azure-sql-data-warehouse)
181
-
-[CREATE TABLE AS SELECT (Azure SQL Data Warehouse)](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse)