articles/synapse-analytics/sql-data-warehouse/design-guidance-for-replicated-tables.md
12 additions & 12 deletions
@@ -1,6 +1,6 @@
 ---
 title: Design guidance for replicated tables
-description: Recommendations for designing replicated tables in Synapse SQL
+description: Recommendations for designing replicated tables in Synapse SQL pool
 services: synapse-analytics
 author: XiaoyuMSFT
 manager: craigg
@@ -13,27 +13,27 @@ ms.reviewer: igorstan
 ms.custom: seo-lt-2019, azure-synapse
 ---
 
-# Design guidance for using replicated tables in SQL Analytics
+# Design guidance for using replicated tables in Synapse SQL pool
 
-This article gives recommendations for designing replicated tables in your SQL Analytics schema. Use these recommendations to improve query performance by reducing data movement and query complexity.
+This article gives recommendations for designing replicated tables in your Synapse SQL pool schema. Use these recommendations to improve query performance by reducing data movement and query complexity.
-This article assumes you are familiar with data distribution and data movement concepts in SQL Analytics. For more information, see the [architecture](massively-parallel-processing-mpp-architecture.md) article.
+This article assumes you are familiar with data distribution and data movement concepts in SQL pool. For more information, see the [architecture](massively-parallel-processing-mpp-architecture.md) article.
 
 As part of table design, understand as much as possible about your data and how the data is queried. For example, consider these questions:
 
 - How large is the table?
 - How often is the table refreshed?
-- Do I have fact and dimension tables in a SQL Analytics database?
+- Do I have fact and dimension tables in a SQL pool database?
 
 ## What is a replicated table?
 
 A replicated table has a full copy of the table accessible on each Compute node. Replicating a table removes the need to transfer data among Compute nodes before a join or aggregation. Since the table has multiple copies, replicated tables work best when the table size is less than 2 GB compressed. 2 GB is not a hard limit. If the data is static and does not change, you can replicate larger tables.
 
-The following diagram shows a replicated table that is accessible on each Compute node. In SQL Analytics, the replicated table is fully copied to a distribution database on each Compute node.
+The following diagram shows a replicated table that is accessible on each Compute node. In SQL pool, the replicated table is fully copied to a distribution database on each Compute node.
@@ -47,8 +47,8 @@ Consider using a replicated table when:
 Replicated tables may not yield the best query performance when:
 
 - The table has frequent insert, update, and delete operations. The data manipulation language (DML) operations require a rebuild of the replicated table. Rebuilding frequently can cause slower performance.
-- The SQL Analytics database is scaled frequently. Scaling a SQL Analytics database changes the number of Compute nodes, which incurs rebuilding the replicated table.
-- The table has a large number of columns, but data operations typically access only a small number of columns. In this scenario, instead of replicating the entire table, it might be more effective to distribute the table, and then create an index on the frequently accessed columns. When a query requires data movement, SQL Analytics only moves data for the requested columns.
+- The SQL pool database is scaled frequently. Scaling a SQL pool database changes the number of Compute nodes, which incurs rebuilding the replicated table.
+- The table has a large number of columns, but data operations typically access only a small number of columns. In this scenario, instead of replicating the entire table, it might be more effective to distribute the table, and then create an index on the frequently accessed columns. When a query requires data movement, SQL pool only moves data for the requested columns.
 
 ## Use replicated tables with simple query predicates
 
@@ -119,7 +119,7 @@ We re-created `DimDate` and `DimSalesTerritory` as replicated tables, and ran th
 
 ## Performance considerations for modifying replicated tables
 
-SQL Analytics implements a replicated table by maintaining a master version of the table. It copies the master version to the first distribution database on each Compute node. When there is a change, SQL Analytics first updates the master version, then it rebuilds the tables on each Compute node. A rebuild of a replicated table includes copying the table to each Compute node and then building the indexes. For example, a replicated table on a DW2000c has 5 copies of the data. A master copy and a full copy on each Compute node. All data is stored in distribution databases. SQL Analytics uses this model to support faster data modification statements and flexible scaling operations.
+SQL pool implements a replicated table by maintaining a master version of the table. It copies the master version to the first distribution database on each Compute node. When there is a change, the master version is updated first, then the tables on each Compute node are rebuilt. A rebuild of a replicated table includes copying the table to each Compute node and then building the indexes. For example, a replicated table on a DW2000c has 5 copies of the data. A master copy and a full copy on each Compute node. All data is stored in distribution databases. SQL pool uses this model to support faster data modification statements and flexible scaling operations.
 
 Rebuilds are required after:
 
@@ -136,7 +136,7 @@ The rebuild does not happen immediately after data is modified. Instead, the reb
 
 ### Use indexes conservatively
 
-Standard indexing practices apply to replicated tables. SQL Analytics rebuilds each replicated table index as part of the rebuild. Only use indexes when the performance gain outweighs the cost of rebuilding the indexes.
+Standard indexing practices apply to replicated tables. SQL pool rebuilds each replicated table index as part of the rebuild. Only use indexes when the performance gain outweighs the cost of rebuilding the indexes.
 
 ### Batch data load
 
@@ -188,7 +188,7 @@ SELECT TOP 1 * FROM [ReplicatedTable]
 
 To create a replicated table, use one of these statements: