articles/synapse-analytics/sql-data-warehouse/design-guidance-for-replicated-tables.md
Lines changed: 26 additions & 30 deletions
Original file line number
Diff line number
Diff line change
@@ -41,9 +41,9 @@ Consider using a replicated table when:
41
41
- The table size on disk is less than 2 GB, regardless of the number of rows. To find the size of a table, you can use the [DBCC PDW_SHOWSPACEUSED](https://docs.microsoft.com/sql/t-sql/database-console-commands/dbcc-pdw-showspaceused-transact-sql) command: `DBCC PDW_SHOWSPACEUSED('ReplTableCandidate')`.
42
42
- The table is used in joins that would otherwise require data movement. When joining tables that are not distributed on the same column, such as a hash-distributed table to a round-robin table, data movement is required to complete the query. If one of the tables is small, consider a replicated table. We recommend using replicated tables instead of round-robin tables in most cases. To view data movement operations in query plans, use [sys.dm_pdw_request_steps](https://docs.microsoft.com/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-request-steps-transact-sql). The BroadcastMoveOperation is the typical data movement operation that can be eliminated by using a replicated table.
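
As a rough sketch of how to check a query for data movement, the following statement lists the move steps recorded for one request in `sys.dm_pdw_request_steps`. The request ID `'QID####'` is a placeholder, not a real value; it is assumed that you look up the actual ID for your query first, for example in `sys.dm_pdw_exec_requests`.

```sql
-- Sketch only: 'QID####' is a placeholder request ID.
-- Replace it with the request_id of the query you want to inspect.
SELECT step_index,
       operation_type,
       row_count,
       total_elapsed_time
FROM sys.dm_pdw_request_steps
WHERE request_id = 'QID####'
  AND operation_type LIKE '%MoveOperation'  -- for example, BroadcastMoveOperation
ORDER BY step_index;
```

If the result includes a BroadcastMoveOperation step for a small table, that table may be a candidate for replication.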
43
43
44
-
Replicated tables may not yield the best query'performance when:
44
+
Replicated tables may not yield the best query performance when:
45
45
46
-
- The table has frequent insert, update, and delete operations. Thes' data manipulation language (DML) operations require a rebuild of the replicated table. Rebuilding frequently can cause slower performance.
46
+
- The table has frequent insert, update, and delete operations. These data manipulation language (DML) operations require a rebuild of the replicated table. Rebuilding frequently can cause slower performance.
47
47
- The SQL Analytics database is scaled frequently. Scaling a SQL Analytics database changes the number of Compute nodes, which requires rebuilding the replicated table.
48
48
- The table has a large number of columns, but data operations typically access only a small number of columns. In this scenario, instead of replicating the entire table, it might be more effective to distribute the table, and then create an index on the frequently accessed columns. When a query requires data movement, SQL Analytics only moves data for the requested columns.
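
The last alternative above (distribute the table, then index the frequently accessed columns) might look like the following minimal sketch. The table, column, and index names are hypothetical and used only for illustration, and round-robin distribution is an assumption; a hash distribution could be used instead.

```sql
-- Hypothetical wide table: all names here are illustrative assumptions.
CREATE TABLE [dbo].[DimWideExample]
(
    [ExampleKey]      int          NOT NULL,
    [FrequentColumnA] int          NOT NULL,
    [FrequentColumnB] nvarchar(50) NULL
    -- many other, rarely accessed columns would follow here
)
WITH
(
    DISTRIBUTION = ROUND_ROBIN,
    HEAP
);

-- Index only the columns that queries typically access.
CREATE INDEX [IX_DimWideExample_Frequent]
ON [dbo].[DimWideExample] ([FrequentColumnA], [FrequentColumnB]);
```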
49
49
@@ -65,31 +65,31 @@ WHERE EnglishDescription LIKE '%frame%comfortable%'
65
65
66
66
```
67
67
68
-
# Convert existing round-robin tables to replicated tables
69
-
you already have round-robin tables, we recommend converting them to replicated tables if they meet the criteria outlined in this article. Replicated tables improve performance over round-robin tables because they eliminate the need for data movement.A round-robin table always requires data movement for joins. '
70
-
71
-
his example uses [CTAS](/sql/t-sql/statements/create-ta'le-as-select-azure-sql-data-warehouse) to change the DimSalesTerritory table to a replicated table. This example works regardless of whether DimSalesTerritory is hash-distributed or round-robin.
ENAME OBJECT [dbo].[DimSalesTerritory] to [DimSalesTerritory_old];
85
-
ENAME OBJECT [dbo'.[DimSalesTerrit'ry_REPLICATE] TO [DimSalesTerritory];
86
-
87
-
ROP TABLE [dbo].[DimSalesTerritory_old];
88
-
``
68
+
## Convert existing round-robin tables to replicated tables
69
+
If you already have round-robin tables, we recommend converting them to replicated tables if they meet the criteria outlined in this article. Replicated tables improve performance over round-robin tables because they eliminate the need for data movement. A round-robin table always requires data movement for joins.
70
+
71
+
This example uses [CTAS](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse) to change the DimSalesTerritory table to a replicated table. This example works regardless of whether DimSalesTerritory is hash-distributed or round-robin.
RENAME OBJECT [dbo].[DimSalesTerritory] to [DimSalesTerritory_old];
85
+
RENAME OBJECT [dbo].[DimSalesTerritory_REPLICATE] TO [DimSalesTerritory];
86
+
87
+
DROP TABLE [dbo].[DimSalesTerritory_old];
88
+
```
89
89
90
-
## Query performance example for round-robin versus replicated
90
+
### Query performance example for round-robin versus replicated
91
91
92
-
replicated table does not require any data movement for joins because the entire table is already present on each Compute node. If the dimension tables are round-robin distributed, a join copies the dimension table in full to each Compute node. To move the data, the query plan contains an operation called BroadcastMoveOperation. This type of data movement operation slows query performance and is eliminated by using replicated tables. To view query plan steps, use the [sys.dm_pdw_request_steps](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-request-steps-transact-sql) system catalog view.
92
+
A replicated table does not require any data movement for joins because the entire table is already present on each Compute node. If the dimension tables are round-robin distributed, a join copies the dimension table in full to each Compute node. To move the data, the query plan contains an operation called BroadcastMoveOperation. This type of data movement operation slows query performance and is eliminated by using replicated tables. To view query plan steps, use the [sys.dm_pdw_request_steps](/sql/relational-databases/system-dynamic-management-views/sys-dm-pdw-request-steps-transact-sql) dynamic management view.
93
93
94
94
For example, in the following query against the AdventureWorks schema, the `FactInternetSales` table is hash-distributed. The `DimDate` and `DimSalesTerritory` tables are smaller dimension tables. This query returns the total sales in North America for fiscal year 2004:
95
95
@@ -132,7 +132,7 @@ Standard indexing practices apply to replicated tables. SQL Analytics rebuilds e
132
132
### Batch data loads
133
133
When loading data into replicated tables, try to minimize rebuilds by batching loads together. Perform all the batched loads before running select statements.
134
134
135
-
or example, this load pattern loads data from four sources and invokes four rebuilds. ''
135
+
For example, this load pattern loads data from four sources and invokes four rebuilds.
136
136
137
137
Load from source 1.
138
138
- Select statement triggers rebuild 1.
@@ -181,7 +181,3 @@ To create a replicated table, use one of these statements:
181
181
- [CREATE TABLE AS SELECT (SQL Analytics)](/sql/t-sql/statements/create-table-as-select-azure-sql-data-warehouse)
182
182
183
183
For an overview of distributed tables, see [distributed tables](sql-data-warehouse-tables-distribute.md).