
Commit 047fe59

update to links
1 parent 4ce7c44 commit 047fe59

File tree

1 file changed (+1, -1 lines)

articles/synapse-analytics/sql-data-warehouse/sql-data-warehouse-best-practices-development.md

Lines changed: 1 addition & 1 deletion
@@ -34,7 +34,7 @@ See also [Manage table statistics](../../sql-data-warehouse/sql-data-warehouse-t
## Hash distribute large tables
By default, tables are Round Robin distributed. This design makes it easy for users to get started creating tables without having to decide how their tables should be distributed. Round Robin tables may perform sufficiently for some workloads, but in most cases selecting a distribution column performs much better. The most common example of a column-distributed table far outperforming a Round Robin table is when two large fact tables are joined. For example, if you have an orders table distributed by order_id and a transactions table also distributed by order_id, joining the orders table to the transactions table on order_id becomes a pass-through query, which means data movement operations are eliminated. Fewer steps mean a faster query, and less data movement also makes for faster queries. This explanation only scratches the surface. When loading a distributed table, be sure that your incoming data is not sorted on the distribution key, as this slows down your loads. See the links below for many more details on how selecting a distribution column can improve performance, as well as how to define a distributed table in the WITH clause of your CREATE TABLE statement.

-See also [Table overview](../../sql-data-warehouse/sql-data-warehouse-tables-overview.md), [Table distribution](sql-data-warehouse-tables-distribute.md), [Selecting table distribution](https://blogs.msdn.microsoft.com/sqlcat/20../../choosing-hash-distributed-table-vs-round-robin-distributed-table-in-azure-sql-dw-service/), [CREATE TABLE](../../sql-data-warehouse/sql-data-warehouse-tables-overview), [CREATE TABLE AS SELECT](sql-data-warehouse-develop-ctas.md)
+See also [Table overview](../../sql-data-warehouse/sql-data-warehouse-tables-overview.md), [Table distribution](sql-data-warehouse-tables-distribute.md), [Selecting table distribution](https://blogs.msdn.microsoft.com/sqlcat/20../../choosing-hash-distributed-table-vs-round-robin-distributed-table-in-azure-sql-dw-service/), [CREATE TABLE](../../sql-data-warehouse/sql-data-warehouse-tables-overview.md), [CREATE TABLE AS SELECT](sql-data-warehouse-develop-ctas.md)
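
As a minimal sketch of the hash-distribution pattern described above (the dbo.FactOrders and dbo.FactTransactions names, columns, and types are hypothetical, not taken from the article), both tables are distributed on the join column order_id so the join needs no data movement:

```sql
-- Hypothetical fact tables, hash-distributed on the join column order_id.
CREATE TABLE dbo.FactOrders
(
    order_id    BIGINT        NOT NULL,
    customer_id INT           NOT NULL,
    order_date  DATE          NOT NULL,
    order_total DECIMAL(18,2)
)
WITH
(
    DISTRIBUTION = HASH(order_id),   -- rows with the same order_id land on the same distribution
    CLUSTERED COLUMNSTORE INDEX
);

CREATE TABLE dbo.FactTransactions
(
    transaction_id BIGINT        NOT NULL,
    order_id       BIGINT        NOT NULL,
    amount         DECIMAL(18,2)
)
WITH
(
    DISTRIBUTION = HASH(order_id),   -- same distribution column as dbo.FactOrders
    CLUSTERED COLUMNSTORE INDEX
);

-- A join on the shared distribution column can run distribution-local,
-- avoiding data movement steps.
SELECT o.order_id, SUM(t.amount) AS total_amount
FROM dbo.FactOrders AS o
JOIN dbo.FactTransactions AS t
    ON o.order_id = t.order_id
GROUP BY o.order_id;
```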

## Do not over-partition
While partitioning data can be effective for maintaining your data through partition switching or for optimizing scans with partition elimination, having too many partitions can slow down your queries. A high-granularity partitioning strategy that works well on SQL Server often does not work well on SQL pool. Having too many partitions can also reduce the effectiveness of clustered columnstore indexes if each partition has fewer than 1 million rows. Keep in mind that behind the scenes, SQL pool partitions your data for you into 60 databases, so if you create a table with 100 partitions, this actually results in 6,000 partitions. Each workload is different, so the best advice is to experiment with partitioning to see what works best for your workload. Consider lower granularity than what may have worked for you in SQL Server; for example, consider using weekly or monthly partitions rather than daily partitions.
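
As a sketch of a lower-granularity (monthly) partition definition (the dbo.FactSales table, its columns, and the boundary dates are hypothetical): four boundary values yield five partitions, or 5 x 60 = 300 physical partitions across the 60 distributions, compared with 6,000 for a 100-partition scheme.

```sql
-- Hypothetical monthly-partitioned fact table; boundary values are illustrative.
-- Four boundaries define five partitions, which becomes 5 x 60 = 300 partitions
-- across the 60 distributions, versus 6,000 for a 100-partition scheme.
CREATE TABLE dbo.FactSales
(
    sale_id   BIGINT        NOT NULL,
    order_id  BIGINT        NOT NULL,
    sale_date DATE          NOT NULL,
    amount    DECIMAL(18,2)
)
WITH
(
    DISTRIBUTION = HASH(order_id),
    CLUSTERED COLUMNSTORE INDEX,
    PARTITION ( sale_date RANGE RIGHT FOR VALUES
        ('2023-01-01', '2023-02-01', '2023-03-01', '2023-04-01') )
);
```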
