Commit 2b44f7f

Merge pull request #101766 from Kat-Campise/mpp_architecture
mpp architecture fix links
2 parents fac0ae0 + 0bb2fa5 commit 2b44f7f

1 file changed: 23 additions & 51 deletions
@@ -1,6 +1,6 @@
 ---
 title: Azure Synapse Analytics (formerly SQL DW) architecture
-description: Learn how Azure Synapse Analytics (formerly SQL DW) combines massively parallel processing (MPP) with Azure storage to achieve high performance and scalability.
+description: Learn how Azure Synapse Analytics (formerly SQL DW) combines massively parallel processing (MPP) with Azure Storage to achieve high performance and scalability.
 services: sql-data-warehouse
 author: mlee3gsd
 manager: craigg
@@ -17,22 +17,22 @@ ms.reviewer: igorstan
 Azure Synapse is a limitless analytics service that brings together enterprise data warehousing and Big Data analytics. It gives you the freedom to query data on your terms, using either serverless on-demand or provisioned resources—at scale. Azure Synapse brings these two worlds together with a unified experience to ingest, prepare, manage, and serve data for immediate BI and machine learning needs.

 Azure Synapse has four components:
-- SQL Analytics : Complete T-SQL based analytics
+- SQL Analytics: Complete T-SQL based analytics
 - SQL pool (pay per DWU provisioned) – Generally Available
 - SQL on-demand (pay per TB processed) – (Preview)
-- Spark : Deeply integrated Apache Spark (Preview)
-- Data Integration : Hybrid data integration (Preview)
-- Studio : unified user experience. (Preview)
+- Spark: Deeply integrated Apache Spark (Preview)
+- Data Integration: Hybrid data integration (Preview)
+- Studio: unified user experience. (Preview)

 > [!VIDEO https://www.youtube.com/embed/PlyQ8yOb8kc]

 ## SQL Analytics MPP architecture components

-[SQL Analytics](sql-data-warehouse-overview-what-is.md#sql-analytics-and-sql-pool-in-azure-synapse) leverages a scale out architecture to distribute computational processing of data across multiple nodes. The unit of scale is an abstraction of compute power that is known as a [data warehouse unit](what-is-a-data-warehouse-unit-dwu-cdwu.md). Compute is separate from storage which enables you to scale compute independently of the data in your system.
+[SQL Analytics](sql-data-warehouse-overview-what-is.md#sql-analytics-and-sql-pool-in-azure-synapse) leverages a scale-out architecture to distribute computational processing of data across multiple nodes. The unit of scale is an abstraction of compute power that is known as a [data warehouse unit](what-is-a-data-warehouse-unit-dwu-cdwu.md). Compute is separate from storage, which enables you to scale compute independently of the data in your system.

 ![SQL Analytics architecture](media/massively-parallel-processing-mpp-architecture/massively-parallel-processing-mpp-architecture.png)

-SQL Analytics uses a node-based architecture. Applications connect and issue T-SQL commands to a Control node, which is the single point of entry for SQL Analytics. The Control node runs the MPP engine which optimizes queries for parallel processing, and then passes operations to Compute nodes to do their work in parallel.
+SQL Analytics uses a node-based architecture. Applications connect and issue T-SQL commands to a Control node, which is the single point of entry for SQL Analytics. The Control node runs the MPP engine, which optimizes queries for parallel processing, and then passes operations to Compute nodes to do their work in parallel.

 The Compute nodes store all user data in Azure Storage and run the parallel queries. The Data Movement Service (DMS) is a system-level internal service that moves data across the nodes as necessary to run queries in parallel and return accurate results.
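
The Control node and Compute node flow described in this hunk can be pictured with a small scatter-gather sketch in Python. This is only an illustration of the general pattern, not how SQL Analytics is implemented: the function names `compute_node_sum` and `control_node_sum`, the thread pool, and the sample data are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def compute_node_sum(distribution_rows):
    """Hypothetical Compute node step: aggregate only the rows it holds."""
    return sum(distribution_rows)


def control_node_sum(distributions, max_workers=4):
    """Hypothetical Control node step: fan the work out, then combine results."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each distribution is processed independently and in parallel.
        partials = list(pool.map(compute_node_sum, distributions))
    # The single entry point merges the partial results into one answer.
    return sum(partials)


# Four made-up distributions standing in for sharded table data.
distributions = [[1, 2, 3], [4, 5], [6], [7, 8, 9, 10]]
total = control_node_sum(distributions)
print(total)
```

The point of the sketch is that each worker only ever touches its own shard, so adding workers (compute) does not require changing where the data lives.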

@@ -43,9 +43,9 @@ With decoupled storage and compute, when using SQL Analytics one can:
 * Pause compute capacity while leaving data intact, so you only pay for storage.
 * Resume compute capacity during operational hours.

-### Azure storage
+### Azure Storage

-SQL Analytics leverages Azure storage to keep your user data safe. Since your data is stored and managed by Azure storage, there is a separate charge for your storage consumption. The data itself is sharded into **distributions** to optimize the performance of the system. You can choose which sharding pattern to use to distribute the data when you define the table. These sharding patterns are supported:
+SQL Analytics leverages Azure Storage to keep your user data safe. Since your data is stored and managed by Azure Storage, there is a separate charge for your storage consumption. The data is sharded into **distributions** to optimize the performance of the system. You can choose which sharding pattern to use to distribute the data when you define the table. These sharding patterns are supported:

 * Hash
 * Round Robin
@@ -88,55 +88,27 @@ There are performance considerations for the selection of a distribution column,
 ## Round-robin distributed tables
 A round-robin table is the simplest table to create and delivers fast performance when used as a staging table for loads.

-A round-robin distributed table distributes data evenly across the table but without any further optimization. A distribution is first chosen at random and then buffers of rows are assigned to distributions sequentially. It is quick to load data into a round-robin table, but query performance can often be better with hash distributed tables. Joins on round-robin tables require reshuffling data and this takes additional time.
+A round-robin distributed table distributes data evenly across the table but without any further optimization. A distribution is first chosen at random and then buffers of rows are assigned to distributions sequentially. It is quick to load data into a round-robin table, but query performance can often be better with hash distributed tables. Joins on round-robin tables require reshuffling data, which takes additional time.


 ## Replicated Tables
 A replicated table provides the fastest query performance for small tables.

-A table that is replicated caches a full copy of the table on each compute node. Consequently, replicating a table removes the need to transfer data among compute nodes before a join or aggregation. Replicated tables are best utilized with small tables. Extra storage is required and there is additional overhead that is incurred when writing data which make large tables impractical.
+A table that is replicated caches a full copy of the table on each compute node. Consequently, replicating a table removes the need to transfer data among compute nodes before a join or aggregation. Replicated tables are best utilized with small tables. Extra storage is required and there is additional overhead that is incurred when writing data, which make large tables impractical.

-The diagram below shows a replicated table which is cached on the first distribution on each compute node.
+The diagram below shows a replicated table that is cached on the first distribution on each compute node.

 ![Replicated table](media/sql-data-warehouse-distributed-data/replicated-table.png "Replicated table")

 ## Next steps
-Now that you know a bit about Azure Synapse, learn how to quickly [create a SQL pool][create a SQL pool] and [load sample data][load sample data]. If you are new to Azure, you may find the [Azure glossary][Azure glossary] helpful as you encounter new terminology. Or look at some of these other Azure Synapse Resources.
-
-* [Customer success stories]
-* [Blogs]
-* [Feature requests]
-* [Videos]
-* [Customer Advisory Team blogs]
-* [Create support ticket]
-* [MSDN forum]
-* [Stack Overflow forum]
-* [Twitter]
-
-<!--Image references-->
-[1]: ./media/sql-data-warehouse-overview-what-is/dwarchitecture.png
-
-<!--Article references-->
-[Create support ticket]: ./sql-data-warehouse-get-started-create-support-ticket.md
-[load sample data]: ./sql-data-warehouse-load-sample-databases.md
-[create a SQL pool]: ./sql-data-warehouse-get-started-provision.md
-[Migration documentation]: ./sql-data-warehouse-overview-migrate.md
-[Azure Synapse solution partners]: ./sql-data-warehouse-partner-business-intelligence.md
-[Integrated tools overview]: ./sql-data-warehouse-overview-integrate.md
-[Backup and restore overview]: ./sql-data-warehouse-restore-database-overview.md
-[Azure glossary]: ../azure-glossary-cloud-terminology.md
-
-<!--MSDN references-->
-
-<!--Other Web references-->
-[Customer success stories]: https://azure.microsoft.com/case-studies/?service=sql-data-warehouse
-[Blogs]: https://azure.microsoft.com/blog/tag/azure-sql-data-warehouse/
-[Customer Advisory Team blogs]: https://blogs.msdn.microsoft.com/sqlcat/tag/sql-dw/
-[Feature requests]: https://feedback.azure.com/forums/307516-sql-data-warehouse
-[MSDN forum]: https://social.msdn.microsoft.com/Forums/azure/home?forum=AzureSQLDataWarehouse
-[Stack Overflow forum]: https://stackoverflow.com/questions/tagged/azure-sqldw
-[Twitter]: https://twitter.com/hashtag/SQLDW
-[Videos]: https://azure.microsoft.com/documentation/videos/index/?services=sql-data-warehouse
-[SLA for Azure Synapse]: https://azure.microsoft.com/support/legal/sla/sql-data-warehouse/v1_0/
-[Volume Licensing]: https://www.microsoftvolumelicensing.com/DocumentSearch.aspx?Mode=3&DocumentTypeId=37
-[Service Level Agreements]: https://azure.microsoft.com/support/legal/sla/
+Now that you know a bit about Azure Synapse, learn how to quickly [create a SQL pool](./sql-data-warehouse-get-started-provision.md) and [load sample data](./sql-data-warehouse-load-sample-databases.md). If you are new to Azure, you may find the [Azure glossary](../azure-glossary-cloud-terminology.md) helpful as you encounter new terminology. Or look at some of these other Azure Synapse Resources.
+
+* [Customer success stories](https://azure.microsoft.com/case-studies/?service=sql-data-warehouse)
+* [Blogs](https://azure.microsoft.com/blog/tag/azure-sql-data-warehouse/)
+* [Feature requests](https://feedback.azure.com/forums/307516-sql-data-warehouse)
+* [Videos](https://azure.microsoft.com/documentation/videos/index/?services=sql-data-warehouse)
+* [Create support ticket](./sql-data-warehouse-get-started-create-support-ticket.md)
+* [MSDN forum](https://social.msdn.microsoft.com/Forums/azure/home?forum=AzureSQLDataWarehouse)
+* [Stack Overflow forum](https://stackoverflow.com/questions/tagged/azure-sqldw)
+* [Twitter](https://twitter.com/hashtag/SQLDW)
+
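
The round-robin paragraph touched by this diff says that a distribution is first chosen at random and that buffers of rows are then assigned to distributions sequentially. A minimal Python sketch of that assignment rule follows. It is an assumption-laden illustration rather than the service's actual loader: `round_robin_assign`, `num_distributions`, and `buffer_size` are hypothetical names, and the small numbers are chosen only to keep the example readable.

```python
import random


def round_robin_assign(rows, num_distributions, buffer_size, rng=None):
    """Place buffers of rows on distributions in sequence from a random start."""
    rng = rng or random.Random()
    # The first distribution is picked at random.
    start = rng.randrange(num_distributions)
    placement = {d: [] for d in range(num_distributions)}
    for offset in range(0, len(rows), buffer_size):
        buffer_index = offset // buffer_size
        # Each following buffer goes to the next distribution, wrapping around.
        target = (start + buffer_index) % num_distributions
        placement[target].extend(rows[offset:offset + buffer_size])
    return placement


rows = list(range(120))
placement = round_robin_assign(rows, num_distributions=6, buffer_size=5,
                               rng=random.Random(0))
sizes = [len(bucket) for bucket in placement.values()]
print(sizes)
```

Because placement ignores row values, rows that share a join key can land on different distributions, which is consistent with the paragraph's remark that joins on round-robin tables require reshuffling data.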
