articles/synapse-analytics/guidance/security-white-paper-introduction.md
27 additions & 3 deletions
@@ -20,7 +20,7 @@ ms.date: 01/14/2022
 - [Dedicated SQL pool](../sql-data-warehouse/sql-data-warehouse-overview-what-is.md?context=/azure/synapse-analytics/context/context) (formerly SQL DW) for enterprise data warehousing.
 - Deep integration with [Power BI](https://powerbi.microsoft.com/), [Azure Cosmos DB](../../cosmos-db/synapse-link.md?context=/azure/synapse-analytics/context/context), and [Azure Machine Learning](../machine-learning/what-is-machine-learning.md).

-Azure Synapse data security and privacy are non-negotiable. The purpose of this white paper, then, is to provide a comprehensive overview of Azure Synapse security features, which are enterprise-grade and industry-leading. The white paper comprises a series of articles that cover the following five layers of security:
+Azure Synapse data security and privacy are non-negotiable. The purpose of this white paper is to provide a comprehensive overview of Azure Synapse security features, which are enterprise-grade and industry-leading. The white paper comprises a series of articles that cover the following five layers of security:

 - Data protection
 - Access control
@@ -30,9 +30,9 @@ Azure Synapse data security and privacy are non-negotiable. The purpose of this

 This white paper targets all enterprise security stakeholders. They include security administrators, network administrators, Azure administrators, workspace administrators, and database administrators.

-**Writers:** Vengatesh Parasuraman, Fretz Nuson, Ron Dunn, Khendr'a Reid, John Hoang, Nithesh Krishnappa, Mykola Kovalenko, Brad Schacht, Pedro Matinez, Mark Pryce-Maher, and Arshad Ali.
+**Writers:** Vengatesh Parasuraman, Fretz Nuson, Ron Dunn, Khendr'a Reid, John Hoang, Nithesh Krishnappa, Mykola Kovalenko, Brad Schacht, Pedro Martinez, Mark Pryce-Maher, and Arshad Ali.

-**Technical Reviewers:** Nandita Valsan, Rony Thomas, Daniel Crawford, and Tammy Richter Jones.
+**Technical Reviewers:** Nandita Valsan, Rony Thomas, Abhishek Narain, Daniel Crawford, and Tammy Richter Jones.

 **Applies to:** Azure Synapse Analytics, dedicated SQL pool (formerly SQL DW), serverless SQL pool, and Apache Spark pool.

@@ -53,6 +53,30 @@ Some common security questions include:

 The purpose of this white paper is to provide answers to these common security questions, and many others.

+## Component architecture
+
+Azure Synapse is a Platform-as-a-service (PaaS) analytics service that brings together multiple independent components such as dedicated SQL pools, serverless SQL pools, Apache Spark pools, and data integration pipelines. These components are designed to work together to provide a seamless analytical platform experience.
+
+[Dedicated SQL pools](../sql/overview-architecture.md) are provisioned clusters that provide enterprise data warehousing capabilities for SQL workloads. Data is ingested into managed storage powered by Azure Storage, which is also a PaaS service. Compute is isolated from storage, enabling customers to scale compute independently of their data. Dedicated SQL pools also provide the ability to query data files directly over customer-managed Azure Storage accounts by using external tables.
+
+[Serverless SQL pools](../sql/on-demand-workspace-overview.md) are on-demand clusters that provide a SQL interface to query and analyze data directly over customer-managed Azure Storage accounts. Since they're serverless, there's no managed storage, and the compute nodes scale automatically in response to the query workload.
+
+[Apache Spark](../spark/apache-spark-overview.md) in Azure Synapse is one of Microsoft's implementations of open-source Apache Spark in the cloud. Spark instances are provisioned on-demand based on the metadata configurations defined in the Spark pools. Each user gets their own dedicated Spark instance for running their jobs. The data files processed by the Spark instances are managed by the customer in their own Azure Storage accounts.
+
+[Pipelines](../../data-factory/concepts-pipelines-activities.md) are a logical grouping of activities that perform data movement and data transformation at scale. [Data flow](../../data-factory/concepts-data-flow-overview.md) is a transformation activity in a pipeline that's developed by using a low-code user interface. It can execute data transformations at scale. Behind the scenes, data flows use Apache Spark clusters of Azure Synapse to execute automatically generated code. Pipelines and data flows are compute-only services, and they don't have any managed storage associated with them.
+
+Pipelines use the Integration Runtime (IR) as the scalable compute infrastructure for performing data movement and dispatch activities. Data movement activities run on the IR, whereas the dispatch activities run on a variety of other compute engines, including Azure SQL Database, Azure HDInsight, Azure Databricks, Apache Spark clusters of Azure Synapse, and others. Azure Synapse supports two types of IR: Azure Integration Runtime and Self-hosted Integration Runtime. The [Azure IR](/azure/data-factory/concepts-integration-runtime.md#azure-integration-runtime) provides a fully managed, scalable, and on-demand compute infrastructure. The [Self-hosted IR](/azure/data-factory/concepts-integration-runtime.md#self-hosted-integration-runtime) is installed and configured by the customer in their own network, either in on-premises machines or in Azure cloud virtual machines.
+
+Customers can choose to associate their Synapse workspace with a [managed workspace virtual network](../security/synapse-workspace-managed-vnet.md). When associated with a managed workspace virtual network, Azure IRs and Apache Spark clusters that are used by pipelines, data flows, and the Apache Spark pools are deployed inside the managed workspace virtual network. This setup ensures network isolation between the workspaces for pipelines and Apache Spark workloads.
+
+The following diagram depicts the various components of Azure Synapse.
+
+Each component of Azure Synapse depicted in the diagram provides its own security features. These features provide data protection, access control, authentication, network security, and threat protection for securing the compute and the associated data that's processed. Additionally, Azure Storage, being a PaaS service, provides security of its own, which is set up and managed by the customer in their own storage accounts. This level of component isolation limits exposure if a security vulnerability exists in any one component.
+
 ## Security layers

 Azure Synapse implements a multi-layered security architecture for end-to-end protection of your data. There are five layers: