articles/purview/concept-best-practices-network.md
22 additions & 20 deletions
@@ -57,35 +57,37 @@ Here are some best practices:
57
57
58
58
:::image type="content" source="media/concept-best-practices/network-azure-runtime.png" alt-text="Screenshot that shows the connection flow between Microsoft Purview, the Azure runtime, and data sources." lightbox="media/concept-best-practices/network-azure-runtime.png":::
59
59
60
-
1. A manual or automatic scan is initiated from the Microsoft Purview data map through the Azure integration runtime.
60
+
1. A manual or automatic scan is initiated from the Microsoft Purview Data Map through the Azure integration runtime.
61
61
62
62
2. The Azure integration runtime connects to the data source to extract metadata.
63
63
64
64
3. Metadata is queued in Microsoft Purview managed storage and stored in Azure Blob Storage.
65
65
66
-
4. Metadata is sent to the Microsoft Purview data map.
66
+
4. Metadata is sent to the Microsoft Purview Data Map.
67
67
68
-
- Scanning on-premises and VM-based data sources always requires using a self-hosted integration runtime. The Azure integration runtime is not supported for these data sources. The following steps show the communication flow at a high level when you're using a self-hosted integration runtime to scan a data source:
68
+
- Scanning on-premises and VM-based data sources always requires using a self-hosted integration runtime. The Azure integration runtime isn't supported for these data sources. The following steps show the communication flow at a high level when you're using a self-hosted integration runtime to scan a data source. The first diagram shows a scenario where resources are within Azure or on a VM in Azure. The second diagram shows a scenario with on-premises resources. The steps between the two are the same from Microsoft Purview's perspective:
69
69
70
70
:::image type="content" source="media/concept-best-practices/network-self-hosted-runtime.png" alt-text="Screenshot that shows the connection flow between Microsoft Purview, a self-hosted runtime, and data sources." lightbox="media/concept-best-practices/network-self-hosted-runtime.png":::
71
71
72
+
:::image type="content" source="media/concept-best-practices/security-self-hosted-runtime-on-premises.png" alt-text="Screenshot that shows the connection flow between Microsoft Purview, an on-premises self-hosted runtime, and data sources in an on-premises network." lightbox="media/concept-best-practices/security-self-hosted-runtime-on-premises.png":::
73
+
72
74
1. A manual or automatic scan is triggered. Microsoft Purview connects to Azure Key Vault to retrieve the credential to access a data source.
73
75
74
-
2. The scan is initiated from the Microsoft Purview data map through a self-hosted integration runtime.
76
+
2. The scan is initiated from the Microsoft Purview Data Map through a self-hosted integration runtime.
75
77
76
-
3. The self-hosted integration runtime service from the VM connects to the data source to extract metadata.
78
+
3. The self-hosted integration runtime service from the VM or on-premises machine connects to the data source to extract metadata.
77
79
78
-
4. Metadata is processed in VM memory for the self-hosted integration runtime. Metadata is queued in Microsoft Purview managed storage and then stored in Azure Blob Storage.
80
+
4. Metadata is processed in the machine's memory for the self-hosted integration runtime. Metadata is queued in Microsoft Purview managed storage and then stored in Azure Blob Storage. Actual data never leaves the boundary of your network.
79
81
80
-
5. Metadata is sent to the Microsoft Purview data map.
82
+
5. Metadata is sent to the Microsoft Purview Data Map.
81
83
82
84
### Authentication options
83
85
84
86
When you're scanning a data source in Microsoft Purview, you need to provide a credential. Microsoft Purview can then read the metadata of the assets by using the Azure integration runtime in the destination data source. When you're using a public network, authentication options and requirements vary based on the following factors:
85
87
86
88
- **Data source type**. For example, if the data source is Azure SQL Database, you need to use SQL authentication with db_datareader access to each database. This can be a user-managed identity or a Microsoft Purview managed identity. Or it can be a service principal in Azure Active Directory added to SQL Database as db_datareader.
87
89
88
-
If the data source is Azure Blob Storage, you can use a Microsoft Purview managed identity or a service principal in Azure Active Directory added as a Blob Storage Data Reader role on the Azure storage account. Or simply use the storage account's key.
90
+
If the data source is Azure Blob Storage, you can use a Microsoft Purview managed identity, or a service principal in Azure Active Directory that is assigned the Storage Blob Data Reader role on the Azure storage account. Or use the storage account's key.
89
91
90
92
- **Authentication type**. We recommend that you use a Microsoft Purview managed identity to scan Azure data sources when possible, to reduce administrative overhead. For any other authentication types, you need to [set up credentials for source authentication inside Microsoft Purview](manage-credentials.md):
91
93
@@ -95,7 +97,7 @@ When you're scanning a data source in Microsoft Purview, you need to provide a c
95
97
96
98
- **Runtime type that's used in the scan**. Currently, you can't use a Microsoft Purview managed identity with a self-hosted integration runtime.
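
The factors above can be sketched as a small lookup. This is only an illustration of the rules stated in this section, not a Microsoft Purview API: the dictionary keys, option labels, and the function name `candidate_auth_options` are hypothetical, and the sketch encodes nothing beyond what the text says (for example, it makes no claim about which other identities work with a self-hosted integration runtime).

```python
from typing import Dict, List

# Options as described in the text above. All keys and labels are
# hypothetical names chosen for this sketch, not Purview identifiers.
AUTH_OPTIONS: Dict[str, List[str]] = {
    "azure_sql_database": [
        "sql_authentication",
        "user_managed_identity",
        "purview_managed_identity",
        "service_principal",
    ],
    "azure_blob_storage": [
        "purview_managed_identity",
        "service_principal",
        "account_key",
    ],
}


def candidate_auth_options(source_type: str, runtime: str) -> List[str]:
    """Return the authentication options to consider for a scan."""
    if runtime not in ("azure", "self_hosted"):
        raise ValueError(f"unknown runtime: {runtime}")
    options = list(AUTH_OPTIONS.get(source_type, []))
    if runtime == "self_hosted":
        # Per the text: a Microsoft Purview managed identity currently
        # can't be used with a self-hosted integration runtime.
        options = [o for o in options if o != "purview_managed_identity"]
    return options


print(candidate_auth_options("azure_blob_storage", "self_hosted"))
# ['service_principal', 'account_key']
```

Any option other than the Microsoft Purview managed identity still requires a credential to be set up inside Microsoft Purview, as noted under **Authentication type**.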
97
99
98
-
### Additional considerations
100
+
### Other considerations
99
101
100
102
- If you choose to scan data sources using public endpoints, your self-hosted integration runtime VMs must have outbound access to data sources and Azure endpoints.
101
103
@@ -105,7 +107,7 @@ When you're scanning a data source in Microsoft Purview, you need to provide a c
105
107
106
108
## Option 2: Use private endpoints
107
109
108
-
Similar to other PaaS solutions, Microsoft Purview does not support deploying directly into a virtual network. So you can't use certain networking features with the offering's resources, such as network security groups, route tables, or other network-dependent appliances such as Azure Firewall. Instead, you can use private endpoints that can be enabled on your virtual network. You can then disable public internet access to securely connect to Microsoft Purview.
110
+
Similar to other PaaS solutions, Microsoft Purview doesn't support deploying directly into a virtual network. So you can't use certain networking features with the offering's resources, such as network security groups, route tables, or other network-dependent appliances such as Azure Firewall. Instead, you can use private endpoints that can be enabled on your virtual network. You can then disable public internet access to securely connect to Microsoft Purview.
109
111
110
112
You must use private endpoints for your Microsoft Purview account if you have any of the following requirements:
111
113
@@ -149,7 +151,7 @@ You must use private endpoints for your Microsoft Purview account if you have an
149
151
150
152
### Current limitations
151
153
152
-
- Scanning multiple Azure sources by using the entire subscription or resource group through ingestion private endpoints and a self-hosted integration runtime is not supported when you're using private endpoints for ingestion. Instead, you can register and scan data sources individually.
154
+
- Scanning multiple Azure sources by using the entire subscription or resource group through a self-hosted integration runtime isn't supported when you're using private endpoints for ingestion. Instead, you can register and scan data sources individually.
153
155
154
156
- For limitations related to Microsoft Purview private endpoints, see [Known limitations](catalog-private-link-troubleshoot.md#known-limitations).
155
157
@@ -184,7 +186,7 @@ The self-hosted integration runtime VMs can be deployed inside the same Azure vi
184
186
185
187
:::image type="content" source="media/concept-best-practices/network-pe-multi-vnet.png" alt-text="Screenshot that shows Microsoft Purview with private endpoints in a scenario of multiple virtual networks." lightbox="media/concept-best-practices/network-pe-multi-vnet.png":::
186
188
187
-
You can optionally deploy an additional self-hosted integration runtime in the spoke virtual networks.
189
+
You can optionally deploy another self-hosted integration runtime in the spoke virtual networks.
188
190
189
191
#### Multiple regions, multiple virtual networks
190
192
@@ -198,15 +200,15 @@ For performance and cost optimization, we highly recommended deploying one or mo
198
200
199
201
#### Name resolution for multiple Microsoft Purview accounts
200
202
201
-
It is recommended to follow these recommendations, if your organization needs to deploy and maintain multiple Microsoft Purview accounts using private endpoints:
203
+
If your organization needs to deploy and maintain multiple Microsoft Purview accounts using private endpoints, follow these recommendations:
202
204
203
205
1. Deploy at least one _account_ private endpoint for each Microsoft Purview account.
204
206
2. Deploy at least one set of _ingestion_ private endpoints for each Microsoft Purview account.
205
207
3. Deploy one _portal_ private endpoint for one of the Microsoft Purview accounts in your Azure environments. Create one DNS A record for the _portal_ private endpoint to resolve `web.purview.azure.com`. The _portal_ private endpoint can be used by all Microsoft Purview accounts in the same Azure virtual network or in virtual networks connected through VNet peering.
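
One way to sanity-check the DNS A record from step 3 is to confirm that `web.purview.azure.com` resolves to a private address from a machine inside the virtual network. The following is a minimal sketch under that assumption; the function name `resolves_to_private_ip` and the injectable `resolver` parameter are hypothetical, and `10.0.0.5` is a made-up address used only for the demonstration.

```python
import ipaddress
import socket
from typing import Callable


def resolves_to_private_ip(
    hostname: str,
    resolver: Callable[[str], str] = socket.gethostbyname,
) -> bool:
    """Return True if hostname resolves to a private IP address.

    The default resolver performs a real DNS lookup and raises OSError
    (socket.gaierror) when the name can't be resolved.
    """
    address = ipaddress.ip_address(resolver(hostname))
    return address.is_private


# Demonstration with stub resolvers, so no network access is needed.
print(resolves_to_private_ip("web.purview.azure.com", resolver=lambda _h: "10.0.0.5"))  # True
print(resolves_to_private_ip("web.purview.azure.com", resolver=lambda _h: "8.8.8.8"))   # False
```

Calling `resolves_to_private_ip("web.purview.azure.com")` without a resolver uses the machine's own DNS configuration, so the result depends on where the check runs.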
206
208
207
209
:::image type="content" source="media/concept-best-practices/network-pe-dns.png" alt-text="Screenshot that shows how to handle private endpoints and DNS records for multiple Microsoft Purview accounts." lightbox="media/concept-best-practices/network-pe-dns.png":::
208
210
209
-
This scenario also applies if multiple Microsoft Purview accounts are deployed across multiple subscriptions and multiple VNets that are connected through VNet peering. _Portal_ private endpoint mainly renders static assets related to the Microsoft Purview governance portal, thus, it is independent of Microsoft Purview account, therefore, only one _portal_ private endpoint is needed to visit all Microsoft Purview accounts in the Azure environment if VNets are connected.
211
+
This scenario also applies if multiple Microsoft Purview accounts are deployed across multiple subscriptions and multiple VNets that are connected through VNet peering. The _portal_ private endpoint mainly renders static assets related to the Microsoft Purview governance portal, so it's independent of any particular Microsoft Purview account. Therefore, only one _portal_ private endpoint is needed to visit all Microsoft Purview accounts in the Azure environment if the VNets are connected.
210
212
211
213
:::image type="content" source="media/concept-best-practices/network-pe-dns-multi-vnet.png" alt-text="Screenshot that shows how to handle private endpoints and DNS records for multiple Microsoft Purview accounts in multiple VNets." lightbox="media/concept-best-practices/network-pe-dns-multi-vnet.png":::
212
214
@@ -256,19 +258,19 @@ For scanning data sources across your on-premises and Azure networks, you may ne
256
258
257
259
- To simplify management, when possible, use Azure runtime and [Microsoft Purview Managed runtime](catalog-managed-vnet.md) to scan Azure data sources.
258
260
259
-
- The Self-hosted integration runtime service can communicate with Microsoft Purview through public or private network over port 443. For more information see, [self-hosted integration runtime networking requirements](manage-integration-runtimes.md#networking-requirements).
261
+
- The self-hosted integration runtime service can communicate with Microsoft Purview through a public or private network over port 443. For more information, see [self-hosted integration runtime networking requirements](manage-integration-runtimes.md#networking-requirements).
260
262
261
-
- One self-hosted integration runtime VM can be used to scan one or multiple data sources in Microsoft Purview, however, self-hosted integration runtime must be only registered for Microsoft Purview and cannot be used for Azure Data Factory or Azure Synapse at the same time.
263
+
- One self-hosted integration runtime VM can be used to scan one or multiple data sources in Microsoft Purview. However, a self-hosted integration runtime must be registered only for Microsoft Purview and can't be used for Azure Data Factory or Azure Synapse at the same time.
262
264
263
-
- You can register and use one or multiple self-hosted integration runtime in one Microsoft Purview account. It is recommended to place at least one self-hosted integration runtime VM in each region or on-premises network where your data sources reside.
265
+
- You can register and use one or multiple self-hosted integration runtimes in one Microsoft Purview account. It's recommended to place at least one self-hosted integration runtime VM in each region or on-premises network where your data sources reside.
264
266
265
-
- It is recommended to define a baseline for required capacity for each self-hosted integration runtime VM and scale the VM capacity based on demand.
267
+
- It's recommended to define a baseline for required capacity for each self-hosted integration runtime VM and scale the VM capacity based on demand.
266
268
267
-
- It is recommended to setup network connection between self-hosted integration runtime VMs and Microsoft Purview and its managed resources through private network, when possible.
269
+
- It's recommended to set up the network connection between self-hosted integration runtime VMs and Microsoft Purview and its managed resources through a private network, when possible.
268
270
269
271
- Allow outbound connectivity to download.microsoft.com, if auto-update is enabled.
270
272
271
-
- The self-hosted integration runtime service does not require outbound internet connectivity, if self-hosted integration runtime VMs are deployed in an Azure VNet or in the on-premises network that is connected to Azure through an ExpressRoute or Site to Site VPN connection. In this case, the scan and metadata ingestion process can be done through private network.
273
+
- The self-hosted integration runtime service doesn't require outbound internet connectivity if self-hosted integration runtime VMs are deployed in an Azure VNet or in an on-premises network that is connected to Azure through an ExpressRoute or site-to-site VPN connection. In this case, the scan and metadata ingestion process can be done through a private network.
272
274
273
275
- A self-hosted integration runtime can communicate with Microsoft Purview and its managed resources directly or through [a proxy server](manage-integration-runtimes.md#proxy-server-considerations). Avoid using proxy settings if the self-hosted integration runtime VM is inside an Azure VNet or connected through an ExpressRoute or site-to-site VPN connection.