articles/purview/concept-best-practices-scanning.md
7 additions & 7 deletions
@@ -1,5 +1,5 @@
 ---
-title: Best practices for scanning of data sources in Purview
+title: Best practices for scanning of data sources in Azure Purview
 description: This article provides best practices for registering and scanning various data sources in Azure Purview.
 author: athenads
 ms.author: athenadsouza
@@ -12,7 +12,7 @@ ms.custom: ignite-fall-2021
 
 # Azure Purview scanning best practices
 
-Azure Purview supports automated scanning of on-prem, multi-cloud, and SaaS data sources. Running a "scan" invokes the process to ingest metadata from the registered data sources. The metadata curated at the end of scan and curation process includes technical metadata like data asset names (table names/ file names), file size, columns, data lineage and so on. For structured data sources (for example Relational Database Management System) the schema details are also captured. The curation process applies automated classification labels on the schema attributes based on the scan rule set configured, and sensitivity labels if your Purview account is connected to a Microsoft 365 Security & Compliance Center.
+Azure Purview supports automated scanning of on-prem, multi-cloud, and SaaS data sources. Running a "scan" invokes the process to ingest metadata from the registered data sources. The metadata curated at the end of scan and curation process includes technical metadata like data asset names (table names/ file names), file size, columns, data lineage and so on. For structured data sources (for example Relational Database Management System) the schema details are also captured. The curation process applies automated classification labels on the schema attributes based on the scan rule set configured, and sensitivity labels if your Azure Purview account is connected to a Microsoft 365 Security & Compliance Center.
 
 ## Why do you need best practices to manage data sources?
 
@@ -25,7 +25,7 @@ The design considerations and recommendations have been organized based on the k
 
 - The hierarchy aligning with the organization’s strategy (geographical, business function, source of data, etc.) defining the data sources to be registered and scanned needs to be created using Collections.
 
-- By design, you cannot register data sources multiple times in the same Purview account. This architecture helps to avoid the risk of assigning different access control to the same data source.
+- By design, you cannot register data sources multiple times in the same Azure Purview account. This architecture helps to avoid the risk of assigning different access control to the same data source.
 
 ### Design recommendations
 
@@ -80,10 +80,10 @@ To avoid unexpected cost and rework, it is recommended to plan and follow the be
 > This feature has cost considerations, refer to the [pricing page](https://azure.microsoft.com/pricing/details/azure-purview/) for details.
 
 3. **Set up a scan** for the registered data source(s)
-- **Scan name**: By default, Purview uses a naming convention **SCAN-[A-Z][a-z][a-z]** which is not helpful when trying to identify a scan that you have run. As a best practice, use a meaningful naming convention. An instance could be naming the scan as _environment-source-frequency-time_, for example DEVODS-Daily-0200, which would represent a daily scan at 0200 hrs.
+- **Scan name**: By default, Azure Purview uses a naming convention **SCAN-[A-Z][a-z][a-z]** which is not helpful when trying to identify a scan that you have run. As a best practice, use a meaningful naming convention. An instance could be naming the scan as _environment-source-frequency-time_, for example DEVODS-Daily-0200, which would represent a daily scan at 0200 hrs.
 
 - **Authentication**: Azure Purview offers various authentication methods for scanning the data sources, depending on the type of source (Azure cloud or on-prem or third-party sources). It is recommended to follow the least privilege principle for authentication method following below order of preference:
-- Purview MSI - Managed Identity (for example, for Azure Data Lake Gen2 sources)
+- Azure Purview MSI - Managed Identity (for example, for Azure Data Lake Gen2 sources)
 - User-assigned Managed Identity
 - Service Principal
 - SQL Authentication (for example, for on-prem or Azure SQL sources)
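A minimal sketch of the two recommendations in this hunk: building a scan name in the _environment-source-frequency-time_ shape and picking an authentication method by the stated order of preference. The function names and the method labels are hypothetical and are not Azure Purview API values. Only the four methods visible in this hunk are listed, and the full list in the article may continue past the hunk. The name builder follows the article's own example, which writes environment and source together (DEVODS).

```python
from typing import Iterable

# Preference order as listed in the hunk above. These strings are hypothetical
# labels chosen for illustration, not identifiers from any Azure Purview API.
AUTH_PREFERENCE = (
    "purview_msi",
    "user_assigned_managed_identity",
    "service_principal",
    "sql_authentication",
)


def scan_name(environment: str, source: str, frequency: str, time_hhmm: str) -> str:
    """Build a scan name shaped like the article's example, DEVODS-Daily-0200."""
    if len(time_hhmm) != 4 or not time_hhmm.isdigit():
        raise ValueError("time_hhmm must be four digits, for example '0200'")
    # The article's example fuses environment and source (DEV + ODS).
    return f"{environment.upper()}{source.upper()}-{frequency.capitalize()}-{time_hhmm}"


def pick_auth(supported: Iterable[str]) -> str:
    """Return the most preferred method among those the data source supports."""
    available = set(supported)
    for method in AUTH_PREFERENCE:
        if method in available:
            return method
    raise ValueError("none of the listed authentication methods is supported")


print(scan_name("dev", "ods", "daily", "0200"))                 # DEVODS-Daily-0200
print(pick_auth(["sql_authentication", "service_principal"]))   # service_principal
```

Keeping the preference order in one tuple means a change to the recommended order only touches a single place.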
@@ -146,9 +146,9 @@ To avoid unexpected cost and rework, it is recommended to plan and follow the be
 
 ### Points to note
 
-- If a field / column, table, or a file is removed from the source system after the scan was executed, it will only be reflected (removed) in Purview after the next scheduled full / incremental scan.
+- If a field / column, table, or a file is removed from the source system after the scan was executed, it will only be reflected (removed) in Azure Purview after the next scheduled full / incremental scan.
 - An asset can be deleted from Azure Purview catalog using the **delete** icon under the name of the asset (this will not remove the object in the source). However, if you run full scan on the same source, it would get reingested in the catalog. If you have scheduled a weekly / monthly scan instead (incremental) the deleted asset will not be picked unless the object is modified at source (for example, a column is added / removed from the table).
-- To understand the behavior of subsequent scans after *manually* editing a data asset or an underlying schema through Purview Studio, refer to [Catalog asset details](./catalog-asset-details.md#scans-on-edited-assets).
+- To understand the behavior of subsequent scans after *manually* editing a data asset or an underlying schema through Azure Purview Studio, refer to [Catalog asset details](./catalog-asset-details.md#scans-on-edited-assets).
 - For more details refer the tutorial on [how to view, edit, and delete assets](./catalog-asset-details.md)
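The scan behavior described under "Points to note" can be pictured with a toy model. This is only a sketch of the stated rules, not of how Azure Purview is implemented: the catalog is modeled as a set of asset names, the source as a dictionary with a hypothetical "modified since the last scan" flag, and both function names are made up for illustration.

```python
from typing import Dict, Set


def full_scan(source: Dict[str, bool]) -> Set[str]:
    """Toy full scan: the catalog mirrors every object currently in the source."""
    return set(source)


def incremental_scan(catalog: Set[str], source: Dict[str, bool]) -> Set[str]:
    """Toy incremental scan: drop assets that no longer exist at the source and
    ingest only objects flagged as modified since the previous scan."""
    still_present = {name for name in catalog if name in source}
    modified = {name for name, changed in source.items() if changed}
    return still_present | modified


# source maps object name -> modified since the last scan (hypothetical flag)
source = {"orders": False, "customers": False}
catalog = {"orders", "customers"}

catalog.discard("customers")  # asset deleted from the catalog only, not the source

print(sorted(incremental_scan(catalog, source)))  # ['orders']
print(sorted(full_scan(source)))                  # ['customers', 'orders']

source["customers"] = True  # for example, a column was added at the source
print(sorted(incremental_scan(catalog, source)))  # ['customers', 'orders']

del source["orders"]  # object removed from the source system
print(sorted(incremental_scan(catalog, source)))  # ['customers']
```

In this model a manually deleted asset stays out of the catalog under incremental scans until the object changes at the source, while a full scan brings it back.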