articles/purview/scan-data-sources.md: 13 additions & 12 deletions
@@ -11,9 +11,9 @@ ms.date: 01/25/2023
 
 # Scan data sources in Microsoft Purview
 
-In Microsoft Purview, after you [register a data source](manage-data-sources.md#register-a-new-source) your data source, you can scan your source to import metadata about the information stored in that source, and apply any classifications to sensitive data.
+In Microsoft Purview, after you [register your data source](manage-data-sources.md#register-a-new-source), you can scan your source to capture technical metadata, extract schema, and apply classifications to your data.
 
-* For more information about scanning in general, see our [scanning concept article](concept-scans-and-ingestion.md)
+* For more information about scanning in general, see our [scanning concept article](concept-scans-and-ingestion.md).
 * For best practices, see our [scanning best practices article.](concept-best-practices-scanning.md)
 
 In this article, you'll learn the basic steps for scanning any data source.
@@ -28,18 +28,18 @@ In this article, you'll learn the basic steps for scanning any data source.
 Before you can scan your data source, you must take these steps:
 
 1. [Register your data source](manage-data-sources.md#register-a-new-source) - This essentially gives Microsoft Purview the address of your data source, and maps it to a [collection](catalog-permissions.md#a-collections-example) in the Microsoft Purview Data Map.
-1. Consider your network - If your source is on an on-premises network, or a virtual private network (VPN), or if your [Microsoft Purview account is using private endpoints](catalog-private-link-end-to-end.md), you'll need a self-hosted integration runtime, which is a tool that will sit on a machine in your private network so your source and Microsoft Purview can connect during the scan. [Here are the instructions to create a self-hosted integration runtime.](manage-integration-runtimes.md)
+1. Consider your network - If your source is in an on-premises network, or a virtual private network (VPN), or if your [Microsoft Purview account is using private endpoints](catalog-private-link-end-to-end.md), you'll need a self-hosted integration runtime, which is a tool that will sit on a machine in your private network so your source and Microsoft Purview can connect during the scan. [Here are the instructions to create a self-hosted integration runtime.](manage-integration-runtimes.md)
 1. Consider what credentials you're going to use to connect to your source. All [source pages](microsoft-purview-connector-overview.md) will have a **Scan** section that will include details about what authentication types are available.
 
-## Creating a scan
+## Create a scan
 
 In the steps below we'll be using [Azure Blob Storage](register-scan-azure-blob-storage-source.md) as an example, and authenticating with the Microsoft Purview Managed Identity.
 
 >[!IMPORTANT]
 > These are the general steps for creating a scan, but you should refer to [the source page](microsoft-purview-connector-overview.md) for source-specific prerequistes and scanning instructions.
 
 
-1. In the [Azure portal](https://portal.azure.com), open your **Microsoft Purview account** and select the **Open Microsoft Purview governance portal**.
+1. In the [Azure portal](https://portal.azure.com), open your **Microsoft Purview account** and select the **Open Microsoft Purview governance portal** button.
 
 :::image type="content" source="./media/scan-data-sources/open-purview-studio.png" alt-text="Screenshot of Microsoft Purview window in Azure portal, with the Microsoft Purview governance portal button highlighted." border="true":::
 
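The "Consider your network" prerequisite in the hunk above reduces to a three-way check: a self-hosted integration runtime is needed if any one of the listed conditions holds. A minimal Python sketch of that decision follows. The names `SourceNetwork` and `needs_self_hosted_runtime` are hypothetical and are not part of any Microsoft Purview SDK or API; this only restates the rule from the article text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceNetwork:
    """Hypothetical description of a source's network setup (not a Purview type)."""
    on_premises: bool = False
    behind_vpn: bool = False
    purview_uses_private_endpoints: bool = False


def needs_self_hosted_runtime(net: SourceNetwork) -> bool:
    """Return True if any condition from the prerequisite step applies."""
    # Assumption: the three conditions are independent, and one is enough.
    return (
        net.on_premises
        or net.behind_vpn
        or net.purview_uses_private_endpoints
    )


if __name__ == "__main__":
    # A source with no special network setup versus one behind a VPN.
    print(needs_self_hosted_runtime(SourceNetwork()))
    print(needs_self_hosted_runtime(SourceNetwork(behind_vpn=True)))
```

The source page for each connector remains the authority on network requirements; this sketch covers only the general rule stated here.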
@@ -50,12 +50,13 @@ In the steps below we'll be using [Azure Blob Storage](register-scan-azure-blob-
 
 1. Provide a **Name** for the scan.
 1. Select your authentication method. Here we chose the Purview MSI (managed identity.)
-1. Choose the current collection, or a sub collection for the scan. The collection you choose will house the metadata discovered during the scan.
-
-1. Select **Test connection**. If it isn't successful, see our [troubleshooting] section. On a successful connection, select **Continue**
 
 :::image type="content" source="media/scan-data-sources/register-blob-managed-identity.png" alt-text="Screenshot that shows the managed identity option to run the scan":::
 
+1. Choose the current collection, or a sub collection for the scan. The collection you choose will house the metadata discovered during the scan.
+
+1. Select **Test connection**. If it isn't successful, see our [troubleshooting] section. On a successful connection, select **Continue**.
+
 1. Depending on the source, you can scope your scan to a specific subset of data. For Azure Blob Storage, we can select folders and subfolders by choosing the appropriate items in the list.
 
 :::image type="content" source="media/scan-data-sources/register-blob-scope-scan.png" alt-text="Scope your scan":::
@@ -72,7 +73,7 @@ In the steps below we'll be using [Azure Blob Storage](register-scan-azure-blob-
 Depending on the amount of data in your data source, a scan can take some time to run, so here's how you can check on progress and see results when the scan is complete.
 
@@ -90,7 +91,7 @@ Depending on the amount of data in your data source, a scan can take some time t
-1. You can _run an incremental scan_ or a _full scan_ again.
+1. You can run a full scan, which will scan all the content in your scope, but some sources also have **incremental scan** available. Incremental scan will scan only those resources that have been updated since the last scan. Check the **supported capabilities** table in your source page to see if incremental scan is available for your source after the first scan.
 
 :::image type="content" source="media/scan-data-sources/register-blob-full-inc-scan.png" alt-text="full or incremental scan":::
 
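The added line above distinguishes a full scan (everything in scope) from an incremental scan (only resources updated since the last scan). A minimal Python sketch of that selection rule follows. It is an illustration only, not how Microsoft Purview implements scanning: the function name `resources_to_scan` and the idea of comparing last-modified timestamps are assumptions made for the example.

```python
from datetime import datetime, timezone
from typing import Dict, List, Optional


def resources_to_scan(
    last_modified: Dict[str, datetime],
    last_scan: Optional[datetime],
    incremental: bool,
) -> List[str]:
    """Pick the resources a scan would visit (hypothetical helper)."""
    # A full scan, or the very first scan, covers everything in scope.
    if not incremental or last_scan is None:
        return sorted(last_modified)
    # An incremental scan covers only resources updated since the last scan.
    return sorted(
        name for name, modified in last_modified.items() if modified > last_scan
    )


if __name__ == "__main__":
    scope = {
        "folder/a.csv": datetime(2023, 1, 1, tzinfo=timezone.utc),
        "folder/b.csv": datetime(2023, 1, 20, tzinfo=timezone.utc),
    }
    previous = datetime(2023, 1, 10, tzinfo=timezone.utc)
    print(resources_to_scan(scope, previous, incremental=True))
    print(resources_to_scan(scope, previous, incremental=False))
```

Falling back to a full scan when there is no previous scan mirrors the article's note that incremental scan is available only after the first scan.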
@@ -112,7 +113,7 @@ Setting up the connection for your scan can complex since it's a custom set up f
 
 If you're unable to connect to your source, follow these steps:
 
-1. Review your [source page](microsoft-purview-connector-overview.md)prerequisites to make sure there's nothing you've missed.
+1. Review your [source page](microsoft-purview-connector-overview.md)prerequisites to make sure there's nothing you've missed.
 1. Review your authentication option in the **Scan** section of your source page to confirm you have set up the authentication method correctly.
 1. [Create a support request](../azure-portal/supportability/how-to-create-azure-support-request.md#go-to-help--support-from-the-global-header), so our support team can help you troubleshoot your specific environment.