Commit 6780437

Merge pull request #239418 from whhender/patch-110
Asset normalization limitations
2 parents 587e742 + 9531b19 commit 6780437

3 files changed

+65
-54
lines changed

articles/purview/catalog-asset-details.md

Lines changed: 48 additions & 40 deletions
@@ -1,31 +1,63 @@
11
---
2-
title: Asset details page in the Microsoft Purview Data Catalog
3-
description: View relevant information and take action on assets in the data catalog
2+
title: Asset management in the Microsoft Purview Data Catalog
3+
description: View relevant information and take action on assets in the Microsoft Purview Data Catalog.
44
author: nayenama
55
ms.author: nayenama
66
ms.service: purview
77
ms.subservice: purview-data-catalog
88
ms.topic: how-to
9-
ms.date: 07/25/2022
9+
ms.date: 05/26/2023
1010
---
11-
# Asset details page in the Microsoft Purview Data Catalog
11+
# Asset management in the Microsoft Purview Data Catalog
1212

13-
This article discusses how assets are displayed in the Microsoft Purview Data Catalog, and all the features and details available to them. It describes how you can view relevant information or take action on assets in your catalog.
13+
This article discusses how assets are displayed in the Microsoft Purview Data Catalog, and all the features and details available to them. It describes how you can view relevant information or take action on assets in your catalog.
1414

1515
## Prerequisites
1616

1717
- Set up your data sources and scan the assets into your catalog.
1818
- *Or* Use the Microsoft Purview Atlas APIs to ingest assets into the catalog.
1919

20-
## Open an asset details page
20+
## Discover assets
2121

2222
You can discover your assets in the Microsoft Purview Data Catalog by either:
2323
- [Browsing the data catalog](how-to-browse-catalog.md)
2424
- [Searching the data catalog](how-to-search-catalog.md)
2525

26-
Once you find the asset you're looking for, you can view all of the asset information or take action on them as described in following sections.
26+
Once you've discovered an asset, select it to view the asset details page, which contains all the asset details, and more actions you can take.
2727

28-
## Asset details tabs explained
28+
## Editing assets
29+
30+
To edit an asset, other than adding a rating, you will need the [data curator](catalog-permissions.md) role on the collection where the asset is housed.
31+
32+
To edit assets you can either:
33+
34+
- Select multiple assets in the catalog to [bulk edit assets](how-to-bulk-edit-assets.md)
35+
- Select a single asset and select the **Edit** button at the top of the [asset details page](#asset-details-page)
36+
37+
#### Scan behavior after editing assets
38+
39+
Microsoft Purview works to reflect the truth of the source system whenever possible. For example, if you edit a column and it's later deleted from the source table, a scan will remove the column metadata from the asset in Microsoft Purview.
40+
41+
Both column-level and asset-level updates, such as adding a description, glossary term, or classification, don't impact scan updates. Scans will update new columns and classifications regardless of whether these changes are made.
42+
43+
If you update the **name** or **data type** of a column, subsequent scans **won't** update the asset schema. New columns and classifications **won't** be detected.
44+
45+
## Delete asset
46+
47+
If you're a data curator on the collection containing an asset, you can delete an asset by selecting the delete icon under the name of the asset.
48+
49+
> [!IMPORTANT]
50+
> You cannot delete an asset that has child assets.
51+
>
52+
> Currently, Microsoft Purview doesn't support cascaded deletes. For example, if you attempt to delete a storage account asset in your catalog, the containers, folders, and files within it will still exist in the data map, and the storage account asset will still exist in relation to them.
53+
54+
Any asset you delete using the delete button is permanently deleted in Microsoft Purview. However, if you run a **full scan** on the source from which the asset was ingested into the catalog, then the asset is reingested and you can discover it using the Microsoft Purview Data Catalog.
55+
56+
If you have a scheduled scan (weekly or monthly) on the source, the **deleted asset won't get re-ingested** into the catalog unless the asset is modified by an end user since the previous run of the scan. For example, say you manually delete a SQL table from the Microsoft Purview Data Map. Later, a data engineer adds a new column to the source table. When Microsoft Purview scans the database, the table will be reingested into the data map and be discoverable in the data catalog.
57+
58+
## Asset details page
59+
60+
At the top of the asset details page there are several tabs:
2961

3062
:::image type="content" source="media/catalog-asset-details/asset-tabs.png" alt-text="Asset details tabs":::
3163

@@ -99,24 +131,6 @@ Below are a list of actions you can take from an asset details page. Actions ava
99131

100132
:::image type="content" source="media/catalog-asset-details/asset-details-actions.png" alt-text="Screenshot that shows actions available on the asset details page.":::
101133

102-
### Editing assets
103-
104-
If you're a data curator on the collection containing an asset, you can edit an asset by selecting the edit icon on the top-left corner of the asset.
105-
106-
At the asset level you can edit or add a description, classification, or glossary term by staying on the overview tab of the edit screen.
107-
108-
You can navigate to the schema tab on the edit screen to update column name, data type, column level classification, terms, or asset description.
109-
110-
You can navigate to the contact tab of the edit screen to update owners and experts on the asset. You can search by full name, email or alias of the person within your Azure active directory.
111-
112-
#### Scan behavior after editing assets
113-
114-
Microsoft Purview works to reflect the truth of the source system whenever possible. For example, if you edit a column and later it's deleted from the source table. A scan will remove the column metadata from the asset in Microsoft Purview.
115-
116-
Both column-level and asset-level updates such as adding a description, glossary term or classification don't impact scan updates. Scans will update new columns and classifications regardless if these changes are made.
117-
118-
If you update the **name** or **data type** of a column, subsequent scans **won't** update the asset schema. New columns and classifications **won't** be detected.
119-
120134
### Request access to data
121135

122136
If a [self-service data access workflow](how-to-workflow-self-service-data-access-hybrid.md) has been created, you can request access to a desired asset directly from the asset details page! To learn more about Microsoft Purview's data policy applications, see [how to enable data use management](how-to-enable-data-use-management.md).
@@ -138,19 +152,6 @@ Microsoft Purview makes it easy to work with useful data you find the data catal
138152
- SQL Server
139153
- Teradata
140154

141-
### Deleting assets
142-
143-
If you're a data curator on the collection containing an asset, you can delete an asset by selecting the delete icon under the name of the asset.
144-
145-
> [!IMPORTANT]
146-
> You cannot delete an asset that has child assets.
147-
>
148-
> Currently, Microsoft Purview doesn't support cascaded deletes. For example, if you attempt to delete a storage account asset in your catalog the containers, folders and files within them will still exist in the data map and the the storage account asset will still exist in relation to them.
149-
150-
Any asset you delete using the delete button is permanently deleted in Microsoft Purview. However, if you run a **full scan** on the source from which the asset was ingested into the catalog, then the asset is reingested and you can discover it using the Microsoft Purview catalog.
151-
152-
If you have a scheduled scan (weekly or monthly) on the source, the **deleted asset won't get re-ingested** into the catalog unless the asset is modified by an end user since the previous run of the scan. For example, say you manually delete a SQL table from the Microsoft Purview Data Map. Later, a data engineer adds a new column to the source table. When Microsoft Purview scans the database, the table will be reingested into the data map and be discoverable in the data catalog.
153-
154155
## Ratings
155156

156157
Assets can be rated by all users with read access, or better, to that asset in Microsoft Purview.
@@ -176,7 +177,7 @@ These ratings can be seen by all users with read access, and rating can be [adde
176177
1. Choose a star rating, add a comment, and select **Submit**.
177178
:::image type="content" source="media/catalog-asset-details/rate-asset.png" alt-text="Screenshot of a rating, showing five stars selected and a comment about the quality of the data.":::
178179

179-
## Edit or delete your rating
180+
### Edit or delete your rating
180181

181182
1. Select the ratings button in the asset's header.
182183
1. Select the **Open ratings** button.
@@ -215,7 +216,14 @@ If you have [data curator](catalog-permissions.md) permissions Microsoft Purview
215216
:::image type="content" source="media/catalog-asset-details/remove-tag.png" alt-text="Screenshot that shows the remove tag button highlighted next to an existing page.":::
216217
1. Confirm the removal of the tag.
217218

219+
## Duplicate assets
220+
221+
If you notice duplicate assets in your Microsoft Purview Data Catalog, review the [asset normalization](concept-asset-normalization.md) documentation. Microsoft Purview normalizes assets to prevent duplication, but not all possible scenarios are covered.
222+
223+
Compare the fully qualified asset names for duplicate assets, and update ingestion points to resolve capitalization or character differences. Then, [delete the duplicated asset](#delete-asset) in your catalog.
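The comparison described above can be sketched in Python. This is a minimal illustration, not part of Microsoft Purview: the helper name `describe_difference` and its output format are hypothetical, and the qualified names are assumed to have been copied by hand from the two duplicate assets.

```python
import difflib


def describe_difference(name_a: str, name_b: str) -> str:
    """Hypothetical helper: classify how two fully qualified names differ."""
    if name_a == name_b:
        return "identical"

    lower_a, lower_b = name_a.lower(), name_b.lower()
    if lower_a == lower_b:
        return "capitalization only"

    # Collect the spans that differ once capitalization is ignored,
    # for example an extra slash or a trailing character.
    matcher = difflib.SequenceMatcher(None, lower_a, lower_b)
    differing = [
        (tag, lower_a[i1:i2], lower_b[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"
    ]
    return f"character differences: {differing}"


# Example names (assumed): the second has an extra slash before the share.
print(describe_difference(
    "https://myaccount.file.core.windows.net/myshare/my-file.parquet",
    "https://myaccount.file.core.windows.net//myshare/my-file.parquet",
))
```

A result of "capitalization only" or a short list of extra characters points at the ingestion point whose qualified name needs to be updated.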
224+
218225
## Next steps
219226

220227
- [Browse the Microsoft Purview Data Catalog](how-to-browse-catalog.md)
221228
- [Search the Microsoft Purview Data Catalog](how-to-search-catalog.md)
229+
- [Asset normalization](concept-asset-normalization.md)

articles/purview/concept-asset-normalization.md

Lines changed: 6 additions & 3 deletions
@@ -1,21 +1,24 @@
11
---
22
title: Asset normalization
3-
description: Learn how Microsoft Purview prevents duplicate assets in your data map through asset normalization
3+
description: Learn how Microsoft Purview prevents duplicating assets in your data map through asset normalization.
44
author: nayenama
55
ms.author: nayenama
66
ms.service: purview
77
ms.subservice: purview-data-catalog
88
ms.topic: conceptual
9-
ms.date: 02/17/2023
9+
ms.date: 05/26/2023
1010
ms.custom: ignite-fall-2021
1111
---
1212

1313
# Asset normalization
1414

15-
When ingesting assets into the Microsoft Purview data map, different sources updating the same data asset may send similar, but slightly different qualified names. While these qualified names represent the same asset, slight differences such as an extra character or different capitalization may cause these assets on the surface to appear different. To avoid storing duplicate entries and causing confusion when consuming the data catalog, Microsoft Purview applies normalization during ingestion to ensure all fully qualified names of the same entity type are in the same format.
15+
When ingesting assets into the Microsoft Purview data map, different sources updating the same data asset may send similar, but slightly different qualified names. While these qualified names represent the same asset, slight differences such as an extra character may cause these assets on the surface to appear different and cause duplicate entries in Microsoft Purview. To avoid storing duplicate entries and causing confusion when consuming the data catalog, Microsoft Purview applies normalization during ingestion to ensure all fully qualified names of the same entity type are in the same format.
1616

1717
For example, you scan in an Azure Blob with the qualified name `https://myaccount.file.core.windows.net/myshare/folderA/folderB/my-file.parquet`. This blob is also consumed by an Azure Data Factory pipeline that will then add lineage information to the asset. The ADF pipeline may be configured to read the file as `https://myAccount.file.core.windows.net//myshare/folderA/folderB/my-file.parquet`. While the qualified name is different, this ADF pipeline is consuming the same piece of data. Normalization ensures that all the metadata from both Azure Blob Storage and Azure Data Factory is visible on a single asset, `https://myaccount.file.core.windows.net/myshare/folderA/folderB/my-file.parquet`.
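As a rough illustration of the example above, the following Python sketch lowercases the host and collapses repeated slashes in the path so that both qualified names resolve to the same string. The function name `normalize_qualified_name` and the two rules it applies are assumptions chosen to fit this one example; they are not a statement of how Microsoft Purview implements normalization.

```python
import re
from urllib.parse import urlsplit, urlunsplit


def normalize_qualified_name(qualified_name: str) -> str:
    """Hypothetical helper: lowercase the host and collapse repeated slashes in the path."""
    parts = urlsplit(qualified_name)
    host = parts.netloc.lower()
    # Replace any run of two or more slashes in the path with a single slash.
    path = re.sub(r"/{2,}", "/", parts.path)
    return urlunsplit((parts.scheme, host, path, parts.query, parts.fragment))


scanned = "https://myaccount.file.core.windows.net/myshare/folderA/folderB/my-file.parquet"
from_adf = "https://myAccount.file.core.windows.net//myshare/folderA/folderB/my-file.parquet"

# Both names should map to the same normalized form.
print(normalize_qualified_name(scanned) == normalize_qualified_name(from_adf))
```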
1818

19+
>[!IMPORTANT]
20+
>The rules listed below are the only kinds of potential duplication Microsoft Purview currently recognizes. If you are experiencing accidental asset duplication, compare the assets' fully qualified names to check for capitalization differences or additional characters. Update any ingestion points, for example your ADF pipelines, so that the qualified names match.
21+
1922
## Normalization rules
2023

2124
Below are the normalization rules applied by Microsoft Purview.

articles/purview/concept-best-practices-asset-lifecycle.md

Lines changed: 11 additions & 11 deletions
@@ -6,12 +6,12 @@ ms.author: jubairpatel
66
ms.service: purview
77
ms.subservice: purview-data-catalog
88
ms.topic: conceptual
9-
ms.date: 01/06/2022
9+
ms.date: 05/26/2023
1010
---
1111

1212
# Business processes for managing data effectively
1313

14-
As data and content has a lifecycle that requires active management (for example, acquisition - processing - disposal) assets in the Microsoft Purview data catalog need active management in a similar way. "Assets" in the catalog include the technical metadata that describes collection, lineage and scan information. Metadata describing the business structure of data such as glossary, classifications and ownership also needs to be managed.
14+
As data and content has a lifecycle that requires active management (for example, acquisition - processing - disposal) assets in the Microsoft Purview Data Catalog need active management in a similar way. "Assets" in the catalog include the technical metadata that describes collection, lineage and scan information. Metadata describing the business structure of data such as glossary, classifications and ownership also needs to be managed.
1515

1616
To manage data assets, responsible people in the organization must understand how and when to apply data governance processes and manage workflows.
1717

@@ -21,7 +21,7 @@ An organization employing [Microsoft Purview data governance solutions](/purview
2121

2222
### Benefits
2323

24-
- Agreed definition and structure of data is required for the Microsoft Purview data catalog to provide effective data search and protection functionality at scale across organizations' data estates.
24+
- Agreed definition and structure of data is required for the Microsoft Purview Data Catalog to provide effective data search and protection functionality at scale across organizations' data estates.
2525

2626
- Defining and using processes for asset lifecycle management is key to maintaining accurate asset metadata, which will improve usability of the catalog and the ability to protect relevant data.
2727

@@ -54,7 +54,7 @@ The [Data Curator](catalog-permissions.md) role in Microsoft Purview controls re
5454

5555
## 1. Capture and maintain assets
5656

57-
This process describes the high-level steps and suggested roles to capture and maintain assets in the Microsoft Purview data catalog.
57+
This process describes the high-level steps and suggested roles to capture and maintain assets in the Microsoft Purview Data Catalog.
5858

5959
:::image type="content" source="media/concept-best-practices/assets-capturing-asset-metadata.png" alt-text="Business Process 1 - Capturing and Maintaining Assets."lightbox="media/concept-best-practices/assets-capturing-asset-metadata.png" border="true":::
6060

@@ -66,13 +66,13 @@ This process describes the high-level steps and suggested roles to capture and m
6666
| 2 | [How to create and manage collections](how-to-create-and-manage-collections.md)
6767
| 3 & 4 | [Understand Microsoft Purview access and permissions](catalog-permissions.md)
6868
| 5 | [Microsoft Purview supported sources](purview-connector-overview.md) <br> [Microsoft Purview private endpoint networking](catalog-private-link.md) |
69-
| 6 | [How to manage multi-cloud data sources](manage-data-sources.md)
69+
| 6 | [How to manage multicloud data sources](manage-data-sources.md)
7070
| 7 | [Best practices for scanning data sources in Microsoft Purview](concept-best-practices-scanning.md)
7171
| 8, 9 & 10 | [Search the data catalog](how-to-search-catalog.md) <br> [Browse the data catalog](how-to-browse-catalog.md)
7272

7373
## 2. Glossary and classification maintenance
7474

75-
This process describes the high-level steps and roles to manage and define the business glossary and classifications metadata to enrich the Microsoft Purview data catalog.
75+
This process describes the high-level steps and roles to manage and define the business glossary and classifications metadata to enrich the Microsoft Purview Data Catalog.
7676

7777
:::image type="content" source="media/concept-best-practices/assets-maintaining-glossary-and-classifications.png" alt-text="Business Process 2 - Maintaining glossary and classifications"lightbox="media/concept-best-practices/assets-maintaining-glossary-and-classifications.png" border="true":::
7878

@@ -94,7 +94,7 @@ This process describes the high-level steps and roles to manage and define the b
9494
9595
## 3. Moving assets between collections
9696

97-
This process describes the high-level steps and roles to move assets between collections using the Microsoft Purview portal.
97+
This process describes the high-level steps and roles to move assets between collections using the Microsoft Purview compliance portal.
9898

9999
:::image type="content" source="media/concept-best-practices/assets-moving-assets-between-collections.png" alt-text="Business Process 3 - Moving assets between collections"lightbox="media/concept-best-practices/assets-moving-assets-between-collections.png" border="true":::
100100

@@ -110,11 +110,11 @@ This process describes the high-level steps and roles to move assets between col
110110
| 7 | [Browse the Microsoft Purview Catalog](how-to-browse-catalog.md)
111111

112112
> [!Note]
113-
> It is not currently possible to bulk move assets from one collection to another using the Microsoft Purview portal.
113+
> It is not currently possible to bulk move assets from one collection to another using the Microsoft Purview compliance portal.
114114
115115
## 4. Deleting asset metadata
116116

117-
This process describes the high-level steps and roles to delete asset metadata from the data catalog using the Microsoft Purview portal.
117+
This process describes the high-level steps and roles to delete asset metadata from the data catalog using the Microsoft Purview compliance portal.
118118

119119
Asset Metadata may need to be deleted manually for many reasons:
120120

@@ -124,7 +124,7 @@ Asset Metadata may need to be deleted manually for many reasons:
124124

125125

126126
> [!Note]
127-
> Before deleting assets, please refer to the how-to guide to review considerations: [How to delete assets](catalog-asset-details.md#deleting-assets)
127+
> Before deleting assets, please refer to the how-to guide to review considerations: [How to delete assets](catalog-asset-details.md#delete-asset)
128128
129129
:::image type="content" source="media/concept-best-practices/assets-deleting-asset-metadata.png" alt-text="Business Process 4 - Deleting Assets in Microsoft Purview"lightbox="media/concept-best-practices/assets-deleting-asset-metadata.png" border="true":::
130130

@@ -135,7 +135,7 @@ Asset Metadata may need to be deleted manually for many reasons:
135135
| 1 & 2 | Manual steps |
136136
| 3 | [Data catalog lineage user guide](catalog-lineage-user-guide.md)
137137
| 4 | Manual step
138-
| 5 | [How to view, edit and delete assets](catalog-asset-details.md#deleting-assets)
138+
| 5 | [How to view, edit and delete assets](catalog-asset-details.md#delete-asset)
139139
| 6 | [Scanning best practices](concept-best-practices-scanning.md)
140140

141141
> [!Note]
