articles/purview/supported-classifications.md
22 additions & 6 deletions
@@ -6,7 +6,7 @@ ms.author: ankitgup
 ms.service: purview
 ms.subservice: purview-data-map
 ms.topic: reference
-ms.date: 03/07/2023
+ms.date: 04/10/2023
 #Customer intent: As a data steward or catalog administrator, I need to understand what's supported under classifications.
 ---
 
@@ -24,9 +24,21 @@ Microsoft Purview classifies data by using [RegEx](https://wikipedia.org/wiki/Re
 
 ## Bloom Filter based classifications
 
-### City, Country, and Place
+### World Cities, Country
 
-The City, Country, and Place filters have been prepared using best datasets available for preparing the data.
+The City and Country classifier identifies the data based on their full names as well as short codes.
+
+#### Keywords
+- burg
+- city
+- cities
+- city names
+- cosmopolis
+- metropolis
+- municipality
+- place
+- town
+-------------------------------------
 
 ## Machine Learning based classifications
 
@@ -35,10 +47,9 @@ The City, Country, and Place filters have been prepared using best datasets avai
 
 ### Person's Name
 
-Person Name machine learning model has been trained using global datasets of names in English language.
+Person Name machine learning model has been trained using global datasets of names in English language. Microsoft Purview classifies full names stored in the same column as well as first and last names in separate columns.
 
-> [!NOTE]
-> Microsoft Purview classifies full names stored in the same column as well as first/last names in separate columns.
+-------------------------------------
 
 ### Person's Address
 Person's address classification is used to detect full address stored in a single column containing the following elements: House number, Street Name, City, State, Country, Zip Code. Person's Address classifier uses machine learning model that is trained on the global addresses data set in English language.
@@ -52,6 +63,8 @@ Currently the address model supports the following formats in the same column:
 - street, city, pincode or zipcode
 - landmark, city
 
+-------------------------------------
+
 ### Person's Gender
 Person's Gender machine learning model has been trained using US Census data and other public data sources in English language. It supports classifying 50+ genders out of the box.
 
@@ -60,6 +73,7 @@ Person's Gender machine learning model has been trained using US Census data and
 - gender
 - orientation
 
+-------------------------------------
 
 ### Person's Age
 Person's Age machine learning model detects age of an individual specified in various different formats. The qualifiers for days, months, and years must be in English language.
@@ -110,6 +124,8 @@ Person's Age machine learning model detects age of an individual specified in va