MicrosoftDocs
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/includes/5-map-data.md
Lines changed: 22 additions & 14 deletions b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/includes/5-map-data.md
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/Automapping_Create_Vocabulary.png
-159 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/Automapping_Create_Vocabulary.png
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/Automapping_New_Employee.png
-129 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/Automapping_New_Employee.png
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/CluedIn_Employee_Edge.png
-128 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/CluedIn_Employee_Edge.png
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/Process_Companies.png
-136 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/Process_Companies.png
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/Remove_PersonId_Code.png
-131 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/Remove_PersonId_Code.png
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/automapping-create-vocabulary.png
61.5 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/automapping-create-vocabulary.png
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/automapping-new-employee.png
31.9 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/automapping-new-employee.png
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/preview-employees.png
224 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/preview-employees.png
diff --git a/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/remove-personid-code.png
71.9 KB b/‎learn-pr/azure/building-end-to-end-data-governance-master-data-stack-with-microsoft-purview-cluedin/media/remove-personid-code.png
@@ -4,30 +4,30 @@ Interestingly, CluedIn doesn't mandate that we onboard this data in any specific
 
 So, we can onboard the data in the order we find them in the datasources tab:
 
-1. Navigate through Integrations in CluedIn to the ingestion endpoint called Companies.csv. You can do this by selecting **Integrations**, then **Data Sources**, then selecting the data source, then the endpoint that drops down beneath it, and then you'll see the copy data automation from Companies.csv. Select that and you'll see your data!
+1. Navigate through Integrations in CluedIn to the ingestion endpoint called Employees.csv. You can do this by selecting **Integrations**, then **Data Sources**, then selecting the data source, then the endpoint that drops down beneath it, and then you'll see the copy data automation from Employees.csv. Select that and you'll see your data!
 
     >[!NOTE]
     > It's worth mentioning that this isn't a CSV file hosted in CluedIn; it's the raw data hosted in the staging area in CluedIn. You'll see a preview of 10 rows of data, including the companies in our original CSV file.
 
-    :::image type="content" source="../media/Data_Preview_Arca.png" alt-text="Screenshot of the CluedIn window, showing the companies.csv window.":::
+    :::image type="content" source="../media/preview-employees.png" alt-text="Screenshot of the CluedIn window, showing the employees.csv window.":::
 
 1. CluedIn may prompt you to map the data, in which case you'll select the **Create Mapping** button. If not, you can select the **Map** tab and then the **Map data** button.
 
-    :::image type="content" source="../media/Automapping_New_Employee.png" alt-text="Screenshot of the Create Mapping page in CluedIn, showing the configure tab.":::
+    :::image type="content" source="../media/automapping-new-employee.png" alt-text="Screenshot of the Create Mapping page in CluedIn, showing the configure tab.":::
 
 1. Choose **Auto Mapping**.
 
-1. On the **Configure** tab, you'll choose **Entity Type**. Add **Company** here, and select the prompt that says **create "Company"** and select an icon for your new entity type.
+1. On the **Configure** tab, you'll choose **Entity Type**. Add **Employee** here, and select the prompt that says **create "Employee"** and select an icon for your new entity type.
 
     An Entity Type in CluedIn can be thought of as the Domain of the data we're working with. This is typically some type of noun such as a Person, Company, Vendor, or Dog.
 
     CluedIn provides all of the domains in the Microsoft Common Data Model by default and some of the domains you'll need to master won't always fall into the ones that already exist.
 
-1. Next, we're going to create a new **Vocabulary**. Think of this as the structure or "schema" of the domain. In the vocabulary box, you can search **Company** again and select the **create "Company"** option. We're going to create a new structure that matches our data. Follow the prompts to create a new vocabulary, and you'll see a preview of the new structure that contains all the columns names the CSV has.
+1. Next, we're going to create a new **Vocabulary**. Think of this as the structure or "schema" of the domain. In the vocabulary box, you can search for **Employee** again and select the **create "Employee"** option. We're going to create a new structure that matches our data. Follow the prompts to create a new vocabulary, and you'll see a preview of the new structure that contains all the column names the CSV has.
 
-    :::image type="content" source="../media/Automapping_Create_Vocabulary.png" alt-text="Screenshot of the Create Mapping page in CluedIn, showing the vocabulary selector on the configure tab.":::
+    :::image type="content" source="../media/automapping-create-vocabulary.png" alt-text="Screenshot of the Create Mapping page in CluedIn, showing the vocabulary selector on the configure tab.":::
 
-    Instead of creating the model of a **Company** upfront before ingesting data, we'll instead use the data to create our model and as we start to see more data on companies, we'll evolve and change the model as we need to. This agile approach yields an automated way for us to integrate the data - but also allows us to work with changes in model alongside the data.
+    Instead of creating the model of an **Employee** upfront before ingesting data, we'll use the data to create our model, and as we start to see more data on employees, we'll evolve and change the model as we need to. This agile approach gives us an automated way to integrate the data, and also lets us work with changes in the model alongside the data.
 
 1. Select **Create Mapping** to accept the default names. If you chose **Automated Mapping** as suggested, CluedIn will provide some automated *smarts* including the automated detection of potential unique identifiers of the data.
 
@@ -37,23 +37,31 @@ So, we can onboard the data in the order we find them in the datasources tab:
 
     1. Select the **Next** button and you'll be taken to the entity mapping where we can see how CluedIn will interpret some of the data in its core properties.
 
-    1. Notice that the Name of the entity in CluedIn has automatched to use the Company Name. Notice under the Origin Code Accordian that CluedIn has chosen the ID column as the Origin Code (Primary Key) and it has also chosen that both the ID and the Email most likely can be used as ways to uniquely look up each of the employees in this list.
+    1. Notice that the Name of the entity in CluedIn has been automatically matched to the Employee Name. Under the Origin Code accordion, CluedIn has chosen the ID column as the Origin Code (Primary Key), and it has also determined that both the ID and the Email can most likely be used to uniquely look up each of the employees in this list.
 
-    :::image type="content" source="../media/Remove_PersonId_Code.png" alt-text="Screenshot of the general details of the map entity tab of companies.csv.":::
+    :::image type="content" source="../media/remove-personid-code.png" alt-text="Screenshot of the general details of the map entity tab of employees.csv.":::
+
+1. Under **Entity Codes**, select the edit pencil next to **Person Id**.
+1. Select **Custom** and enter **EmployeeID** as the Custom Origin Value.
+    This tells CluedIn that this value originates from a company employee ID. For key values it can be important to update these origins with labels that specify their actual origins, so CluedIn can match them together. CluedIn's autogenerated origins will be based on the file, but if pieces of data across several files come from the same place, you should specify that with a custom origin.
 
 1. Select **Next** to go to the next section that is called **Edges**.
 
-    In this section we'll specify if there are columns that refer to records that aren't Companies, but are some type of relationship to another record. This could easily be a relationship to a Domain of the same type (Company) or in this example, it seems to be referring to a Person (from the column name person_id). We already know, although it's a person, we could also say that it's referring to an Employee. The good part about CluedIn's mapping process, is that we can choose one type now and we can always come back and change this later, once we know more about our data. CluedIn will handle all the cleanup necessary.
+    In this section we can specify if there are columns that refer to records that aren't Employees, but are some type of relationship to another record.
 
-    :::image type="content" source="../media/Person_Edge_Create.png" alt-text="Screenshot of the Add Edge menu on CluedIn.":::
+    This could easily be a relationship to a Domain of the same type (Employee) or to a different one. For example:
+    The Companies.csv file has a column called person_id. From the column name, it seems to refer to a Person. Although it's a person, we already know that it's also an Employee, so when we create a mapping for companies, we could say that this column refers to an Employee. The good part about CluedIn's mapping process is that we can choose one type now and always come back and change it later, once we know more about our data. CluedIn will handle all the cleanup necessary.
 
 1. Select **Finish** to leave the mapping.
 
 1. Select the **Process** tab, then select **Process**, and **Confirm**.
 
-    :::image type="content" source="../media/Process_Companies.png" alt-text="Screenshot of the are you sure you want to process page in CluedIn.":::
-
 1. Repeat the steps above for all the other files, until all the data is processed in the platform. For entity and vocabulary for the other data files, you can choose existing entities and vocabularies, or create new ones.
 
 >[!NOTE]
-> For the next part of the learning module, you will need to repeat the process above with the other 6 files with their respective Domains, Vocabularies, Identifiers and more.
+> For the next part of the learning module, you will need to repeat the process above with the other 6 files with their respective Domains, Vocabularies, Identifiers and more.
+
+>[!TIP]
+> When you're mapping your other files, follow these tips:
+>1. Update any key values with custom origin codes where multiple files reference the same information. For example, mark any employee GUIDs as the EmployeeID entity code.
+>1. Add edges where records of one type reference records of another type, such as Companies referencing Employee IDs.
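
Reviewer's aside on the custom origin step added in this change: the following is a minimal Python sketch of why a shared origin label helps records from different files match. It assumes, purely for illustration, that an identifier combines an entity type, an origin, and a value, and that autogenerated origins are derived from the file name (as the new text describes). The `EntityCode` class, the `shared_codes` helper, and the sample ID `42` are hypothetical and are not CluedIn's API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityCode:
    """Hypothetical stand-in for an identifier: (entity type, origin, value)."""
    entity_type: str
    origin: str
    value: str


def shared_codes(left, right):
    """Return the codes present in both collections (all three parts must match)."""
    return set(left) & set(right)


# Assumption: autogenerated origins are based on the source file,
# so the same employee ID produces two different codes.
auto_employees = [EntityCode("Employee", "Employees.csv", "42")]
auto_companies = [EntityCode("Employee", "Companies.csv", "42")]

# With a custom origin, both files label the value as coming from the same place.
custom_employees = [EntityCode("Employee", "EmployeeID", "42")]
custom_companies = [EntityCode("Employee", "EmployeeID", "42")]

print(len(shared_codes(auto_employees, auto_companies)))      # 0
print(len(shared_codes(custom_employees, custom_companies)))  # 1
```

In this sketch the file-based origins never overlap, while the custom `EmployeeID` origin gives both files an identical code, which is the behavior the tip about custom origin codes relies on.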