
Commit f95f497

Update README.md
1 parent: ac344f0

File tree

1 file changed (+15, -16)


README.md

Lines changed: 15 additions & 16 deletions
@@ -5,44 +5,43 @@ An example Data Vault data warehouse modelling Microsoft's Northwind sample data
 
 ### Purpose
 
-The objective of this project is to develop an accessible example data warehouse illustrating how to model the well-known Northwind sample database using the Data Vault methodology.
+The objective of this project is to develop an easily accessible example data warehouse illustrating how to model the well-known Northwind sample database using the Data Vault methodology.
 
-I intend the repository to mostly be of use to new Data Vault practitioners and go some way to helping join a few dots in the somewhat steep learning curve that comes with Data Vault modelling. Personally, I found the 'getting the data in' part of Data Vault intuitive enough, but the 'getting the data out' part a bit more challenging, and I hope the content of this repository is helpful in that regard.
+I intend the repository to mostly be of use to new Data Vault practitioners, providing a convenient option for hands-on experimentation. Hopefully it helps join a few dots on the somewhat steep learning curve that comes with Data Vault modelling.
 
 
-### Setup and Documentation
+### Setup
 
 1. Set up a SQL Server instance to hold the five component databases.
-2. Open SQL Server Management Studio and connect to your SQL Server instance.
-3. Run the following SQL scripts in this order to create the five databases.
+2. Run the following SQL scripts in this order to create the five databases.
    * SQL\DDL\Northwind\instnwnd.sql
    * SQL\DDL\Create_Database_Stage_Area.sql
    * SQL\DDL\Create_Database_Meta_Metrics_Error_Mart.sql
    * SQL\DDL\Create_Database_Data_Vault.sql
    * SQL\DDL\Create_Database_Information_Mart.sql
-4. Confirm setup is correct with an initial load. Run the following SQL script to execute the stored procedures for the Stage_Area and Data_Vault databases.
+3. Confirm setup is correct with an initial load. Run the following SQL script to execute the stored procedures for the Stage_Area and Data_Vault databases.
    * SQL\ETL\Load_Data_Vault.sql
-5. Additionally, check the contents of Meta_Metrics_Error_Mart.error.Error_Log for any unexpected errors.
-6. Basic documentation covering the data model and table mapping can be found in the Documentation directory.
-7. From here, follow your nose until it all makes sense. I recommend starting by altering some Northwind source data, rerunning the load, and following the changes through the layers into the Information_Mart views.
+4. Additionally, check the contents of Meta_Metrics_Error_Mart.error.Error_Log for any unexpected errors.
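
To make the post-load check in the new step 4 concrete, here is a minimal sanity-check sketch, whether you run the scripts in SSMS or via sqlcmd. Only Meta_Metrics_Error_Mart.error.Error_Log is named in the README; the INFORMATION_SCHEMA query is a generic way to confirm the Data_Vault tables exist, not something the repository itself prescribes.

```sql
-- Sanity-check sketch after the initial load (SQL Server / T-SQL).
-- Error_Log is named in the README above; everything else here is generic.

-- 1. The error log should contain nothing unexpected after a clean initial load.
SELECT *
FROM Meta_Metrics_Error_Mart.error.Error_Log;

-- 2. Generic check: list the tables created in the Data_Vault database.
SELECT TABLE_SCHEMA, TABLE_NAME
FROM Data_Vault.INFORMATION_SCHEMA.TABLES
WHERE TABLE_TYPE = 'BASE TABLE'
ORDER BY TABLE_SCHEMA, TABLE_NAME;
```
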
 
 
-### Caveats and Notes
+### Documentation
+
+Basic documentation covering the data model and table mapping can be found in the Documentation directory.
+
 
-* First and foremost, if you're interested in working with Data Vault professionally, there's no replacement for proper training and expert advice. I recommend seeking out a Data Vault Alliance course as your first port of call.
+### Caveats and Notes
 
 * My implementation here most closely adheres to Dan Linstedt's Data Vault 2.0 standard, but also takes some inspiration from the work of others, primarily Hans Hultgren and Patrick Cuba.
 
-* This implementation should not be considered 'textbook' or representative of a real production Data Vault data warehouse; it is intended as a simplified example for educational purposes. A full-scale production data warehouse would almost certainly integrate data from multiple source systems, use incremental loads, and employ parallelised ETL scheduling, as just a few examples. My goal here was to have something somebody can set up, load, and be getting info out of the mart with just SQL Server in minutes.
+* This implementation should not be considered 'textbook' or representative of a real production Data Vault data warehouse; it is intended as a simplified example for educational purposes. A full-scale production data warehouse would almost certainly integrate data from multiple source systems, use incremental loads, and employ parallelised ETL scheduling, as just a few examples. My goal here was to have something somebody can set up, load, and be getting info out of the information mart with just SQL Server in a matter of minutes.
 
-* This project was coded by hand, so do not be surprised if there are occasional syntax inconsistencies. In a production implementation, I would consider a code automation tool to be a non-negotiable expense (WhereScape 3D/RED, Vaultspeed, dbtvault, etc.).
+* This project was coded by hand, so no doubt there are a few inconsistencies in coding conventions. In a professional Data Vault implementation, I would consider a code automation tool to be a non-negotiable expense (WhereScape 3D/RED, Vaultspeed, dbtvault, etc.).
 
 * My approach to Satellite types is simplistic, illustrating a few basic functions they commonly serve: link effectivity (including deletions and driving key relationships), business key deletions, and historisation of contextual attributes.
 
-* The genuine business keys in Northwind are seldom uniquely indexed, so the surrogate IDs have been used where necessary.
+* Business keys in Northwind are seldom uniquely indexed, so the surrogate IDs have been used where necessary.
 
-* Information Mart objects are fully virtual, i.e. views. In addition to star schema fact, dimension, and bridge views, I have included 'replica' views, which, as the name suggests, exactly replicate all Northwind database source tables from their respective Data Vault objects.
+* Information Mart objects are fully virtual, i.e. views. In addition to star schema facts, dimensions, and bridges, I have included 'replica' views, which, as the name suggests, exactly replicate all Northwind database source tables from their respective Data Vault objects.
 
 * Information Mart views present 'current' data as it existed in the data warehouse as at the specific (load effective) time specified in the 'Value' field of the 'Information_Mart_Load_Effective_Datetime' (ID = 1) record in Meta_Metrics_Error_Mart.meta.Parameter. If this field is left blank, the current datetime will be employed. This functionality is enabled by the use of PIT tables built for each Hub.
 
-* I haven't invested a great deal of time just yet in tweaking indexes for Information_Mart query performance.
+* Remaining to-do items include fleshing out the Meta_Metrics_Error_Mart and tweaking indexing for Information_Mart query performance.
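
The Satellite bullet above mentions historisation of contextual attributes. As a rough illustration of the pattern (all table and column names here are hypothetical sketches, not taken from this repository), a Satellite keyed on a Hub hash key plus load datetime lets you recover the attribute values in effect at any chosen point in time:

```sql
-- Sketch only: hypothetical Satellite of contextual attributes for a Customer Hub.
-- Object and column names are illustrative, not the repository's own.
SELECT s.Customer_Hash_Key,
       s.Company_Name,
       s.Contact_Name
FROM Data_Vault.vault.Sat_Customer AS s
WHERE s.Load_Datetime = (
    -- Most recent Satellite row on or before the chosen effective time.
    SELECT MAX(s2.Load_Datetime)
    FROM Data_Vault.vault.Sat_Customer AS s2
    WHERE s2.Customer_Hash_Key = s.Customer_Hash_Key
      AND s2.Load_Datetime <= '2024-01-01'  -- example effective datetime
);
```

A PIT table per Hub effectively precomputes this 'latest row per key as at a time' lookup, which is what lets the Information Mart views stay fully virtual without repeating correlated subqueries like the one above.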
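
To view the Information Mart as at an earlier point in time, the Parameter bullet above implies an update along these lines. The 'Value' field and ID = 1 come from the README's description of meta.Parameter; the exact column types, and whether 'blank' means NULL or an empty string, may differ in the actual implementation:

```sql
-- Set the effective datetime used by the Information_Mart views.
-- 'Value' and ID = 1 are described in the README; exact definitions may differ.
UPDATE Meta_Metrics_Error_Mart.meta.Parameter
SET [Value] = '2023-06-30 00:00:00'
WHERE ID = 1;  -- 'Information_Mart_Load_Effective_Datetime'

-- Clear it again (here assumed to be NULL) to fall back to the current datetime.
UPDATE Meta_Metrics_Error_Mart.meta.Parameter
SET [Value] = NULL
WHERE ID = 1;
```
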
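
Finally, a rough sketch of what a 'replica' view does: join a Hub back to its Satellite through the Hub's PIT table, so the view reproduces the shape of the original Northwind source table as at the effective time. Every object name below is hypothetical; see the repository's Information_Mart views for the real definitions.

```sql
-- Sketch of a 'replica' view: rebuild a Northwind-shaped table from Data Vault objects.
-- All names are illustrative; the PIT row carries a pointer (the Satellite load
-- datetime) to the Satellite version in effect at the chosen time.
CREATE VIEW replica.Customers AS
SELECT h.Customer_ID,
       s.Company_Name,
       s.Contact_Name
FROM Data_Vault.vault.Hub_Customer AS h
JOIN Data_Vault.vault.Pit_Customer AS p
  ON p.Customer_Hash_Key = h.Customer_Hash_Key
JOIN Data_Vault.vault.Sat_Customer AS s
  ON s.Customer_Hash_Key = p.Customer_Hash_Key
 AND s.Load_Datetime = p.Sat_Customer_Load_Datetime;
```
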
