1000_Design_Patterns/Design Pattern - Generic - Loading Persistent Staging Area tables.md
Lines changed: 13 additions & 9 deletions
@@ -32,23 +32,27 @@ To support these requirements, there are three main processes in place within th
32
32
A safety check to prevent reloading of already loaded data. This prevents accidental reruns or out-of-order processing from causing errors due to key violations, in combination with the requirement to load multiple changes in a single run.
33
33
A verification that the change provided is really a change. This allows the scope of attributes to change, for instance when a specific attribute is removed from the HSTG table but still exists in STG. As part of the standard ETL requirements, the information provided is always compared against the target scope of attributes.
34
34
An ordering of changes (per key, over time) in which only the latest change is compared against the most recent change as available in HSTG.
35
-
35
+
36
36
The above three components together satisfy the HSTG template requirements. The ETL process can be described as loading delta sets into the source historical archive.
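
The three components above can be sketched in a few lines of Python. This is a minimal illustration, not the pattern's reference implementation: the function name `load_delta`, the column name `load_dts`, and the representation of rows as dictionaries are all assumptions made for the example. It also assumes one reading of the ordering step, in which each change is compared with its predecessor for the same key and the first change in the delta is compared with the most recent HSTG row.

```python
from itertools import groupby
from operator import itemgetter


def load_delta(stg_rows, hstg_rows, key, attributes, ts="load_dts"):
    """Return the STG rows that should be appended to HSTG (hypothetical helper)."""
    # Most recent HSTG row per key: the comparison target for the first change.
    latest = {}
    for row in hstg_rows:
        current = latest.get(row[key])
        if current is None or row[ts] > current[ts]:
            latest[row[key]] = row

    to_insert = []
    # Ordering: process changes per key, oldest first.
    ordered = sorted(stg_rows, key=itemgetter(key, ts))
    for k, changes in groupby(ordered, key=itemgetter(key)):
        previous = latest.get(k)
        for row in changes:
            # Safety check: skip anything at or before what is already loaded.
            if previous is not None and row[ts] <= previous[ts]:
                continue
            # Change verification: compare only attributes in the HSTG scope.
            if previous is not None and all(
                row[a] == previous[a] for a in attributes
            ):
                continue
            to_insert.append(row)
            previous = row
    return to_insert
```

Because only the names listed in `attributes` are compared, a column that still exists in STG but has been dropped from the HSTG scope never triggers a new HSTG row on its own.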
37
-
38
-
37
+
38
+
39
39
## Implementation Guidelines
40
40
Use a single ETL process, module or mapping to load data from a single source system table into the corresponding History Area table.
41
41
The Load Date / Time stamp is the logical ‘effective date’, and is copied from the Staging Area table. The Staging Area is responsible for correctly defining the time at which a change occurred.
42
42
Because of the differences between source interfaces, relying on the CDC Operation (i.e. Insert, Update or Delete) to detect change is not always possible. For this reason all History Area ETL processes need to contain a key lookup to compare values (detect changes).
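
One possible way to implement the key lookup is to compare a checksum over the in-scope attributes, so that the CDC Operation delivered by the source plays no part in the decision. The sketch below is an assumption-laden illustration: hashing is not prescribed by the pattern, and `row_checksum`, `is_change` and the `cdc_operation` column are hypothetical names. Handling of deletes is not covered here.

```python
import hashlib


def row_checksum(row, attributes):
    """Hash the in-scope attribute values of a row, in a fixed order."""
    # repr keeps None and '' distinct; the unit separator stops
    # ('ab', 'c') from colliding with ('a', 'bc').
    payload = "\x1f".join(repr(row.get(a)) for a in sorted(attributes))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def is_change(incoming, lookup, key, attributes):
    """True when the incoming row differs from the looked-up HSTG row."""
    existing = lookup.get(incoming[key])
    if existing is None:
        return True  # unknown key: always treated as a change
    return row_checksum(incoming, attributes) != row_checksum(existing, attributes)
```

Here `lookup` is assumed to map each key to the most recent History Area row for that key. A record flagged as an update by the source but carrying identical in-scope values is not treated as a change.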
43
43
44
44
## Consequences and Considerations
45
45
Loading processes towards the Integration Area can either be sourced from the Staging Area or the History Area depending on the scheduling requirements.
46
+
46
47
The History Area can be loaded in parallel with the Integration Area, or between the Staging Area and Integration Area.
47
-
Known uses
48
-
Every source table or file has an accompanying History Area table and ETL process.
48
+
49
+
The 'prevent reprocessing' functionality can also be implemented using the Event Date / Time attribute instead of the Load Date / Time attribute.
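
As a rough sketch of this option, the reprocessing check can take the name of the timestamp attribute as a parameter, so the same logic works with either attribute. The function name `filter_unprocessed` and the column names `load_dts` and `event_dts` are assumptions for illustration only.

```python
def filter_unprocessed(stg_rows, hstg_rows, key, ts="load_dts"):
    """Keep only STG rows newer than the latest HSTG row for the same key."""
    # Highest timestamp already present in HSTG, per key.
    high_water = {}
    for row in hstg_rows:
        if row[key] not in high_water or row[ts] > high_water[row[key]]:
            high_water[row[key]] = row[ts]
    return [
        r for r in stg_rows
        if r[key] not in high_water or r[ts] > high_water[r[key]]
    ]
```

Calling `filter_unprocessed(stg_rows, hstg_rows, "id", ts="event_dts")` applies the check on the Event Date / Time instead of the Load Date / Time.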
50
+
51
+
52
+
53
+
49
54
50
55
## Related Patterns
51
-
Implementation Pattern for SSIS 006 - Loading History Area tables
52
-
Design Pattern 003 – Mapping requirements.
53
-
Design Pattern 006 – Using Start, Process and End dates.