Commit 8a70139

File rename, high level tidy-up
1 parent ab23977 commit 8a70139

41 files changed: +146 -279 lines changed

README.md

Lines changed: 14 additions & 15 deletions
@@ -1,4 +1,5 @@
 # Data Integration framework - Overview
+
 ## Introduction
 
 The Data Integration framework provides a software and methodology independent, structured approach to developing data processes.
@@ -15,15 +16,15 @@ On several occasions, the Data Integration framework makes mention of the ETL pr
 
 *‘If we want better performance we can buy better hardware, unfortunately we cannot buy a more maintainable or reliable system’.*
 
-Design and implementation of data integration can be a labour-intensive activity that typically consumes large amounts of effort in Data Warehouse and data integration projects.
+Design and implementation of data integration can be a labour-intensive activity that typically consumes large amounts of effort in Data Warehouse and data integration projects.
 
-Over time, as requirements change and enterprises become more data-driven, the architecture faces challenges in the complexity, consistency and flexibility in the design (and maintenance) of the data integration flows.
+Over time, as requirements change and enterprises become more data-driven, the architecture faces challenges in the complexity, consistency and flexibility in the design (and maintenance) of the data integration flows.
 
 These changes can include changes in latency and availability requirements, a bigger variety of sources or the need to expose information in different ways. This typically occurs when adoption of data and information products (i.e. BI, Analytics) matures within an organisation and the need to have up-to-date information becomes more mission critical.
 
 Using a standard data integration approach will meet these challenges by providing structure, flexibility and scalability for the design of data flows.
 
-In a more traditional configuration, data solutions are often designed to store structured data for strategic decision making. This type of solution allows a small number of (expert) users to analyse (historical) data and define reports.
+In a more traditional configuration, data solutions are often designed to store structured data for strategic decision making. This type of solution allows a small number of (expert) users to analyse (historical) data and define reports.
 
 Data is typically periodically extracted, cleansed, integrated and transformed in a centralised Data Warehouse from a heterogeneous set of sources. The focus for ETL in these designs is typically on ‘correct functionality’ and ‘adequate performance’ - but not necessarily on design elements that are equally important for success.
 
@@ -41,28 +42,26 @@ The core body of knowledge sits in the various *Design Patterns* (details of spe
 
 The idea is that Design- and Solution patterns are continuously updated and added to. A typical solution design would select the relevant patterns to define the architecture - captured in the Solution Architecture design artefact.
 
-# Data Integration framework components
+## Data Integration framework components
 
 The diagram below outlines the Data Integration framework components. These are all required to define a data solution that supports Data Warehouse Automation.
 
 The idea is to enable a standard and structured way for documenting decisions related to system design and operation.
 
 ![1547519339316](./Images/5C1547519339316.png)
 
-
-
-- **Reference Solution Architecture**; a blueprint for a common data solution architecture such as Data Warehouses, Data Hubs etc. The corresponding documents outline the various layers and areas that define the data solution.
-- **Reference Technical Architecture**; capturing the technical details relevant to the Solution Architecture. The intent for this template is to capture the infrastructure and software specifics, as well as context for the physical data models and database / data platform configuration. The Technical Architecture also covers details around the implementation of security, encryption and retention approaches.
-- **Design Patterns**; documentation of key design decisions and backgrounds on design principles: the 'how-to's'. This includes the application of data integration and modelling concepts. Design Patterns follow a defined template and are centrally stored and managed.
-- **Solution Patterns**; the practical details on how to implement concepts explained in a Design Pattern for a given technology. Similar to Design Patterns, the Solution Patterns all follow the same template. In many cases a single Design Pattern is referred to by multiple Solution Patterns, all of which document how to implement the concept for a specific technology.
-- **Documentation templates**, standards and conventions; modelling and technical conventions.
-- **ETL templates & patterns**; technical templates that can be used as blueprints to generate data integration processes with or against.
-- **ETL mapping metadata**; approaches for managing the source-to-target mappings - vital ETL metadata to enable Data Warehouse Automation / ETL generation.
-- **ETL process control framework**; this is the runtime execution, logging and monitoring of data integration processes, including recovery and orchestration. This is further detailed in the DIRECT Github (Data Integration Runtime Execution and Control framework). DIRECT includes a repository for ETL control, integration hooks for ETL processes and automation scripts.
+* **Reference Solution Architecture**; a blueprint for a common data solution architecture such as Data Warehouses, Data Hubs etc. The corresponding documents outline the various layers and areas that define the data solution.
+* **Reference Technical Architecture**; capturing the technical details relevant to the Solution Architecture. The intent for this template is to capture the infrastructure and software specifics, as well as context for the physical data models and database / data platform configuration. The Technical Architecture also covers details around the implementation of security, encryption and retention approaches.
+* **Design Patterns**; documentation of key design decisions and backgrounds on design principles: the 'how-to's'. This includes the application of data integration and modelling concepts. Design Patterns follow a defined template and are centrally stored and managed.
+* **Solution Patterns**; the practical details on how to implement concepts explained in a Design Pattern for a given technology. Similar to Design Patterns, the Solution Patterns all follow the same template. In many cases a single Design Pattern is referred to by multiple Solution Patterns, all of which document how to implement the concept for a specific technology.
+* **Documentation templates**, standards and conventions; modelling and technical conventions.
+* **ETL templates & patterns**; technical templates that can be used as blueprints to generate data integration processes with or against.
+* **ETL mapping metadata**; approaches for managing the source-to-target mappings - vital ETL metadata to enable Data Warehouse Automation / ETL generation.
+* **ETL process control framework**; this is the runtime execution, logging and monitoring of data integration processes, including recovery and orchestration. This is further detailed in the DIRECT Github (Data Integration Runtime Execution and Control framework). DIRECT includes a repository for ETL control, integration hooks for ETL processes and automation scripts.
 
 In short, the reference Solution Architecture and the corresponding Technical Architecture provide a common framework for all design and development effort. Design Patterns provide the details of how selected concepts are approached, including considerations and pros and cons. Solution Patterns describe how these approaches are best translated in the selected technology.
 
-# Standards for Design and Solution Patterns
+## Standards for Design and Solution Patterns
 
 The pattern structure (Design and Solution Pattern layout) is always as follows:
 
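
Editorial aside on the README text in this diff (not part of the commit, and not the pattern layout the last README line introduces): the component list describes ETL mapping metadata as the source-to-target mappings that enable ETL generation. The snippet below is a minimal sketch of that idea, assuming a very simple mapping structure. The class names, the `generate_insert_select` function, and the table and column names are all hypothetical and are not taken from the framework or from DIRECT.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class ColumnMapping:
    """One source-to-target column mapping (hypothetical structure)."""
    source_column: str
    target_column: str


@dataclass(frozen=True)
class TableMapping:
    """Source-to-target mapping for one table (hypothetical structure)."""
    source_table: str
    target_table: str
    columns: Tuple[ColumnMapping, ...]


def generate_insert_select(mapping: TableMapping) -> str:
    """Render a plain INSERT ... SELECT statement from mapping metadata."""
    if not mapping.columns:
        raise ValueError("a table mapping needs at least one column mapping")
    target_cols = ", ".join(c.target_column for c in mapping.columns)
    source_cols = ", ".join(c.source_column for c in mapping.columns)
    return (
        f"INSERT INTO {mapping.target_table} ({target_cols})\n"
        f"SELECT {source_cols}\n"
        f"FROM {mapping.source_table};"
    )


# Illustrative metadata; table and column names are invented for this sketch.
customer_mapping = TableMapping(
    source_table="SRC_CUSTOMER",
    target_table="STG_CUSTOMER",
    columns=(
        ColumnMapping("CUST_ID", "CUSTOMER_ID"),
        ColumnMapping("CUST_NAME", "CUSTOMER_NAME"),
    ),
)

print(generate_insert_select(customer_mapping))
```

The point of the sketch is only that the mapping is data and the statement is derived from it. A real template would typically also handle data types, keys, change detection and logging hooks, none of which are modelled here.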

design-patterns/Design Pattern - Generic - Loading Landing tables.md

Lines changed: 0 additions & 55 deletions
This file was deleted.

design-patterns/Design Pattern - Generic - Loading Staging Area tables.md

Lines changed: 0 additions & 48 deletions
This file was deleted.

design-patterns/Design Pattern - Interfacing - Loading Staging Area tables from transactional CDC.md

Lines changed: 0 additions & 48 deletions
This file was deleted.
