Commit ec94270

Documentation updates
1 parent 01167ea commit ec94270

6 files changed: +22 -45 lines changed

Documentation/DIRECT_Setup_Tips.md

Lines changed: 0 additions & 7 deletions
This file was deleted.

Documentation/DIRECT_Functional_Design.md renamed to Documentation/Documentation.md

Lines changed: 6 additions & 29 deletions
@@ -1,41 +1,18 @@
11
# Introduction of DIRECT
22

3-
DIRECT, the Data Integration & Execution Control Tool, is a data integration control and execution metadata model. It is a core and stand-alone component of the Data Integration Framework.
3+
DIRECT, the Data Integration & Execution Control Tool, is a metadata model for controlling and executing data integration processes ('processes'). A robust data logistics control framework is essential for many data solutions, and DIRECT can serve as such a framework.
44

5-
Every Data Integration / Extract Transform and Load (ETL) process is linked to this model which provides the orchestration and management capabilities for data integration.
5+
Every data logistics process, such as data integration or Extract, Transform, and Load (ETL), can be registered in the DIRECT framework. DIRECT provides orchestration and management capabilities for data integration, ensuring smooth execution and control.
66

7-
Data Integration in this context is a broad definition covering various implementation techniques such as ELT (Extract Load, Transform - push-down into SQL or underlying processing) and LETS (Load-Extract-Transform-Store).
7+
DIRECT features a database repository where each data logistics process is registered, and every runtime execution is tracked. This repository serves as a valuable source of information on platform performance, usage trends, and platform growth in terms of both time and size. At its core, DIRECT focuses on defining and orchestrating processes, while also offering advanced features like continuous and parallel processing, as well as transaction control.
88

9-
Data Integration in this document essentially covers all processes that 'touch' data.
10-
11-
The DIRECT repository captures Data Integration process information, and is an invaluable source of information to monitor how the system is expanding (time, size) but also to drive and monitor processes - a fundamental requirement for parallel processing and transaction control.
12-
13-
The objective of the DIRECT Framework is to provide a structured approach to describing and recording Data integration processes that can be made up of many separate components. This is to be done in such a way that they can be represented and managed as a coherent system.
14-
15-
## Overview
16-
17-
This document covers the design and specifications for the DIRECT metadata repository and the integration (events) for data integration processes.
18-
19-
The DIRECT framework covers a broad variety of process details, including (but not limited to):
20-
21-
* What process information will be stored and how.
22-
* How a process is integrated into the various defined Layers and Areas.
23-
* Of what entities the metadata model consists,
24-
* The available procedures for managing the data solution.
25-
* Concepts and principles.
26-
* The logic which can be used to control the processes.
27-
* Housekeeping functions.
28-
* Reporting.
29-
30-
The position of the control and execution framework in the overall architecture is:
31-
32-
![Positioning](Images/Direct_Documentation_Figure1_Positioning.png)
9+
The primary goal of the DIRECT framework is to provide a structured approach to describing and recording data logistics processes, which may consist of many distinct components. This structure allows these processes to be represented and managed as a cohesive system.
3310

3411
## Concepts
3512

3613
### Purpose
3714

38-
The process control framework supports the ability to trace back what data has been loaded, when and in what way for every individual data integration process.
15+
The framework provides the ability to trace back what data has been processed, when, and in what way for every individual data logistics process.
3916

4017
Any single data element (e.g. an attribute value in a table) should be auditable. It should be possible to track which processes have run and led to the visible result.
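
The auditability requirement above can be illustrated with a minimal sketch. It assumes a simplified, hypothetical repository: the `process`, `process_execution` and `customer` tables, their columns, and all function names are invented for illustration and are not the DIRECT physical model. The idea shown is only that each loaded row carries the identifier of the execution that wrote it, so a visible value can be traced back to a registered process and a point in time.

```python
import sqlite3

# Hypothetical, simplified schema for illustration only; NOT the DIRECT model.
SCHEMA = """
CREATE TABLE process (
    process_id   INTEGER PRIMARY KEY,
    process_name TEXT NOT NULL UNIQUE
);
CREATE TABLE process_execution (
    execution_id INTEGER PRIMARY KEY,
    process_id   INTEGER NOT NULL REFERENCES process(process_id),
    started_at   TEXT NOT NULL,
    status       TEXT NOT NULL
);
CREATE TABLE customer (
    customer_id  INTEGER NOT NULL,
    name         TEXT NOT NULL,
    execution_id INTEGER NOT NULL REFERENCES process_execution(execution_id)
);
"""


def open_repository():
    """Create an in-memory database holding the illustrative schema."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA)
    return con


def register_process(con, name):
    """Register a process once and return its identifier."""
    cur = con.execute("INSERT INTO process (process_name) VALUES (?)", (name,))
    return cur.lastrowid


def start_execution(con, process_id, started_at):
    """Record one runtime execution of a registered process."""
    cur = con.execute(
        "INSERT INTO process_execution (process_id, started_at, status) "
        "VALUES (?, ?, 'Running')",
        (process_id, started_at),
    )
    return cur.lastrowid


def load_customer(con, execution_id, customer_id, name):
    """Write a data row stamped with the execution that produced it."""
    con.execute(
        "INSERT INTO customer (customer_id, name, execution_id) VALUES (?, ?, ?)",
        (customer_id, name, execution_id),
    )


def trace_customer(con, customer_id):
    """Return (process_name, started_at) for every execution that wrote this row."""
    return con.execute(
        "SELECT p.process_name, e.started_at "
        "FROM customer c "
        "JOIN process_execution e ON e.execution_id = c.execution_id "
        "JOIN process p ON p.process_id = e.process_id "
        "WHERE c.customer_id = ? "
        "ORDER BY e.execution_id",
        (customer_id,),
    ).fetchall()
```

In this sketch the audit column on the data table is the only link needed; everything else about the run lives in the repository tables.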
4118

@@ -118,7 +95,7 @@ The following diagram illustrates the layers and technologies involved in this p
11895

11996
## Rollback and re-processing
12097

121-
When processing errors occur (a data integration process fails), relevant information about the failure is recorded in the repository by the framework. This information can be used to recover from data loading errors and set the data solution back into the original state prior to the occurrence of the error.
98+
When processing errors occur (a process fails), relevant information about the failure is recorded in the repository by the framework. This information can be used to recover from data loading errors and set the data solution back into the original state prior to the occurrence of the error.
12299

123100
This 'rollback' can be configured at both Batch and Module level.
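
As a rough sketch of what rollback at the two levels could look like, the example below assumes a hypothetical layout in which every data row records the module execution that wrote it, and every module execution belongs to a batch execution. The table names (`module_execution`, `target_data`), the status value, and both functions are assumptions made for illustration; they are not the framework's actual procedures.

```python
import sqlite3

# Hypothetical tables for illustration only; NOT the DIRECT model.
ROLLBACK_SCHEMA = """
CREATE TABLE module_execution (
    module_execution_id INTEGER PRIMARY KEY,
    batch_execution_id  INTEGER NOT NULL,
    status              TEXT NOT NULL
);
CREATE TABLE target_data (
    payload             TEXT NOT NULL,
    module_execution_id INTEGER NOT NULL
        REFERENCES module_execution(module_execution_id)
);
"""


def rollback_module(con, module_execution_id):
    """Remove the rows written by one module execution and flag it."""
    with con:  # commits on success, rolls back the transaction on error
        con.execute(
            "DELETE FROM target_data WHERE module_execution_id = ?",
            (module_execution_id,),
        )
        con.execute(
            "UPDATE module_execution SET status = 'RolledBack' "
            "WHERE module_execution_id = ?",
            (module_execution_id,),
        )


def rollback_batch(con, batch_execution_id):
    """Roll back every module execution that ran as part of one batch execution."""
    rows = con.execute(
        "SELECT module_execution_id FROM module_execution "
        "WHERE batch_execution_id = ?",
        (batch_execution_id,),
    ).fetchall()
    for (module_execution_id,) in rows:
        rollback_module(con, module_execution_id)
```

The sketch only handles inserted rows; restoring updated or deleted values would need additional information that is not modelled here.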
124101

Documentation/Installation.md

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
1+
# Installing DIRECT
2+
3+
The DIRECT framework can be deployed from the Visual Studio solution/project, using the publish function. Alternatively, the project can be compiled to a DACPAC and deployed via a command-line tool such as SqlPackage.
4+
5+
As part of the installation, post-deployment scripts are run to provide the standard framework contents. Pre- and post-DACPAC deployment placeholders are also available.

Documentation/Model.md

Lines changed: 3 additions & 1 deletion
@@ -1,4 +1,6 @@
1-
# Direct Framework ERD Page
1+
# Direct Framework Physical Model
2+
3+
This section contains the DIRECT physical model in MermaidChart format. The contents below can be rendered or pasted in the online editor (https://www.mermaidchart.com/).
24

35
```mermaid
46
---

README.md

Lines changed: 8 additions & 8 deletions
@@ -1,17 +1,17 @@
11
# DIRECT
22

3-
Data Integration Run-time Execution Control Tool
3+
The Data Integration Run-time Execution Control Tool (DIRECT) is a framework for defining, orchestrating, and logging data logistics processes and workflows ('ETL', 'pipelines') so that a full audit trail is created.
4+
5+
The framework provides a mechanism to administer the individual processes or workflows, and track their runtime execution.
46

57
This repository contains the following:
68

7-
* Data Model (DDL and DML)
8-
* Management tool (C#)
9-
* SSIS examples with Biml code generation for DIRECT hooks
10-
* T-SQL hooks
11-
* Functional specification
12-
* Support SQL scripts
9+
* Data Model
10+
* Tables and scripts (DDL and DML)
11+
* Examples and support scripts
12+
* Documentation
13+
* Testing script to validate any framework changes
1314

1415
## Learn more
1516

1617
* GitHub information: [https://github.com/data-solution-automation-engine](https://github.com/data-solution-automation-engine)
17-
* More information: [www.roelantvos.com](www.roelantvos.com)
