Commit bda4fa9

Merge pull request #47 from WeBankFinTech/1.1.0: merge 1.1.0 into the main branch.

2 parents 9567b6b + e7d9db8, commit bda4fa9

File tree: 147 files changed, +5938 −664 lines


README-ZH.md

Lines changed: 1 addition & 2 deletions

@@ -75,12 +75,11 @@ DataSphereStudio

- [Exchangis AppConn plugin installation guide for DSS](https://github.com/WeDataSphere/Exchangis/blob/master/docs/zh_CN/ch1/exchangis_appconn_deploy_cn.md)

(removed) - [Qualitis AppConn plugin installation guide for DSS](https://github.com/WeBankFinTech/Qualitis/blob/master/docs/zh_CN/ch1/%E6%8E%A5%E5%85%A5%E5%B7%A5%E4%BD%9C%E6%B5%81%E6%8C%87%E5%8D%97.md)

- [Streamis AppConn plugin installation guide for DSS](https://github.com/WeBankFinTech/Streamis/blob/main/docs/zh_CN/0.2.0/development/StreamisAppConn%E5%AE%89%E8%A3%85%E6%96%87%E6%A1%A3.md)

- [Prophecis AppConn plugin installation guide for DSS](https://github.com/WeBankFinTech/Prophecis/blob/master/docs/zh_CN/Deployment_Documents/Prophecis%20Appconn%E5%AE%89%E8%A3%85%E6%96%87%E6%A1%A3.md)

(added) - [Dolphinscheduler AppConn plugin installation guide for DSS](zh_CN/安装部署/DolphinScheduler插件安装文档.md)

## Who is using DataSphere Studio
README.md

Lines changed: 50 additions & 55 deletions (large diff not rendered by default)
Lines changed: 92 additions & 0 deletions

FlowExecution
-------------------------

FlowExecution is the real-time workflow execution module. It provides the interface services for executing workflows, reuses the Entrance module of Linkis, and inherits and adapts several linkis-entrance classes. For example, it inherits PersistenceEngine to persist DSS workflow tasks, and overrides the EntranceExecutionJob class to implement workflow node execution, state transitions, kill, and other operations. The parsed and processed tasks are finally submitted to the Linkis service through linkis-computation-client.

### 1. Business Architecture
User-facing function points:

| Component name | First-level module | Second-level module | Function point |
|------------------|--------------------|---------------------|------------------------|
| DataSphereStudio | Workflow | Workflow Execution | Execute |
| | | | Selected execution |
| | | | Rerun failed nodes |
| | | | View execution history |

![](images/workflow_execution_uml.png)
### 2. FlowExecution Core Interfaces and Classes

| Core interface/class | Core function |
|----------------------|---------------|
| FlowEntranceRestfulApi | Provides the RESTful interface for workflow execution, e.g. task execution and status query |
| WorkflowPersistenceEngine | Overrides the persist method of the Linkis PersistenceEngine; converts the JobRequest into a WorkflowTask and persists it to the DSS workflow_task table |
| WorkflowQueryService | Provides interface services such as workflow task creation and status update |
| WorkflowExecutionInfoService | Provides services such as creation and query of workflow execution information |
| FlowEntranceEngine | Executor inherited from Linkis; implements the execute method, which calls flowParser to parse the flow and uses the result as the entry of the runJob method |
| FlowExecutionExecutorManagerImpl | Inherits the Linkis ExecutorManager and overrides the createExecutor method so that the created executor is the DSS FlowEntranceEngine |
| FlowExecutionParser | CommonEntranceParser inherited from Linkis; overrides the parseToJob method to return the DSS FlowEntranceJob |
| DefaultFlowExecution | Provides the runJob() method, which moves all scheduled nodes of a FlowEntranceJob into the runnable state and adds running nodes to a task queue; a scheduled thread pool polls the Linkis task status of each node and removes completed tasks from the queue |
| FlowEntranceJobParser | Defines the parse() method; its subclasses provide the concrete parsing implementations, such as parsing the workflow from the JobRequest and parsing the params attribute of workflow nodes |
| FlowEntranceJob | EntranceExecutionJob inherited from Linkis; overrides run(), kill(), onStatusChanged() and other methods to provide the job execution entry and status callback handling |
| DefaultNodeRunner | The node task running thread; converts a node task into a LinkisJob and submits it to Linkis, and provides methods to query task status and cancel tasks on Linkis |
| NodeSkipStrategy | Strategy interface for deciding whether a node is skipped; has three implementations: execution, rerun on failure, and selected execution |
| FlowContext | Holds the workflow context state and provides methods such as getRunningNodes and getSucceedNodes to obtain the workflow's running and succeeded nodes |
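The three node-skipping strategies named above can be illustrated with a minimal sketch. The interface shape, method signature, and class names below are simplified assumptions for illustration, not the actual DSS API.

```java
import java.util.Set;

// Hypothetical sketch of the NodeSkipStrategy idea described in the table above.
public class NodeSkipStrategyDemo {

    interface NodeSkipStrategy {
        // Decide whether `node` should be skipped for this run.
        boolean isSkipped(String node, Set<String> selectedNodes, Set<String> previouslySucceeded);
    }

    // "Execution": a normal run executes every node.
    static final NodeSkipStrategy EXECUTION =
            (node, selected, succeeded) -> false;

    // "Rerun on failure": skip nodes that already succeeded in the previous run.
    static final NodeSkipStrategy RERUN_ON_FAILURE =
            (node, selected, succeeded) -> succeeded.contains(node);

    // "Selected execution": run only the nodes the user checked.
    static final NodeSkipStrategy SELECTED_EXECUTION =
            (node, selected, succeeded) -> !selected.contains(node);

    public static void main(String[] args) {
        Set<String> selected = Set.of("nodeB");
        Set<String> succeeded = Set.of("nodeA");

        System.out.println(EXECUTION.isSkipped("nodeA", selected, succeeded));          // false
        System.out.println(RERUN_ON_FAILURE.isSkipped("nodeA", selected, succeeded));   // true
        System.out.println(SELECTED_EXECUTION.isSkipped("nodeA", selected, succeeded)); // true
    }
}
```

Keeping the decision behind one interface lets FlowEntranceJob pick a strategy per run without changing the execution loop.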
### 3. Workflow Execution Process

![](images/flowexecution.drawio.png)

### 4. Data Structure / Storage Design

Workflow execution information table:
```sql
CREATE TABLE `dss_workflow_execute_info` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `task_id` bigint(20) NOT NULL COMMENT 'task id',
  `status` int(1) DEFAULT NULL COMMENT 'status, 0: failed, 1: success',
  `flow_id` bigint(20) NOT NULL COMMENT 'flowId',
  `version` varchar(200) DEFAULT NULL COMMENT 'Workflow bml version number',
  `failed_jobs` text COMMENT 'Nodes that failed',
  `Pending_jobs` text COMMENT 'Nodes not yet executed',
  `skipped_jobs` text COMMENT 'Nodes that were skipped',
  `succeed_jobs` text COMMENT 'Nodes that succeeded',
  `createtime` datetime NOT NULL COMMENT 'create time',
  `running_jobs` text COMMENT 'Nodes currently running',
  `updatetime` datetime DEFAULT NULL COMMENT 'update time',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8;
```
Workflow task information table:

```sql
CREATE TABLE `dss_workflow_task` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'Primary key, auto increment',
  `instance` varchar(50) DEFAULT NULL COMMENT 'An instance of Entrance, consisting of the IP address and port of the Entrance server',
  `exec_id` varchar(50) DEFAULT NULL COMMENT 'Execution ID, consisting of the jobID (generated by the scheduler), executeApplicationName, creator and instance',
  `um_user` varchar(50) DEFAULT NULL COMMENT 'User name',
  `submit_user` varchar(50) DEFAULT NULL COMMENT 'Submitting user name',
  `execution_code` text COMMENT 'Script to run. When it exceeds 6000 lines, the script is stored in HDFS and its file path is stored in the database',
  `progress` float DEFAULT NULL COMMENT 'Script execution progress, between zero and one',
  `log_path` varchar(200) DEFAULT NULL COMMENT 'File path of the log files',
  `result_location` varchar(200) DEFAULT NULL COMMENT 'File path of the result',
  `status` varchar(50) DEFAULT NULL COMMENT 'Script execution status; one of: Inited, WaitForRetry, Scheduled, Running, Succeed, Failed, Cancelled, Timeout',
  `created_time` datetime DEFAULT NULL COMMENT 'Creation time',
  `updated_time` datetime DEFAULT NULL COMMENT 'Update time',
  `run_type` varchar(50) DEFAULT NULL COMMENT 'Further refinement of execute_application_name, e.g. specifying whether to run pySpark or SparkR',
  `err_code` int(11) DEFAULT NULL COMMENT 'Error code, generated when the execution of the script fails',
  `err_desc` text COMMENT 'Error description, generated when the execution of the script fails',
  `execute_application_name` varchar(200) DEFAULT NULL COMMENT 'The service a user selects, e.g. Spark, Python, R, etc.',
  `request_application_name` varchar(200) DEFAULT NULL COMMENT 'Parameter name for creator',
  `script_path` varchar(200) DEFAULT NULL COMMENT 'Path of the script in the workspace',
  `params` text COMMENT 'Configuration items of the parameters',
  `engine_instance` varchar(50) DEFAULT NULL COMMENT 'An instance of the engine, consisting of the IP address and port of the engine server',
  `task_resource` varchar(1024) DEFAULT NULL,
  `engine_start_time` time DEFAULT NULL,
  `label_json` varchar(200) DEFAULT NULL COMMENT 'label json',
  PRIMARY KEY (`id`),
  KEY `created_time` (`created_time`),
  KEY `um_user` (`um_user`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4;
```
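The `status` column comment above enumerates the task lifecycle states. A small sketch shows how the polling described for DefaultFlowExecution could decide when to stop watching a task; the enum type name and helper method are illustrative assumptions, not DSS code, though the state names come from the column comment.

```java
// Illustrative enum for the status values listed in the dss_workflow_task
// `status` column comment; the type name and isCompleted() helper are assumptions.
public enum TaskStatus {
    INITED, WAIT_FOR_RETRY, SCHEDULED, RUNNING,
    SUCCEED, FAILED, CANCELLED, TIMEOUT;

    // Terminal states: a polling loop can remove such tasks from its queue.
    public boolean isCompleted() {
        switch (this) {
            case SUCCEED:
            case FAILED:
            case CANCELLED:
            case TIMEOUT:
                return true;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(RUNNING.isCompleted()); // false
        System.out.println(SUCCEED.isCompleted()); // true
    }
}
```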
Lines changed: 156 additions & 0 deletions

Orchestrator Architecture Design
-------------------------

Orchestrator is the orchestration module. It provides interface services for adding, deleting, modifying, querying, importing, and exporting orchestrations under a project, and serves as the unified entry for each orchestration implementation (such as workflow). It connects upward to the project service and downward to the concrete orchestration implementation (such as the workflow service).

### 1. Business Architecture

User-facing function points:
| Component name | First-level module | Second-level module | Function point |
|------------------|--------------------|---------------------|----------------|
| DataSphereStudio | Orchestration mode | Create orchestration | Create a new orchestration |
| | | Edit orchestration | Edit the orchestration's field information |
| | | Delete orchestration | Delete an orchestration |
| | | Open orchestration | Open an orchestration for drag-and-drop development of its nodes |
| | | View orchestration version list | View the historical versions of an orchestration; a version can be opened and viewed, or rolled back to |
| | | Orchestration rollback | Roll back to a historical version (a new version is added each time the orchestration is published) |

![](images/orchestrator_uml.png)
### 2. Orchestrator Architecture

![](images/orchestrator_arch.png)

### 3. Orchestrator Module Design

Core classes of the second-level modules:
**dss-orchestrator-core**

The core module of Orchestrator. It defines top-level interfaces such as DSSOrchestrator, DSSOrchestratorContext, and DSSOrchestratorPlugin.

| Core top-level interface/class | Core functionality |
|--------------------------------|--------------------|
| DSSOrchestrator | Defines methods for obtaining the orchestration's properties, such as the associated AppConn and context information |
| DSSOrchestratorContext | Defines the orchestrator's context information and provides methods such as obtaining orchestration plugin classes |
| DSSOrchestratorPlugin | Top-level interface for orchestration plugins; defines the init method. Its subclasses include the import and export plugin implementations |
| DSSOrchestratorRelation | Defines methods for obtaining the orchestration's associated properties, such as the orchestration mode and the AppConn associated with the orchestration |
**dss-orchestrator-db**

Defines the unified entry of the orchestration DAO layer methods.
**dss-orchestrator-conversion-standard**

Defines the interface specification for converting orchestrations to third-party systems, including top-level interfaces such as ConversionOperation, ConversionRequestRef, and ConversionService.

| Core interface/class | Core function |
|----------------------|---------------|
| ConversionOperation | Defines the core convert method; takes a ConversionRequestRef as input and returns a ResponseRef |
| DSSToRelConversionRequestRef | Defines the basic parameters of a conversion request, such as userName, workspace, and dssProject |
| ConversionIntegrationStandard | Defines core methods such as getDSSToRelConversionService, which supports converting a DSS orchestration into a workflow of the scheduling system |
| ConversionService | Defines methods for obtaining labels and the ConversionIntegrationStandard |
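The conversion contract described above can be sketched as follows. The names come from the table, but the field layout, return type contents, and the toy implementation are illustrative assumptions rather than the actual DSS API.

```java
// Minimal sketch of the DSS-to-third-party conversion contract described above.
// Field names and the ResponseRef shape are illustrative assumptions.
public class ConversionDemo {

    static class ConversionRequestRef {
        final String userName; final String workspace; final String dssProject;
        ConversionRequestRef(String u, String w, String p) {
            userName = u; workspace = w; dssProject = p;
        }
    }

    static class ResponseRef {
        final String convertedRef;
        ResponseRef(String convertedRef) { this.convertedRef = convertedRef; }
    }

    interface ConversionOperation {
        // Converts a DSS orchestration into a third-party system's representation.
        ResponseRef convert(ConversionRequestRef ref);
    }

    public static void main(String[] args) {
        // Toy implementation: "convert" the orchestration to a scheduler workflow reference.
        ConversionOperation op =
                ref -> new ResponseRef(ref.dssProject + "/workflow-for-" + ref.userName);
        System.out.println(op.convert(
                new ConversionRequestRef("alice", "ws1", "projA")).convertedRef);
        // prints projA/workflow-for-alice
    }
}
```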
**dss-orchestrator-loader**

Loads orchestration-related AppConns, such as workflow-appconn, and loads subclasses of DSSOrchestratorPlugin, such as ExportDSSOrchestratorPlugin.

| Core interface/class | Core function |
|----------------------|---------------|
| OrchestratorManager | Defines the getOrCreateOrchestrator method, which loads the AppConn associated with the orchestrator; the result is cached after the first load to avoid repeated loading |
| LinkedAppConnResolver | Defines the interface for obtaining an AppConn by user |
| SpringDSSOrchestratorContext | Its initialization loads all subclasses of DSSOrchestratorPlugin and caches them in memory |
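The load-once-and-cache behavior described for OrchestratorManager is a classic get-or-create pattern; a minimal sketch, with simplified class and method names that are assumptions rather than DSS code:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch of the caching described for getOrCreateOrchestrator.
public class OrchestratorCacheDemo {

    static class Orchestrator {
        final String name;
        Orchestrator(String name) { this.name = name; }
    }

    static final Map<String, Orchestrator> CACHE = new ConcurrentHashMap<>();
    static final AtomicInteger LOADS = new AtomicInteger();

    // Loads the orchestrator (and its associated AppConn) once;
    // subsequent calls return the cached instance.
    static Orchestrator getOrCreateOrchestrator(String name) {
        return CACHE.computeIfAbsent(name, n -> {
            LOADS.incrementAndGet(); // the expensive load happens only on first call
            return new Orchestrator(n);
        });
    }

    public static void main(String[] args) {
        Orchestrator a = getOrCreateOrchestrator("workflow");
        Orchestrator b = getOrCreateOrchestrator("workflow");
        System.out.println(a == b);      // true: same cached instance
        System.out.println(LOADS.get()); // 1: loaded only once
    }
}
```

`ConcurrentHashMap.computeIfAbsent` also makes the first load atomic under concurrent callers, which matches the "avoid repeated loading" goal.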
**dss-framework-orchestrator-server**

The Orchestrator framework service. It provides the front-end interfaces for adding, deleting, modifying, querying, and rolling back orchestrations, as well as RPC services such as orchestration import and export.

**dss-framework-orchestrator-publish**

Provides publishing-related plugins, such as the orchestration import and export implementation classes, and the classes that generate and parse orchestration archive packages.

| Core interface/class | Core function |
|----------------------|---------------|
| ExportDSSOrchestratorPlugin | Defines the orchestration export interface |
| ImportDSSOrchestratorPlugin | Defines the orchestration import interface |
| MetaWriter | Writes the orchestration's table field information to the metadata file in a specific format |
| MetaReader | Parses the orchestration metadata file to generate the table field content |
#### Orchestration creation sequence diagram (delete and edit operations are similar):

![](images/Create_an_orchestration_sequence_diagram.png)

#### Orchestration import sequence diagram (export operations are similar):

![](images/Import_Orchestration_Sequence_Diagram.png)
### 4. Data Structure / Storage Design

Orchestrator information table:

```sql
CREATE TABLE `dss_orchestrator_info` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `name` varchar(255) NOT NULL COMMENT 'orchestrator name',
  `type` varchar(255) NOT NULL COMMENT 'orchestrator type, e.g. workflow',
  `desc` varchar(1024) DEFAULT NULL COMMENT 'description',
  `creator` varchar(100) NOT NULL COMMENT 'creator',
  `create_time` datetime DEFAULT NULL COMMENT 'create time',
  `project_id` bigint(20) DEFAULT NULL COMMENT 'project id',
  `uses` varchar(500) DEFAULT NULL COMMENT 'uses',
  `appconn_name` varchar(1024) NOT NULL COMMENT 'AppConn associated with the orchestration, e.g. workflow',
  `uuid` varchar(180) NOT NULL COMMENT 'uuid',
  `secondary_type` varchar(500) DEFAULT NULL COMMENT 'Second type of the orchestration, e.g. workflow-DAG',
  `is_published` tinyint(1) NOT NULL DEFAULT '0' COMMENT 'Whether it is published',
  `workspace_id` int(11) DEFAULT NULL COMMENT 'workspace id',
  `orchestrator_mode` varchar(100) DEFAULT NULL COMMENT 'orchestrator mode; the value is dic_key (parent_key=p_arrangement_mode) in dss_dictionary',
  `orchestrator_way` varchar(256) DEFAULT NULL COMMENT 'orchestrator way',
  `orchestrator_level` varchar(32) DEFAULT NULL COMMENT 'orchestrator level',
  `update_user` varchar(100) DEFAULT NULL COMMENT 'update user',
  `update_time` datetime DEFAULT CURRENT_TIMESTAMP COMMENT 'update time',
  PRIMARY KEY (`id`) USING BTREE,
  UNIQUE KEY `unique_idx_uuid` (`uuid`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPACT;
```
Orchestrator version information table:

```sql
CREATE TABLE `dss_orchestrator_version_info` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT,
  `orchestrator_id` bigint(20) NOT NULL COMMENT 'Associated orchestration id',
  `app_id` bigint(20) DEFAULT NULL COMMENT 'Id of the orchestration implementation, such as flowId',
  `source` varchar(255) DEFAULT NULL COMMENT 'source',
  `version` varchar(255) DEFAULT NULL COMMENT 'version',
  `comment` varchar(255) DEFAULT NULL COMMENT 'description',
  `update_time` datetime DEFAULT NULL COMMENT 'update time',
  `updater` varchar(32) DEFAULT NULL COMMENT 'updater',
  `project_id` bigint(20) DEFAULT NULL COMMENT 'project id',
  `content` varchar(255) DEFAULT NULL COMMENT '',
  `context_id` varchar(200) DEFAULT NULL COMMENT 'context id',
  `valid_flag` int(1) DEFAULT '1' COMMENT 'Version valid flag, 0: invalid; 1: valid',
  PRIMARY KEY (`id`) USING BTREE
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPACT;
```
Association table between DSS orchestrations and the scheduling system:

```sql
CREATE TABLE `dss_orchestrator_ref_orchestration_relation` (
  `id` bigint(20) NOT NULL AUTO_INCREMENT COMMENT 'Primary key id',
  `orchestrator_id` bigint(20) NOT NULL COMMENT 'Orchestration mode id in DSS',
  `ref_project_id` bigint(20) DEFAULT NULL COMMENT 'Project id associated with the scheduling system',
  `ref_orchestration_id` int(11) DEFAULT NULL COMMENT 'Id of the scheduling system workflow (the orchestrationId returned by the OrchestrationOperation service of SchedulerAppConn)',
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8mb4 ROW_FORMAT=COMPACT;
```
### 5. Interface Design

### 6. Non-Functional Design

#### 6.1 Security

Authentication relies on a dedicated ID in the cookie; the Gateway identifies it using a dedicated decryption algorithm.

#### 6.2 Performance

Meets performance requirements.

#### 6.3 Capacity

Not applicable.

#### 6.4 High Availability

Supports multi-active deployment.
