DataSphere Studio (DSS for short) is a one-stop data application development and management portal, self-developed by WeDataSphere, a big data platform of WeBank.
+ DataSphere Studio (DSS for short) is a one-stop data application development and management portal, self-developed by WeDataSphere, a big data platform of WeBank.

- Based on the [Linkis](https://github.com/WeBankFinTech/Linkis) computation middleware, DSS can easily integrate upper-level data application systems, making data application development simple and easy to use.
+ DataSphere Studio is positioned as a data application development framework that forms a closed loop over the entire data application development process. With a unified UI, its workflow-style graphical drag-and-drop experience covers the full lifecycle of data application development: data import, desensitization and cleaning, data analysis, data mining, quality inspection, visualization, scheduling, and data output.

- DataSphere Studio is positioned as a data application development portal that forms a closed loop over the entire data application development process. With a unified UI, its workflow-style graphical drag-and-drop experience covers the full lifecycle of data application development: data import, desensitization and cleaning, data analysis, data mining, quality inspection, visualization, scheduling, and data output.
+ With a pluggable framework architecture, DSS is designed to allow users to quickly integrate new data application tools, or replace the tools that DSS has already integrated.

- Thanks to the connection, reuse, and simplification capabilities of Linkis, DSS natively offers financial-grade high concurrency, high availability, multi-tenant isolation, and resource management.
+ ## Integrated data application components

- ## UI preview
+ DSS has integrated a variety of upper-layer data application systems by implementing multiple AppConns, which together cover most users' data development needs.

- Please be patient; the GIF takes some time to load.
+ **If desired, new data application systems can also be easily integrated to replace or enrich DSS's data application development process.** [Click here to learn how to quickly integrate new application systems](en_US/Development_Documentation/Third-party_System_Access_Development_Guide.md)
|[**DataApiService**](en_US/Using_Document/DataApiService_Usage_Documentation)| Data API service. A SQL script can be quickly published as a RESTful interface, providing REST access to external callers. | Not supported | >=1.0.0 | Released |
+ |[**Scriptis**](https://github.com/WeBankFinTech/Scriptis)| A web tool for data analysis that supports writing scripts online (SQL, PySpark, HiveQL, etc.) and submitting them to [Linkis](https://github.com/WeBankFinTech/Linkis) for execution. | >=0.5.0 | >=1.0.0 | Released |
+ |[**Schedulis**](https://github.com/WeBankFinTech/Schedulis)| A workflow task scheduling system developed on top of Azkaban, with financial-grade features such as high performance, high availability, and multi-tenant resource isolation. | >=0.5.0 | >=1.0.0 | Released |
+ |**EventCheck**| Provides cross-business, cross-project, and cross-workflow signaling capabilities. | >=0.5.0 | >=1.0.0 | Released |
+ |**SendEmail**| Provides data delivery: the result sets of all other workflow nodes can be sent by email. | >=0.5.0 | >=1.0.0 | Released |
+ |[**Qualitis**](https://github.com/WeBankFinTech/Qualitis)| A data quality verification tool that provides checks such as data integrity and correctness. | >=0.5.0 | 1.0.1 (version currently in preparation) |**Expected end of January**|
+ |[**Streamis**](https://github.com/WeBankFinTech/Streamis)| A streaming application development and management tool. It supports publishing Flink Jar and Flink SQL jobs, and provides development, debugging, and production management capabilities for streaming applications, such as start/stop, status monitoring, and checkpointing. | Not supported | 1.0.1 (version currently in preparation) |**Expected end of January**|
+ |[**Exchangis**](https://github.com/WeBankFinTech/Exchangis)| A data exchange platform that supports data transmission between structured and unstructured heterogeneous data sources. The upcoming Exchangis 1.0 will be connected with DSS workflows. | Not supported | Planned in 1.0.2 |**In Development**|
+ |[**Visualis**](https://github.com/WeBankFinTech/Visualis)| A data visualization BI tool developed on top of Davinci, an open source project of CreditEase. It provides users with financial-grade data visualization capabilities with respect to data security. | >=0.5.0 | Planned in 1.0.2 |**In Development**|
+ |[**Prophecis**](https://github.com/WeBankFinTech/Prophecis)| A one-stop machine learning platform that integrates multiple open source machine learning frameworks. Prophecis' MLFlow can be connected to DSS workflow through AppConn. | Not supported | Planned in 1.0.2 |**In Development**|
+ |**UserManager**| Automatically initializes all user environments necessary for a new DSS user, including creating Linux users, various user paths, directory authorization, etc. | >=0.9.1 | Planned in 1.0.2 |**In Development**|
+ |**DolphinScheduler**| Apache DolphinScheduler is a distributed and scalable visual workflow task scheduling platform. DSS workflows can be published to DolphinScheduler with one click. | Not supported | Planned in 1.1.0 |**In Development**|
+ |**UserGuide**| Mainly provides help documentation, a beginner's guide, dark mode skinning, etc. | Not supported | Planned in 1.1.0 |**In Development**|
+ |**DataModelCenter**| Mainly provides data warehouse planning, data model development, and data asset management. Data warehouse planning includes subject domains, data warehouse layers, modifiers, etc.; data model development includes indicators, dimensions, metrics, wizard-based table building, etc.; data assets are connected to Apache Atlas to provide data lineage capabilities. | Not supported | Planned in 1.2.0 |**In Development**|
+ |**Airflow**| Supports publishing DSS workflows to Airflow for scheduling. | >=0.9.1, not yet merged | Not supported |**No plans yet**|

- ## Core features

- ### 1. One-stop, full-process application development management UI
+ ## Download

- DSS is highly integrated. Currently integrated systems include:
-
- a. [Scriptis](https://github.com/WeBankFinTech/Scriptis) - Data Development IDE Tool.
-
- b. [Visualis](https://github.com/WeBankFinTech/Visualis) - Data Visualization Tool (based on the open source project [Davinci](https://github.com/edp963/davinci) contributed by CreditEase)
-
- c. [Qualitis](https://github.com/WeBankFinTech/Qualitis) - Data Quality Management Tool
Please go to the [DSS Releases Page](https://github.com/WeBankFinTech/DataSphereStudio/releases) to download a compiled version or a source code package of DSS.

- ### 2. AppConn, based on Linkis, defines a unique design concept
+ ## Compile and deploy

- AppConn (application joint) defines unified front-end and back-end
- integration specifications, so external data application systems can be integrated quickly and easily,
- becoming part of DSS data application development.
+ Please follow the [Compile Guide](en_US/Development_Documentation/Compilation_Documentation.md) to compile DSS from source code.

- DSS arranges multiple AppConns in series to form a workflow that supports real-time execution and scheduled execution. Users can develop an entire data application with simple drag-and-drop operations.
+ Please refer to the [Deployment Documents](en_US/Installation_and_Deployment/DSS_Single-Server_Deployment_Documentation.md) to deploy DSS.

- Because AppConn is integrated with Linkis, external data application systems share its resource management, concurrency limiting, and high-performance capabilities. AppConn also allows context to be shared across systems, eliminating application silos.
+ ## Demo Trial environment

- ### 3. Workspace, as the management unit
+ DataSphere Studio's support for script execution carries high security risks, and the isolation of the WeDataSphere Demo environment has not been completed. Because many users are asking about the Demo environment, we decided to first issue invitation codes to the community and accept trial applications from enterprises and organizations.

- With Workspace as the management unit, DSS organizes and manages the business applications of each data application system, and defines a set of common standards for collaborative development of Workspaces across data application systems.
+ If you want to try out the Demo environment, please join the DataSphere Studio community user group (**see the end of this document**), and contact **WeDataSphere Group Robot** to get an invitation code.

- ### 4. Integrated data application components
+ DataSphereStudio Demo environment user registration page: [click me to enter](https://dss-open.wedatasphere.com/#/register)
DataSphereStudio Demo environment login page: [click me to enter](https://dss-open.wedatasphere.com/#/login)

- Many data applications developed by users require periodic scheduling.
-
- At present, the open source scheduling systems in the community are difficult to integrate with other data application systems.
-
- DSS implements Schedulis AppConn, which allows users to publish DSS workflows to Schedulis for regular scheduling.
-
- DSS also defines standard and generic workflow parsing and publishing specifications for scheduling systems, allowing other scheduling systems to easily achieve low-cost integration with DSS.
b. Scriptis AppConn - Data Development IDE Tool
-
- What is [Scriptis](https://github.com/WeBankFinTech/Scriptis)?
-
- Scriptis is an interactive data analysis tool offering script development (SQL, PySpark, HiveQL), task submission (Spark, Hive), UDF, function, and resource management, and intelligent diagnosis.
-
- Scriptis AppConn integrates the data development capabilities of Scriptis into DSS, and allows the various script types of Scriptis to serve as nodes in a DSS workflow and take part in the application development process.
-
- It currently supports HiveSQL, SparkSQL, PySpark, Scala, and other script node types.
c. Visualis AppConn - Data Visualization Tool
-
- What is [Visualis](https://github.com/WeBankFinTech/Visualis)?
-
- Visualis is a BI tool for data visualization. Based on the open source project Davinci contributed by CreditEase, it provides financial-grade data visualization capabilities built on data security and permissions.
-
- Visualis AppConn integrates data visualization capabilities into DSS, and allows displays and dashboards, as nodes of DSS workflows, to be associated with upstream data market.
e. Data Sender - Sender AppConn
-
- Sender AppConn provides data delivery capability for DSS. Currently it supports the SendEmail node type, and the result sets of all other nodes can be sent via email.
-
- For example, the SendEmail node can directly send the screenshot of a display as an email.
-
- f. Signal AppConn - Signal Nodes
+ ## Documents

- Signal AppConn is used to strengthen the correlation between business and process while keeping them decoupled.
-
- DataChecker Node: Checks whether a table or partition exists.
-
- EventSender Node: Sends messages across workflows and projects.
-
- EventReceiver Node: Receives messages across workflows and projects.
-
- g. Function node
-
- Empty nodes, sub workflow nodes.
+ For a complete list of documents for DSS1.0, see [DSS-Doc](en_US)

- ## Compared with similar systems
+ The following is the installation guide for DSS-related AppConn plugins:

- DSS is an open source project leading the direction of data application development and management.
- The open source community currently does not have similar products.
1. Scenarios in which big data platform capability is being prepared or initialized but no data application tools are available.
2. Scenarios in which users already have big data foundation platform capabilities but with only a few data application tools.
+ ## Who is using DSS

- 3. Scenarios in which users have big data foundation platform capabilities and comprehensive data application tools, but suffer from strong isolation and high learning costs because those tools have not been integrated.
+ We opened an issue for users to give feedback and record who is using DSS.

- 4. Scenarios in which users have big data foundation platform capabilities and comprehensive data application tools, some of which have been integrated, but lack unified and standardized specifications.
+ Since its first release in 2019, DSS has accumulated more than 700 trial companies and 1000+ sandbox trial users across diverse industries, including finance, banking, telecommunications, manufacturing, and internet companies.


- ## Quick start
+ ## Document statement

- Click to [Quick start](https://github.com/WeBankFinTech/DataSphereStudio/blob/master/docs/en_US/ch2/DSS%20Quick%20Installation%20Guide.md)
+ DataSphere Studio uses GitBook to manage its documentation, and the entire project will be organized into a GitBook e-book for everyone to download and use.

- ## Architecture
+ WeDataSphere will provide a unified entry point for reading documentation in the future. For GitBook usage, please refer to the [GitBook Documentation](http://caibaojian.com/gitbook/).
Contributions are always welcome. We need more contributors to build DSS together, whether through code, documentation, or other support that helps the community.
[Quick integration with DSS for external systems](https://github.com/WeBankFinTech/DataSphereStudio/blob/master/docs/en_US/ch4/The%20Guide%20for%20Third-party%20Systems%20accessing%20DSS.md)
+ For any questions or suggestions, please kindly submit an issue.

- ## Communication
+ You can scan the QR code below to join our WeChat and QQ groups for a more immediate response.
DSS is under the Apache 2.0 license. See the [License](https://github.com/WeBankFinTech/DataSphereStudio/blob/master/LICENSE) file for details.
+ DSS is under the Apache 2.0 license. See the [License](https://github.com/WeBankFinTech/DataSphereStudio/blob/master/LICENSE) file for details.