|
1 | 1 | # Features |
2 | 2 |
|
3 | | -## Introduction |
| 3 | +**DUUI-Gateway** includes a range of features that facilitate the effective and easy use of DUUI in various contexts and application areas. |
4 | 4 |
|
5 | | -DUUI Gateway includes a range of features which facilitate its effective and easy use in various contexts and application areas. |
| 5 | +## User management |
6 | 6 |
|
7 | | -### Cluster Management |
| 7 | +**DUUI-Gateway** has a straightforward user management system that distinguishes between the roles **user** and **admin**. In addition, groups can be created and users can be assigned to them. |
8 | 8 |
|
| 9 | +* Role **user**: Users can use all functions of DUUI-Gateway to construct pipelines, create connectors and execute processes. The available resources in the cluster, as well as all other system parameters, are configured by the **admins**. |
| 10 | +* Role **admin**: Administrators can additionally configure global settings, manage groups and assign users to groups. |
9 | 11 |
|
| 12 | +## Web + REST interface |
10 | 13 |
|
11 | | -### User Management |
| 14 | + |
12 | 15 |
|
| 16 | +The web interface and the REST API are the core components of DUUI-Gateway. |
| 17 | +The two are interlinked: the web interface provides general access to DUUI-Gateway, and the same functionality can be used via the API once a user account and a session have been created. |
| 18 | + |
| 19 | +<figure> |
| 20 | + <img src="images/REST.png" alt="Rest" style="width:100%"> |
| 21 | + <figcaption>Extract from the REST API.</figcaption> |
| 22 | +</figure> |
| 23 | + |
| 24 | + |
| 25 | +Both interfaces allow pipelines to be created and managed, [DUUI components](https://github.com/texttechnologylab/duui-uima) to be added or modified, and processes to be started or monitored. |
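As a rough illustration of API access, the following minimal Python sketch builds authenticated requests without sending them. The base URL, the API key, the `/pipelines` and `/processes` paths, the `pipeline_id` field and the use of an `Authorization` header are all assumptions made for this example; consult the REST API documentation of your own deployment for the actual endpoints and authentication scheme.

```python
import json
import urllib.request

# Hypothetical values: replace with the URL and API key of your own deployment.
BASE_URL = "http://localhost:8080"
API_KEY = "my-api-key"


def build_request(path, payload=None, method="GET"):
    """Build (but do not send) an authenticated JSON request."""
    data = json.dumps(payload).encode("utf-8") if payload is not None else None
    request = urllib.request.Request(BASE_URL + path, data=data, method=method)
    # Assumption: the API key is passed in the Authorization header.
    request.add_header("Authorization", API_KEY)
    if data is not None:
        request.add_header("Content-Type", "application/json")
    return request


# Hypothetical endpoint for listing pipelines.
list_pipelines = build_request("/pipelines")

# Hypothetical endpoint and payload for starting a process.
start_process = build_request(
    "/processes",
    payload={"pipeline_id": "<pipeline id>"},
    method="POST",
)
```

Passing one of these objects to `urllib.request.urlopen` would perform the actual call; that step is left out so the sketch runs without a server.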
| 26 | + |
| 27 | + |
| 28 | + |
| 29 | + |
| 30 | +### Client libraries |
| 31 | + |
| 32 | + |
| 33 | + |
| 34 | +## Dynamic pipeline construction |
| 35 | + |
| 36 | +<figure> |
| 37 | + <img src="images/Pipeline.png" alt="Pipeline" style="width:100%"> |
| 38 | + <figcaption>In order to process texts, various pipelines can be created and assembled using DUUI components. |
| 39 | +</figcaption> |
| 40 | +</figure> |
| 41 | + |
| 42 | + |
| 43 | +<figure> |
| 44 | + <img src="images/Pipelines.png" alt="Pipelines" style="width:100%"> |
| 45 | + <figcaption>Pipelines can also be saved as templates for future use. |
| 46 | +</figcaption> |
| 47 | +</figure> |
| 48 | + |
| 49 | + |
| 50 | +<figure> |
| 51 | + <img src="images/Create_Process.png" alt="Pipelines" style="width:100%"> |
| 52 | + <figcaption>Once a pipeline has been created, it can be executed as a process, where the source and destination of the files to be processed are selected from a set of existing connectors. |
| 53 | +</figcaption> |
| 54 | +</figure> |
| 55 | + |
| 56 | + |
| 57 | +<figure> |
| 58 | + <img src="images/nextcloud_gerparcor.png" alt="nextcloud" style="width:100%"> |
| 59 | + <figcaption> |
| 60 | + Here, a folder is selected in the browser from a Nextcloud instance added by the user, and further parameters for the selection are set. |
| 61 | + </figcaption> |
| 62 | +</figure> |
| 63 | + |
| 64 | + |
| 65 | +## Result and monitoring |
| 66 | + |
| 67 | +After or during the execution of a pipeline, the process progress and its status can be visualized and queried. Processed documents can be selected and examined. |
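Querying the status of a running process from a script usually comes down to polling. The sketch below shows one way to do this in Python. The status names in `FINISHED_STATES` are assumptions, and `fetch_status` is a hypothetical callable that you would implement against the API of your deployment.

```python
import time

# Hypothetical status names; the values your deployment reports may differ.
FINISHED_STATES = {"Completed", "Failed", "Cancelled"}


def wait_for_process(fetch_status, interval_seconds=5.0, max_polls=100):
    """Poll fetch_status() until it reports a finished state.

    Returns the last status seen, which may be unfinished if
    max_polls is reached first.
    """
    status = None
    for _ in range(max_polls):
        status = fetch_status()
        if status in FINISHED_STATES:
            return status
        time.sleep(interval_seconds)
    return status


# Usage with a stand-in for a real status request.
fake_statuses = iter(["Active", "Active", "Completed"])
final_status = wait_for_process(lambda: next(fake_statuses), interval_seconds=0)
```

Injecting `fetch_status` keeps the polling loop independent of any particular HTTP client and makes it easy to test without a running server.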
| 68 | + |
| 69 | +<figure> |
| 70 | + <img src="images/Result.png" alt="Result" style="width:100%"> |
| 71 | + <figcaption>The progress of the individual processed documents is displayed and the results are also visualized by selecting a document.</figcaption> |
| 72 | +</figure> |
| 73 | + |
| 74 | +<figure> |
| 75 | + <img src="images/document_view.png" alt="Result" style="width:100%"> |
| 76 | + <figcaption>The results of the annotation are visualized at document level with highlighting based on the selected annotation class.</figcaption> |
| 77 | +</figure> |
| 78 | + |
| 79 | +<figure> |
| 80 | + <img src="images/ResultStatistic.png" alt="Result" style="width:100%"> |
| 81 | + <figcaption>At the same time, statistical information on all annotations in the respective document is also visualized graphically. </figcaption> |
| 82 | +</figure> |
| 83 | + |
| 84 | +### Notification |
| 85 | + |
| 86 | +Because processes are tied to individual users, they can be monitored live, and DUUI-Gateway also informs the owner of a process about the result of the processing by e-mail. |
| 87 | + |
| 88 | +<figure> |
| 89 | + <img src="images/Notification.png" alt="Notification" style="width:100%"> |
| 90 | + <figcaption>A result e-mail sent after processing a pipeline defined in DUUI-Gateway.</figcaption> |
| 91 | +</figure> |
| 92 | + |
| 93 | + |
| 94 | +## Connectors |
| 95 | +DUUI-Gateway can connect to the cloud-based systems listed below. Each can be configured and connected individually by the user in order to read in corpora for processing or to serialize them again afterwards. |
| 96 | + |
| 97 | +* Google Drive |
| 98 | +* Nextcloud |
| 99 | +* Dropbox |
| 100 | +* Amazon Simple Storage Service (Amazon S3) |
| 101 | + * _minio_ for personal use |
| 102 | + |
| 103 | +<figure> |
| 104 | + <img src="images/Nextcloud_Signup.png" alt="Result" style="width:100%"> |
| 105 | + <figcaption>Exemplary connection to a Nextcloud instance</figcaption> |
| 106 | +</figure> |
13 | 107 |
|
14 | | -### API |
15 | 108 |
|
16 | 109 | Besides the web interface, DUUI-Gateway also includes an API that allows usage based on user authentication. |
17 | 110 |
|
| 111 | +___ |
| 112 | + |
| 113 | +All of these features can be used by anyone. DUUI-Gateway is freely available and can be easily instantiated via Docker. Instructions can be found under [Setup](setup.md). |
| 114 | + |
| 115 | +If you use DUUI-Gateway, please cite it as specified under [citation](publications.md). |
| 116 | + |
| 117 | + |
| 118 | +[//]: # (#### Python-Example) |
| 119 | + |
| 120 | +[//]: # () |
| 121 | +[//]: # () |
| 122 | +[//]: # (### Connectors) |
| 123 | + |
| 124 | +[//]: # () |
| 125 | +[//]: # () |
| 126 | +[//]: # (#### Dropbox) |
| 127 | + |
| 128 | +[//]: # () |
| 129 | +[//]: # () |
| 130 | +[//]: # (#### Nextcloud) |
| 131 | + |
| 132 | +[//]: # () |
| 133 | +[//]: # () |
| 134 | +[//]: # (#### GoogleDrive) |
| 135 | + |
| 136 | +[//]: # () |
| 137 | +[//]: # () |
| 138 | +[//]: # () |
| 139 | +[//]: # (## Pipeline) |
| 140 | + |
| 141 | +[//]: # () |
| 142 | +[//]: # (A pipeline is a collection of components or Analysis Engines that can be executed. During an analysis process, the components in the pipeline are executed one after) |
| 143 | + |
| 144 | +[//]: # (another annotating documents. Pipelines do not interact with the input data directly but build the structure for an NLP workflow.) |
| 145 | + |
| 146 | +[//]: # () |
| 147 | +[//]: # (Creating a pipeline with this web-interface can be done in the Builder. It is a three-step form that guides you through building a pipeline either from scratch or) |
| 148 | + |
| 149 | +[//]: # (using a template as the starting point.) |
| 150 | + |
| 151 | +[//]: # () |
| 152 | +[//]: # (>Choosing a template as a starting point copies all predefined settings into a fresh) |
| 153 | + |
| 154 | +[//]: # (pipeline.) |
| 155 | + |
| 156 | +[//]: # () |
| 157 | +[//]: # (In the second step pipeline specific properties like name, description, tags and settings can be edited.) |
| 158 | + |
| 159 | +[//]: # (Only a name is required to proceed but adding a short description is recommended to serve as documentation) |
| 160 | + |
| 161 | +[//]: # (and help others when sharing a pipeline. Tags can help document and find pipelines) |
| 162 | + |
| 163 | +[//]: # (in the Dashboard.) |
| 164 | + |
| 165 | +[//]: # () |
| 166 | +[//]: # (## Component) |
| 167 | + |
| 168 | +[//]: # () |
| 169 | +[//]: # (Components are the part of DUUI that actually do the processing and therefore offer) |
| 170 | + |
| 171 | +[//]: # (the most settings. When creating a pipeline you can choose from a set of predefined) |
| 172 | + |
| 173 | +[//]: # (components or create your own. Once added to the pipeline, a component can be edited) |
| 174 | + |
| 175 | +[//]: # (by clicking the <img src="./images/fa-edit.svg" width="14"> icon. This will open a drawer on) |
| 176 | + |
| 177 | +[//]: # (the right, that allows for modification of a component.) |
| 178 | + |
| 179 | +[//]: # () |
| 180 | +[//]: # (Settings include:) |
| 181 | + |
| 182 | +[//]: # () |
| 183 | +[//]: # (**Name**) |
| 184 | + |
| 185 | +[//]: # () |
| 186 | +[//]: # (**Driver** — The Driver is responsible for the instantiation) |
| 187 | + |
| 188 | +[//]: # (of a component during a process.) |
18 | 189 |
|
19 | | -#### Python-Example |
| 190 | +[//]: # () |
| 191 | +[//]: # (**Target** — The component's target depends on the selected) |
20 | 192 |
|
| 193 | +[//]: # (driver. For Docker, Kubernetes and Swarm Drivers, the target is the full image name.) |
21 | 194 |
|
22 | | -### Connectors |
| 195 | +[//]: # (For UIMA it is the class path to the Annotator represented by this component and for) |
23 | 196 |
|
| 197 | +[//]: # (a Remote Driver the URL has to be specified.) |
24 | 198 |
|
25 | | -#### Dropbox |
| 199 | +[//]: # () |
| 200 | +[//]: # (**Tags**) |
26 | 201 |
|
| 202 | +[//]: # () |
| 203 | +[//]: # (**Description**) |
27 | 204 |
|
28 | | -#### Nextcloud |
| 205 | +[//]: # () |
| 206 | +[//]: # (**Options**) |
29 | 207 |
|
| 208 | +[//]: # () |
| 209 | +[//]: # (**Parameters**) |
30 | 210 |
|
31 | | -#### GoogleDrive |
| 211 | +[//]: # () |
| 212 | +[//]: # (Options are specific to the selected driver. Most of the time the default options) |
32 | 213 |
|
| 214 | +[//]: # (are sufficient and modifications are only for special use cases. Parameters are) |
33 | 215 |
|
| 216 | +[//]: # (useful if the component requires settings that are not controlled by DUUI.) |
34 | 217 |
|
35 | | -## Pipeline |
| 218 | +[//]: # () |
| 219 | +[//]: # (>When editing a specific pipeline, clicking the <img src="./images/fa-clone.svg" width="14"> icon) |
36 | 220 |
|
37 | | -A pipeline is a collection of components or Analysis Engines that can be executed. During an analysis process, the components in the pipeline are executed one after |
38 | | -another annotating documents. Pipelines do not interact with the input data directly but build the structure for an NLP workflow. |
| 221 | +[//]: # (clones the component's settings and prefills the creation form.) |
39 | 222 |
|
40 | | -Creating a pipeline with this web-interface can be done in the Builder. It is a three-step form that guides you through building a pipeline either from scratch or |
41 | | -using a template as the starting point. |
| 223 | +[//]: # () |
| 224 | +[//]: # (## Process) |
42 | 225 |
|
43 | | ->Choosing a template as a starting point copies all predefined settings into a fresh |
44 | | -pipeline. |
| 226 | +[//]: # () |
| 227 | +[//]: # (A process manages the flow of data and pipeline execution. Starting a process is) |
45 | 228 |
|
46 | | -In the second step pipeline specific properties like name, description, tags and settings can be edited. |
47 | | -Only a name is required to proceed but adding a short description is recommended to serve as documentation |
48 | | -and help others when sharing a pipeline. Tags can help document and find pipelines |
49 | | -in the Dashboard. |
| 229 | +[//]: # (possible on a pipeline page. On the process creation screen you are asked to select) |
50 | 230 |
|
51 | | -## Component |
| 231 | +[//]: # (an input, output and optionally settings that influence the process behavior.) |
52 | 232 |
|
53 | | -Components are the part of DUUI that actually do the processing and therefore offer |
54 | | -the most settings. When creating a pipeline you can choose from a set of predefined |
55 | | -components or create your own. Once added to the pipeline, a component can be edited |
56 | | -by clicking the <img src="./images/fa-edit.svg" width="14"> icon. This will open a drawer on |
57 | | -the right, that allows for modification of a component. |
| 233 | +[//]: # () |
| 234 | +[//]: # (### Input and Output) |
58 | 235 |
|
59 | | -Settings include: |
| 236 | +[//]: # () |
| 237 | +[//]: # (Any process must be provided with an input source to be started. Each requires) |
60 | 238 |
|
61 | | -**Name** |
| 239 | +[//]: # (different properties to be set. The available input sources are:) |
62 | 240 |
|
63 | | -**Driver** — The Driver is responsible for the instantiation |
64 | | -of a component during a process. |
| 241 | +[//]: # () |
| 242 | +[//]: # (#### Text) |
65 | 243 |
|
66 | | -**Target** — The component's target depends on the selected |
67 | | -driver. For Docker, Kubernetes and Swarm Drivers, the target is the full image name. |
68 | | -For UIMA it is the class path to the Annotator represented by this component and for |
69 | | -a Remote Driver the URL has to be specified. |
| 244 | +[//]: # () |
| 245 | +[//]: # (For simple and quick analysis you can choose to process plain text. The text) |
70 | 246 |
|
71 | | -**Tags** |
| 247 | +[//]: # (to be analyzed can be entered in a text area.) |
72 | 248 |
|
73 | | -**Description** |
| 249 | +[//]: # () |
| 250 | +[//]: # (#### File) |
74 | 251 |
|
75 | | -**Options** |
| 252 | +[//]: # () |
| 253 | +[//]: # (Selecting file as the input source allows for the upload of one or multiple) |
76 | 254 |
|
77 | | -**Parameters** |
| 255 | +[//]: # (files.) |
78 | 256 |
|
79 | | -Options are specific to the selected driver. Most of the time the default options |
80 | | -are sufficient and modifications are only for special uses cases. Parameters are |
81 | | -useful if the component requires settings that are not controlled by DUUI. |
| 257 | +[//]: # () |
| 258 | +[//]: # (#### Cloud) |
82 | 259 |
|
83 | | ->When editing a specific pipeline, clicking the <img src="./images/fa-clone.svg" width="14"> icon |
84 | | -clones the component's settings and prefills the creation form. |
| 260 | +[//]: # () |
| 261 | +[//]: # (There are currently four cloud storage providers available to use: Dropbox and) |
85 | 262 |
|
86 | | -## Process |
| 263 | +[//]: # (Min.io (s3), Google Drive, and NextCloud. More will be added in the future. To use your cloud storage) |
87 | 264 |
|
88 | | -A process manages the flow of data and pipeline execution. Starting a process is |
89 | | -possible on a pipeline page. On the process creation screen you are asked to select |
90 | | -an input, output and optionally settings that influence the process behavior. |
| 265 | +[//]: # (provider of choice, a connection must be established on your Account page.) |
91 | 266 |
|
92 | | -### Input and Output |
| 267 | +[//]: # () |
| 268 | +[//]: # (>With the exception of text, all input sources require a file extension to be) |
93 | 269 |
|
94 | | -Any process must be provided with an input source to be started. Each requires |
95 | | -different properties to be set. The available input sources are: |
| 270 | +[//]: # (selected.) |
96 | 271 |
|
97 | | -#### Text |
| 272 | +[//]: # () |
| 273 | +[//]: # (### Settings) |
98 | 274 |
|
99 | | -For simple and quick analysis you can choose to process plain text. The text |
100 | | -to be analyzed can be entered in a text area. |
| 275 | +[//]: # () |
| 276 | +[//]: # (Settings can be changed for both the input and output. Their main purpose is to) |
101 | 277 |
|
102 | | -#### File |
| 278 | +[//]: # (filter the files that are processed. This can be done by setting a minimum file) |
103 | 279 |
|
104 | | -Selecting file as the input source allows for the upload of one or multiple |
105 | | -files. |
| 280 | +[//]: # (size or ignoring files that may be at the output location.) |
106 | 281 |
|
107 | | -#### Cloud |
| 282 | +[//]: # () |
| 283 | +[//]: # (Process related settings include the option to use multiple workers for parallel) |
108 | 284 |
|
109 | | -There are currently four cloud storage providers available to use: Dropbox and |
110 | | -Min.io (s3), Google Drive, and NextCloud. More will be added in the future. To use your cloud storage |
111 | | -provider of choice, a connection must be established on your Account page. |
| 285 | +[//]: # (processing or ignoring errors that occur by skipping to the next document instead of) |
112 | 286 |
|
113 | | ->With the exception of text, all input sources require a file extension to be |
114 | | -selected. |
| 287 | +[//]: # (failing the entire pipeline.) |
115 | 288 |
|
116 | | -### Settings |
| 289 | +[//]: # () |
| 290 | +[//]: # (Note that the amount of workers or threads that can be used is limited by the) |
117 | 291 |
|
118 | | -Settings can be changed for both the input and output. Their main purpose is to |
119 | | -filter the files that are processed. This can be done by setting a minimum file |
120 | | -size or ignoring files that may be at the output location. |
| 292 | +[//]: # (system!) |
121 | 293 |
|
122 | | -Process related settings include the option to use multiple workers for parallel |
123 | | -processing or ignoring errors that occur by skipping to next docment instead of |
124 | | -failing the entire pipeline. |
125 | 294 |
|
126 | | -Note that the amount of workers or threads that can be used is limited by the |
127 | | -system! |
|