Science labs organize their projects as a sequence of activities of experiment design,
data acquisition, and processing and analysis.

- <figure markdown>
- {: style="width:520px; align:center"}
- <figcaption>Workflow and dataflow in a common findings-centered approach to data science in a science lab.</figcaption>
- </figure>
+ ![data science in a science lab](../images/data-science-before.png){: style="width:510px; display: block; margin: 0 auto;"}
+
+ <figcaption style="text-align: center;">Workflow and dataflow in a common findings-centered approach to data science in a science lab.</figcaption>

Many labs lack a uniform data management strategy that would span longitudinally across
the entire project lifecycle as well as laterally across different projects.
@@ -29,10 +28,9 @@ This approach requires formulating a general data science plan and upfront inves
for setting up resources and processes and training the teams.
The team uses DataJoint to build data pipelines to support multiple projects.

- <figure markdown>
- {: style="width:510px; align:center"}
- <figcaption>Workflow and dataflow in a data pipeline-centered approach.</figcaption>
- </figure>
+ ![data science in a science lab](../images/data-science-after.png){: style="width:510px; display: block; margin: 0 auto;"}
+
+ <figcaption style="text-align: center;">Workflow and dataflow in a data pipeline-centered approach.</figcaption>

Data pipelines support project data across their entire lifecycle, including the
following functions
@@ -55,42 +53,41 @@ data integrity.
The adoption of a uniform data management framework allows separation of roles and
division of labor among team members, leading to greater efficiency and better scaling.

- <figure markdown>
- {: style="width:350px; align:center"}
- <figcaption>Distinct responsibilities of data science and data engineering.</figcaption>
- </figure>
+ ![data science in a science lab](../images/data-engineering.png){: style="width:510px; display: block; margin: 0 auto;"}
+
+ <figcaption style="text-align: center;">Distinct responsibilities of data science and data engineering.</figcaption>

- Scientists
+ ### Scientists

- design and conduct experiments, collecting data.
- They interact with the data pipeline through graphical user interfaces designed by
- others.
- They understand what analysis is used to test their hypotheses.
+ Design and conduct experiments, collecting data.
+ They interact with the data pipeline through graphical user interfaces designed by
+ others.
+ They understand what analysis is used to test their hypotheses.

- Data scientists
+ ### Data scientists

- have the domain expertise and select and implement the processing and analysis
- methods for experimental data.
- Data scientists are in charge of defining and managing the data pipeline using
- DataJoint's data model, but they may not know the details of the underlying
- architecture.
- They interact with the pipeline using client programming interfaces directly from
- languages such as MATLAB and Python.
+ Have the domain expertise and select and implement the processing and analysis
+ methods for experimental data.
+ Data scientists are in charge of defining and managing the data pipeline using
+ DataJoint's data model, but they may not know the details of the underlying
+ architecture.
+ They interact with the pipeline using client programming interfaces directly from
+ languages such as MATLAB and Python.

- The bulk of this manual is written for working data scientists, except for System
- Administration.
+ The bulk of this manual is written for working data scientists, except for System
+ Administration.

- Data engineers
+ ### Data engineers

- work with the data scientists to support the data pipeline.
- They rely on their understanding of the DataJoint data model to configure and
- administer the required IT resources such as database servers, data storage
- servers, networks, cloud instances, [Globus](https://globus.org) endpoints, etc.
- Data engineers can provide general solutions such as web hosting, data publishing,
- interfaces, exports and imports.
+ Work with the data scientists to support the data pipeline.
+ They rely on their understanding of the DataJoint data model to configure and
+ administer the required IT resources such as database servers, data storage
+ servers, networks, cloud instances, [Globus](https://globus.org) endpoints, etc.
+ Data engineers can provide general solutions such as web hosting, data publishing,
+ interfaces, exports and imports.

- The System Administration section of this tutorial contains materials helpful in
- accomplishing these tasks.
+ The System Administration section of this tutorial contains materials helpful in
+ accomplishing these tasks.

DataJoint is designed to delineate a clean boundary between **data science** and **data
engineering**.