Commit 6fea058

add OVERVIEW.md
1 parent fbdef21 commit 6fea058

1 file changed: +4 -8 lines changed

OVERVIEW.md

Lines changed: 4 additions & 8 deletions
@@ -1,24 +1,20 @@
 # DataJoint Overview
 
-DataJoint is a library for interacting with scientific databases integrating computational dependencies as part of the data model. It is an ideal tool for team projects working on shared data-centric computational workflows.
+DataJoint is a library for interacting with scientific databases that support computational dependencies as part of the data model.
+DataJoint serves as a principal framework for organizing data and computations in team projects.
 
-## Why use databases in scientific studes?
-
-Many scientists are reluctant to use databases due to their perceived unwieldiness, opting instead to use file repositories for managing their shared data. [Gray, 2005](https://arxiv.org/abs/cs/0502008)
-
-Yet databases provide several key advantages when it comes to sharing structured dynamic data:
+Databases provide several key advantages when it comes to sharing structured dynamic data:
 
 1. **Data structure:** databases communicate and enforce structure reflecting the logic of the scientific study.
 2. **Concurrent access:** databases support transactions to allow multiple agents to read and write the data concurrently.
 3. **Consistency and integrity:** database provide ways to ensure that data operations from multiple parties are combined correctly without loss, misidentification, or mismatches.
 4. **Queries:** Databases simplify and accelerate data queries -- functions on data to obtain precise slices of the data without needing to send the entire dataset for analysis.
 
-## What does DataJoint bring?
 DataJoint solves several key problems for using databases effectively in scientific projects:
 
 1. **Complete relational data model:** database programming directly from a scientific computing language such as MATLAB and Python without the need for SQL.
 2. **Data definition language:** to define tables and dependencies in simple and consistent ways.
 3. **Diagramming notation:** to visualize and navigate tables and dependencies.
 4. **Query language:** to create flexible and precise queries with only a few operators.
 5. **Serialization framework:** to store and retrieve numerical arrays and other data structures directly in the database.
-6. **Support for automated distributed computations:** for computational dependencies in the data.
+6. **Support for automated distributed computations:** for computational dependencies in the data.
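
The general database advantages listed in OVERVIEW.md (enforced structure, integrity across writers, and queries that return only a slice of the data) can be illustrated with a minimal sketch. It uses Python's standard-library `sqlite3` module rather than DataJoint itself, and the `subject` and `session` tables, their columns, and the sample rows are hypothetical names chosen only for this illustration.

```python
import sqlite3

# Illustration only: plain SQLite, not DataJoint's API.
# Table names, columns, and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Data structure: the schema states and enforces the logic of the study.
conn.execute(
    "CREATE TABLE subject (subject_id INTEGER PRIMARY KEY, species TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE session (
           subject_id  INTEGER NOT NULL REFERENCES subject(subject_id),
           session_idx INTEGER NOT NULL,
           duration_s  REAL NOT NULL,
           PRIMARY KEY (subject_id, session_idx)
       )"""
)

# Transactions: the block commits as a whole, or rolls back on error.
with conn:
    conn.executemany("INSERT INTO subject VALUES (?, ?)", [(1, "mouse"), (2, "rat")])
    conn.executemany(
        "INSERT INTO session VALUES (?, ?, ?)",
        [(1, 1, 600.0), (1, 2, 900.0), (2, 1, 300.0)],
    )

# Consistency and integrity: a session for an unknown subject is rejected.
try:
    with conn:
        conn.execute("INSERT INTO session VALUES (?, ?, ?)", (99, 1, 120.0))
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

# Queries: fetch a precise slice instead of the entire dataset.
long_mouse_sessions = conn.execute(
    """SELECT s.subject_id, s.session_idx
       FROM session AS s
       JOIN subject AS u ON u.subject_id = s.subject_id
       WHERE u.species = ? AND s.duration_s > ?
       ORDER BY s.session_idx""",
    ("mouse", 700),
).fetchall()
```

Here `orphan_rejected` records whether the database refused the inconsistent row, and `long_mouse_sessions` holds only the rows matching the restriction. Per the overview, DataJoint aims to offer this kind of table definition and querying from Python or MATLAB without writing SQL by hand.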
