Commit 67e8838

Add design pages
1 parent 900da04 commit 67e8838

9 files changed

+175
-0
lines changed

docs/src/design/alter.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
## Work in progress
2+
You may ask questions in the chat window below or
3+
refer to [legacy documentation](https://docs.datajoint.org/)

docs/src/design/attribute-types.md

Lines changed: 77 additions & 0 deletions
@@ -0,0 +1,77 @@
1+
# Datatypes
2+
3+
Throughout the DataJoint ecosystem, there are several datatypes that are used to define
4+
tables with cross-platform support (i.e. Python, MATLAB). It is important to understand
5+
these types as they can have implications in the queries you form and the capacity of
6+
their storage.
7+
8+
## Standard Types
9+
10+
These types are largely wrappers around existing types in the current
11+
[query backend](../../ref-integrity/query-backend) for [data pipelines](../../getting-started/data-pipelines).
12+
13+
### Common Types
14+
15+
| Datatype | Description | Size | Example | Range |
16+
| --- | --- | --- | --- | --- |
17+
| <span id="int">int</span> | integer | 4 bytes | `8` | -2<sup>31</sup> to 2<sup>31</sup>-1 |
18+
| <span id="enum">enum</span>[^1] | category |1-2 bytes| `M`, `F`| up to 65,535 distinct values |
19+
| <span id="datetime">datetime</span>[^2]| date and time in `YYYY-MM-DD HH:MM:SS` format | 5 bytes | `'2020-01-02 03:04:05'` | |
20+
| <span id="varchar">varchar(N)</span> | string of length *M*, up to *N* | *M* + 1-2 bytes| `text`| |
21+
| <span id="float">float</span>[^3] | floating point number | 4 bytes| `2.04`| -3.40E+38 to -1.17E-38, 0, and 1.17E-38 to 3.40E+38 |
22+
| <span id="longblob">longblob</span>[^4] | arbitrary numeric data| ≤ 4 GiB | | |
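
The caution against using *float* in primary keys can be illustrated with a short sketch. It uses only the Python standard library, not the DataJoint API, and the variable names are illustrative. It shows why equality comparisons on binary floating point values are unreliable while fixed-point decimal values compare exactly.

```python
from decimal import Decimal

# Binary floating point cannot represent 0.1 or 0.2 exactly,
# so their sum is not bit-identical to the literal 0.3.
float_sum = 0.1 + 0.2
float_equal = float_sum == 0.3

# Decimal values built from strings are stored exactly,
# so the same comparison succeeds.
decimal_sum = Decimal("0.1") + Decimal("0.2")
decimal_equal = decimal_sum == Decimal("0.3")

print(float_equal, decimal_equal)  # False True
```

A key lookup relies on this kind of exact match, which is why a *decimal* attribute is the safer choice when a fractional value has to identify a row.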
23+
24+
### Less Common Types
25+
26+
The following types add more specificity to the options above. Note that any integer
27+
type can be unsigned, which shifts its range from the listed ±2<sup>n</sup> to the range
28+
0 to 2<sup>n+1</sup>-1. Float and decimal types can also be declared unsigned.
29+
30+
| Datatype | Description | Size | Example | Range |
31+
| --- | --- | --- | --- | --- |
32+
| <span id="tiny-int">tinyint</span> |tiny integer | 1 byte | `2` | -2<sup>7</sup> to 2<sup>7</sup>-1 |
33+
| <span id="small-int">smallint</span> |small integer | 2 bytes | `21,000`| -2<sup>15</sup> to 2<sup>15</sup>-1 |
34+
| <span id="medium-int">mediumint</span> |medium integer| 3 bytes |`401,000`| -2<sup>23</sup> to 2<sup>23</sup>-1 |
35+
| <span id="date">date</span> |date | 3 bytes | `'2020-01-02'` | |
36+
| <span id="time">time</span> |time | 3 bytes | `'03:04:05'` | |
37+
| <span id="timestamp">timestamp</span>[^5]|date and time | 4 bytes | `'2020-01-02 03:04:05'` | |
38+
| <span id="char(N)">char(N)</span> |string of exactly length *N* | *N* bytes| `text` | |
39+
| <span id="double">double</span> |double-precision floating point number | 8 bytes | | |
40+
| <span id="decimalnf">decimal(N,F)</span> |a fixed-point number with *N* total and *F* fractional digits | 4 bytes per 9 digits | | |
41+
| <span id="tinyblob">tinyblob</span>[^4] | arbitrary numeric data| < 256 bytes | | |
42+
| <span id="blob">blob</span>[^4] | arbitrary numeric data| ≤ 64 KiB | | |
43+
| <span id="mediumblob">mediumblob</span>[^4]| arbitrary numeric data| ≤ 16 MiB | | |
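
The integer ranges in these tables follow directly from the storage size. The sketch below is a hypothetical helper (`int_range` is not part of DataJoint) that derives the signed and unsigned bounds from the number of bytes, assuming two's-complement storage.

```python
def int_range(n_bytes: int, unsigned: bool = False) -> tuple[int, int]:
    """Return the (minimum, maximum) value for an integer of n_bytes bytes."""
    bits = 8 * n_bytes
    if unsigned:
        # All bits encode magnitude: 0 to 2^bits - 1.
        return 0, 2**bits - 1
    # One bit encodes the sign: -2^(bits-1) to 2^(bits-1) - 1.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Storage sizes in bytes, as listed in the tables.
SIZES = {"tinyint": 1, "smallint": 2, "mediumint": 3, "int": 4}

for name, size in SIZES.items():
    print(name, int_range(size), int_range(size, unsigned=True))
```

For example, `int_range(1)` gives `(-128, 127)` and `int_range(1, unsigned=True)` gives `(0, 255)`, matching the *tinyint* row.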
44+
45+
## Unique Types
46+
47+
| Datatype | Description | Size | Example |
48+
| --- | --- | --- | --- |
49+
| <span id="uuid">uuid</span> | a unique GUID value | 16 bytes | `6ed5ed09-e69c-466f-8d06-a5afbf273e61` |
50+
| <span id="attach">attach</span> | file attachment | | |
51+
| <span id="filepath">filepath</span> | path to external file | | |
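
As a rough check of the 16-byte size listed for *uuid*, the following sketch parses the example value with Python's standard `uuid` module. It does not use the DataJoint API, and the variable names are illustrative.

```python
import uuid

# Parse the example value from the table above.
value = uuid.UUID("6ed5ed09-e69c-466f-8d06-a5afbf273e61")

# The canonical text form has 36 characters, but the binary form has 16 bytes.
raw = value.bytes
print(len(str(value)), len(raw))  # 36 16

# The binary form converts back to the same value without loss.
restored = uuid.UUID(bytes=raw)
print(restored == value)  # True
```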
52+
53+
## Unsupported Datatypes (for now)
54+
55+
- binary
56+
- text
57+
- longtext
58+
- bit
59+
60+
For more information about datatypes, see
61+
[additional documentation](https://dev.mysql.com/doc/refman/5.6/en/data-types.html)
62+
63+
[^1]: *enum* datatypes can be useful to standardize spelling with limited categories,
64+
but use with caution. *enum* should not be included in primary keys, as specified values
65+
cannot be changed later.
66+
67+
[^2]: The default *datetime* value may be set to `CURRENT_TIMESTAMP`.
68+
69+
[^3]: Because equality comparisons are error-prone, neither *float* nor *double* should
70+
be used in primary keys. For these cases, consider *decimal*.
71+
72+
[^4]: Numeric arrays (e.g. matrix, image, structure) are compatible between MATLAB and
73+
Python (NumPy). The *longblob* and other *blob* datatypes can be configured to store
74+
data externally by using the `blob@store` syntax. For more information on storage limits
75+
see [this article](https://en.wikipedia.org/wiki/Byte#Multiple-byte_units).
76+
77+
[^5]: Unlike *datetime*, a *timestamp* value will be adjusted to the local time zone.

docs/src/design/diagrams.md

Lines changed: 80 additions & 0 deletions
@@ -0,0 +1,80 @@
1+
# Diagrams
2+
3+
Diagrams are a great way to visualize all or part of a pipeline and understand the flow
4+
of data. DataJoint diagrams are based on **entity relationship diagrams** (ERDs), with
5+
some minor departures from this standard.
6+
7+
Here, tables are depicted as nodes and [dependencies](../dependencies) as directed edges
8+
between them. The `draw` method plots the graph, with many other methods (
9+
[Python](https://datajoint.com/docs/core/datajoint-python/latest/api/datajoint/diagram/),
10+
[Matlab](https://github.com/datajoint/datajoint-matlab/blob/master/%2Bdj/ERD.m)) to
11+
save or adjust the output.
12+
13+
Because DataJoint pipelines are directional (see [DAG](../../../glossary#dag)), the
14+
tables at the top will need to be populated first, followed by those tables one step
15+
below and so forth until the last table is populated at the bottom of the pipeline. The
16+
top of the pipeline tends to be dominated by Lookup and manual tables. The middle has
17+
many imported tables, and the bottom has computed tables.
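
The top-to-bottom population order described above is a topological ordering of the dependency graph. The sketch below uses Python's standard `graphlib` module rather than the DataJoint API. The table names are borrowed from the example later on this page, and the dependencies between them are assumptions made for illustration.

```python
from graphlib import TopologicalSorter

# Hypothetical dependencies: each table maps to the tables it depends on.
parents = {
    "Mouse": set(),
    "Session": {"Mouse"},
    "Scan": {"Session"},
    "SegmentationMethod": set(),
    "Segmentation": {"Scan", "SegmentationMethod"},
    "Trace": {"Segmentation"},
}

# static_order() yields every table after all of the tables it depends on.
populate_order = list(TopologicalSorter(parents).static_order())
print(populate_order)
```

Several valid orders may exist (for instance, *SegmentationMethod* can come before or after *Mouse*), but in every one of them a table appears only after all of its parents.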
18+
19+
## Notation
20+
21+
DataJoint uses the following conventions:
22+
23+
- [Tables](../table-definitions) are indicated as nodes in the graph. The
24+
corresponding class name is indicated by each node.
25+
26+
- [Table type](../../reproduce/table-tiers) is indicated by colors and symbols, with some
27+
differences across Python and Matlab:
28+
29+
- **Lookup**: gray, rectangle or asterisk
30+
31+
- **Manual**: green, rectangle or square
32+
33+
- **Imported**: blue, circle or oval
34+
35+
- **Computed**: red, rectangle or star
36+
37+
- **Part**: black dot with smaller font or black text
38+
39+
- [Dependencies](../dependencies) are indicated as edges in the graph and are always
40+
directed downward (see [DAG](../../glossary#dag)).
41+
42+
- Dependency type is indicated by the line.
43+
44+
- **Solid lines**: The [foreign key](../../glossary#foreign-key) is in the
45+
[primary key](../../glossary#primary-key).
46+
47+
- **Dashed lines**: The [foreign key](../../glossary#foreign-key) is outside the
48+
[primary key](../../glossary#primary-key).
49+
50+
- **Thick line**: The [foreign key](../../glossary#foreign-key) is the only item in
51+
the [primary key](../../glossary#primary-key). This is a 1-to-1 relationship.
52+
53+
- **Dot on the line**: The [foreign key](../../glossary#foreign-key) was renamed
54+
via the [projection](../../query-lang/operators#proj) operator.
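
The line conventions above can be summarized as a small decision rule. The sketch below is a hypothetical helper, not a DataJoint function: `edge_style` and the attribute names are assumptions for illustration, and it assumes a foreign key lies either entirely inside or entirely outside the child table's primary key.

```python
def edge_style(foreign_key: set[str], primary_key: set[str]) -> str:
    """Classify a dependency edge from the child table's attributes."""
    if foreign_key == primary_key:
        # The foreign key is the whole primary key: a 1-to-1 relationship.
        return "thick"
    if foreign_key <= primary_key:
        # The foreign key is part of a larger primary key.
        return "solid"
    # The foreign key is a secondary attribute.
    return "dashed"


print(edge_style({"session_id"}, {"session_id"}))
print(edge_style({"mouse_id"}, {"mouse_id", "session_idx"}))
print(edge_style({"method_id"}, {"scan_id"}))
```

The three calls cover the thick, solid, and dashed cases in that order.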
55+
56+
## Example
57+
58+
The following diagram example is an approximation of a DataJoint diagram using
59+
[Mermaid](https://mermaid-js.github.io/mermaid/#/).
60+
61+
--8<-- "src/images/concepts-table-tiers-diagram.md"
62+
63+
Here, we see ...
64+
65+
1. A 1-to-1 relationship between *Session* and *Scan*, as designated by the thick edge.
66+
67+
2. A non-primary foreign key linking *SegmentationMethod* and *Segmentation*
68+
69+
3. Manual tables for *Mouse*, *Session*, *Scan*, and *Stimulus*.
70+
71+
4. A Lookup table: *SegmentationMethod*
72+
73+
5. An Imported table: *Alignment*
74+
75+
6. Several Computed tables: *Segmentation*, *Trace*, and *RF*
76+
77+
7. A part table: *Field*
78+
79+
For examples calling `Diagram` in Python and Matlab, please visit the documentation for
80+
the respective API.

docs/src/design/drop.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
## Work in progress
2+
You may ask questions in the chat window below or
3+
refer to [legacy documentation](https://docs.datajoint.org/)

docs/src/design/integrity.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
## Work in progress
2+
You may ask questions in the chat window below or
3+
refer to [legacy documentation](https://docs.datajoint.org/)

docs/src/design/normalization.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
## Work in progress
2+
You may ask questions in the chat window below or
3+
refer to [legacy documentation](https://docs.datajoint.org/)

docs/src/design/recall.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
## Work in progress
2+
You may ask questions in the chat window below or
3+
refer to [legacy documentation](https://docs.datajoint.org/)

docs/src/design/schema.md

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
1+
## Work in progress
2+
You may ask questions in the chat window below or
3+
refer to [legacy documentation](https://docs.datajoint.org/)
File renamed without changes.
