|
| 1 | +# Datatypes |
| 2 | + |
| 3 | +Throughout the DataJoint ecosystem, several datatypes are used to define |
| 4 | +tables with cross-platform support (e.g. Python and MATLAB). It is important to understand |
| 5 | +these types because they affect the queries you can form and how much data each |
| 6 | +attribute can store. |
| 7 | + |
| 8 | +## Standard Types |
| 9 | + |
| 10 | +These types are largely wrappers around existing types in the current |
| 11 | +[query backend](../../ref-integrity/query-backend) for [data pipelines](../../getting-started/data-pipelines). |
| 12 | + |
| 13 | +### Common Types |
| 14 | + |
| 15 | +| Datatype | Description | Size | Example | Range | |
| 16 | +| --- | --- | --- | --- | --- | |
| 17 | +| <span id="int">int</span> | integer | 4 bytes | `8` | -2<sup>31</sup> to 2<sup>31</sup>-1 | |
| 18 | +| <span id="enum">enum</span>[^1] | category |1-2 bytes| `M`, `F`| up to 65,535 distinct values | |
| 19 | +| <span id="datetime">datetime</span>[^2]| date and time in `YYYY-MM-DD HH:MM:SS` format | 5 bytes | `'2020-01-02 03:04:05'` | | |
| 20 | +| <span id="varchar">varchar(N)</span> | string of length *M*, up to *N* | *M* + 1-2 bytes| `text`| | |
| 21 | +| <span id="float">float</span>[^3] | floating point number | 4 bytes| `2.04`| -3.40E+38 to -1.17E-38, 0, and 1.17E-38 to 3.40E+38 | |
| 22 | +| <span id="longblob">longblob</span>[^4] | arbitrary numeric data| ≤ 4 GiB | | | |
| 23 | + |
| 24 | +### Less Common Types |
| 25 | + |
| 26 | +The following types add more specificity to the options above. Note that any integer |
| 27 | +type can be unsigned, shifting its range from the listed -2<sup>n</sup> to 2<sup>n</sup>-1 to 0 to |
| 28 | +2<sup>n+1</sup>-1. Float and decimal types can also be declared unsigned, which excludes negative values without extending the upper bound. |
| 29 | + |
| 30 | +| Datatype | Description | Size | Example | Range | |
| 31 | +| --- | --- | --- | --- | --- | |
| 32 | +| <span id="tiny-int">tinyint</span> |tiny integer | 1 byte | `2` | -2<sup>7</sup> to 2<sup>7</sup>-1 | |
| 33 | +| <span id="small-int">smallint</span> |small integer | 2 bytes | `21,000`| -2<sup>15</sup> to 2<sup>15</sup>-1 | |
| 34 | +| <span id="medium-int">mediumint</span> |medium integer| 3 bytes |`401,000`| -2<sup>23</sup> to 2<sup>23</sup>-1 | |
| 35 | +| <span id="date">date</span> |date | 3 bytes | `'2020-01-02'` | | |
| 36 | +| <span id="time">time</span> |time | 3 bytes | `'03:04:05'` | | |
| 37 | +| <span id="timestamp">timestamp</span>[^5]|date and time | 4 bytes | `'2020-01-02 03:04:05'` | | |
| 38 | +| <span id="char(N)">char(N)</span> |string of exactly length *N* | *N* bytes| `text` | | |
| 39 | +| <span id="double">double</span> |double-precision floating point number | 8 bytes | | | |
| 40 | +| <span id="decimalnf">decimal(N,F)</span> |a fixed-point number with *N* total and *F* fractional digits | 4 bytes per 9 digits | | | |
| 41 | +| <span id="tinyblob">tinyblob</span>[^4] | arbitrary numeric data| ≲ 256 bytes | | | |
| 42 | +| <span id="blob">blob</span>[^4] | arbitrary numeric data| ≤ 64 KiB | | | |
| 43 | +| <span id="mediumblob">mediumblob</span>[^4]| arbitrary numeric data| ≤ 16 MiB | | | |
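The signed and unsigned integer ranges listed above follow directly from the storage size. The snippet below is a minimal sketch of that arithmetic in plain Python; the helper name `int_range` and the `SIZES` mapping are illustrative assumptions, not part of DataJoint.

```python
from typing import Tuple

# Storage size in bytes for each integer type, taken from the tables above.
SIZES = {"tinyint": 1, "smallint": 2, "mediumint": 3, "int": 4}


def int_range(n_bytes: int, unsigned: bool = False) -> Tuple[int, int]:
    """Return (min, max) for an integer stored in n_bytes bytes (hypothetical helper)."""
    bits = 8 * n_bytes
    if unsigned:
        # All bits hold magnitude: 0 to 2**bits - 1.
        return 0, 2**bits - 1
    # One bit holds the sign: -2**(bits-1) to 2**(bits-1) - 1.
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


for name, size in SIZES.items():
    print(name, int_range(size), int_range(size, unsigned=True))
```

For example, `int_range(1)` gives the signed *tinyint* bounds and `int_range(1, unsigned=True)` gives the unsigned ones, which helps when choosing the smallest type that still fits the expected values.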
| 44 | + |
| 45 | +## Unique Types |
| 46 | + |
| 47 | +| Datatype | Description | Size | Example | |
| 48 | +| --- | --- | --- | --- | |
| 49 | +| <span id="uuid">uuid</span> | a unique GUID value | 16 bytes | `6ed5ed09-e69c-466f-8d06-a5afbf273e61` | |
| 50 | +| <span id="attach">attach</span> | file attachment | | | |
| 51 | +| <span id="filepath">filepath</span> | path to external file | | | |
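The 16-byte size listed for *uuid* corresponds to the 128 bits of the identifier itself, not to its 36-character text form. The sketch below uses only Python's standard `uuid` module to show this; it illustrates the size and is not a description of how DataJoint serializes the value internally.

```python
import uuid

# The example value from the table above.
value = uuid.UUID("6ed5ed09-e69c-466f-8d06-a5afbf273e61")

# The raw binary form is what accounts for the listed storage size.
raw = value.bytes
print(len(raw))         # number of bytes in the binary form
print(len(str(value)))  # number of characters in the text form

# The binary form round-trips back to the same identifier.
restored = uuid.UUID(bytes=raw)
print(restored == value)
```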
| 52 | + |
| 53 | +## Unsupported Datatypes (for now) |
| 54 | + |
| 55 | +- binary |
| 56 | +- text |
| 57 | +- longtext |
| 58 | +- bit |
| 59 | + |
| 60 | +For more information about datatypes, see the |
| 61 | +[MySQL data types documentation](https://dev.mysql.com/doc/refman/5.6/en/data-types.html). |
| 62 | + |
| 63 | +[^1]: *enum* datatypes can be useful to standardize spelling with limited categories, |
| 64 | +but use with caution. *enum* should not be included in primary keys, as specified values |
| 65 | +cannot be changed later. |
| 66 | + |
| 67 | +[^2]: The default *datetime* value may be set to `CURRENT_TIMESTAMP`. |
| 68 | + |
| 69 | +[^3]: Because equality comparisons are error-prone, neither *float* nor *double* should |
| 70 | +be used in primary keys. For these cases, consider *decimal*. |
| 71 | + |
| 72 | +[^4]: Numeric arrays (e.g. matrix, image, structure) are compatible between MATLAB and |
| 73 | +Python (NumPy). The *longblob* and other *blob* datatypes can be configured to store |
| 74 | +data externally by using the `blob@store` syntax. For more information on storage limits, |
| 75 | +see [this article](https://en.wikipedia.org/wiki/Byte#Multiple-byte_units). |
| 76 | + |
| 77 | +[^5]: Unlike *datetime*, a *timestamp* value will be adjusted to the local time zone. |
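
The caution about *float* and *double* in primary keys can be illustrated without a database. The sketch below, using only Python's standard library, shows two ways equality comparisons go wrong with binary floating point and how a fixed-point value avoids them. Using `struct` to mimic a 4-byte *float* column is an assumption made for illustration, not DataJoint's actual storage path.

```python
import struct
from decimal import Decimal

# Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly,
# so the sum does not compare equal to the literal.
float_sum = 0.1 + 0.2
print(float_sum == 0.3)

# Round-tripping a Python float through 4 bytes (single precision)
# changes the value, so a lookup by the original number would miss.
stored = struct.unpack("<f", struct.pack("<f", 2.04))[0]
print(stored == 2.04)

# Fixed-point arithmetic keeps the digits exactly as written.
decimal_sum = Decimal("0.1") + Decimal("0.2")
print(decimal_sum == Decimal("0.3"))
```

This is why a *decimal* attribute is the safer choice when a fractional number has to be matched exactly.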