
Commit 5c929c4

Add external store page
1 parent 99114f9 commit 5c929c4

File tree

2 files changed: +294, -6 lines changed


docs/mkdocs.yaml

Lines changed: 1 addition & 3 deletions
```diff
@@ -15,9 +15,7 @@ nav:
   - System Administration:
     - Database Administration: sysadmin/dba.md
     - Bulk Storage Systems: sysadmin/bulk-storage.md
-    - File Storage: sysadmin/filestore.md
-    - Backups and Recovery: sysadmin/backup.md
-    - Database Server Hosting: sysadmin/hosting.md
+    - External Store: sysadmin/external-store.md
   - Client Configuration:
     - Install: client/install.md
     - Credentials: client/credentials.md
```

docs/src/sysadmin/external-store.md

Lines changed: 293 additions & 3 deletions
```diff
@@ -1,3 +1,293 @@
-## Work in progress
-You may ask questions in the chat window below or
-refer to [legacy documentation](https://docs.datajoint.org/)
```

# External Store

DataJoint organizes most of its data in a relational database.
Relational databases excel at representing relationships between entities and storing structured data.
However, relational databases are not particularly well-suited for storing large continuous chunks of data such as images, signals, and movies.
An attribute of type `longblob` can contain an object up to 4 GiB in size (after compression), but storing many such large objects may hamper the performance of queries on the entire table.
A good rule of thumb is that objects over 10 MiB in size should not be put in the relational database.
In addition, storing data in cloud-hosted relational databases (e.g. AWS RDS) may be more expensive than in cloud-hosted simple storage systems (e.g. AWS S3).

DataJoint allows the use of `external` storage to store large data objects within its relational framework but outside of the main database.

An externally stored attribute is defined using the notation `blob@storename` (see also: [definition syntax](../design/tables/declare.md)) and works the same way as a `longblob` attribute from the user's perspective. However, its data are stored in an external storage system rather than in the relational database.

Various systems can play the role of external storage, including a shared file system accessible to all team members with access to these objects or a cloud storage solution such as AWS S3.

For example, the following table stores motion-aligned two-photon movies.

```python
# Motion aligned movies
-> twophoton.Scan
---
aligned_movie : blob@external  # motion-aligned movie in 'external' store
```

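For context, here is a minimal sketch of how this definition might sit inside a complete table class. The schema name, class name, and the availability of a `twophoton` module defining `Scan` are assumptions for illustration only.

```python
import datajoint as dj

# Assumes a module `twophoton` defining the upstream Scan table is importable.
schema = dj.schema('lab_twophoton')  # hypothetical schema name


@schema
class AlignedMovie(dj.Manual):  # hypothetical table class
    definition = """
    # Motion aligned movies
    -> twophoton.Scan
    ---
    aligned_movie : blob@external  # motion-aligned movie in 'external' store
    """
```
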
All [insert](../manipulation/insert.md) and [fetch](../query/fetch.md) operations work identically for `external` attributes as they do for `blob` attributes, with the same serialization protocol.
As with `blob` attributes, `external` attributes cannot be used in restriction conditions.

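For instance, a hedged usage sketch (the `AlignedMovie` class is the hypothetical table sketched above and `key` is a hypothetical primary-key value; `insert1` and `fetch1` are the standard DataJoint calls):

```python
import numpy as np

key = dict(scan_id=1)  # hypothetical primary key inherited from twophoton.Scan

# Insert: the array is serialized and hashed, the object is written to the
# 'external' store, and only a reference is kept in the database row.
AlignedMovie.insert1(dict(key, aligned_movie=np.zeros((16, 16, 10), dtype='float32')))

# Fetch: works exactly as it would for a longblob attribute.
movie = (AlignedMovie & key).fetch1('aligned_movie')
```
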
Multiple external storage configurations may be used simultaneously, with the `@storename` portion of the attribute definition determining the storage location.

```python
# Motion aligned movies
-> twophoton.Scan
---
aligned_movie : blob@external-raw  # motion-aligned movie in 'external-raw' store
```

## Principles of operation

External storage is organized to emulate individual attribute values in the relational database.
DataJoint organizes external storage to preserve the same data integrity principles as in relational storage.

1. The external storage locations are specified in the DataJoint connection configuration, with one specification for each store.

```python
dj.config['stores'] = {
    'external': dict(  # 'regular' external storage for this pipeline
        protocol='s3',
        endpoint='s3.amazonaws.com:9000',
        bucket='testbucket',
        location='datajoint-projects/lab1',
        access_key='1234567',
        secret_key='foaf1234'),
    'external-raw': dict(  # 'raw' storage for this pipeline
        protocol='file',
        location='/net/djblobs/myschema')
}
# external object cache - see fetch operation below for details.
dj.config['cache'] = '/net/djcache'
```

2. Each schema corresponds to a dedicated folder at the storage location with the same name as the database schema.

3. Stored objects are identified by the [SHA-256](https://en.wikipedia.org/wiki/SHA-2) hashes (in web-safe base-64 ASCII) of their serialized contents.
This scheme allows the same object, used multiple times in the same schema, to be stored only once (see the first sketch after this list).

4. In the `external-raw` storage, the objects are saved as files with the hash as the filename.

5. In the `external` storage, files are stored in a nested directory layout derived from the hash. By default, this corresponds to the first 2 characters of the hash, followed by the next 2 characters of the hash, followed by the actual file (see the first sketch after this list).

6. Each database schema has an auxiliary table named `~external_<storename>` for each configured external store.

It is automatically created the first time external storage is used.
The primary key of `~external_<storename>` is the hash of the data (for blobs and attachments) or of the relative path to the file for filepath-based storage.
Other attributes are the `count` of references by tables in the schema, the `size` of the object in bytes, and the `timestamp` of the last event (creation, update, or deletion).

Below are sample entries in `~external_<storename>` (see also the query sketch after this list).

| HASH | size | filepath | contents_hash | timestamp |
| -- | -- | -- | -- | -- |
| 1GEqtEU6JYEOLS4sZHeHDxWQ3JJfLlHVZio1ga25vd2 | 1039536788 | NULL | NULL | 2017-06-07 23:14:01 |

The fields `filepath` and `contents_hash` relate to the [filepath](../design/tables/filepath.md) datatype, which will be discussed separately.

7. Attributes of type `@<storename>` are declared as renamed [foreign keys](../design/tables/dependencies.md) referencing the `~external_<storename>` table (but are not shown as such to the user).

8. The [insert](../manipulation/insert.md) operation encodes and hashes the blob data.
If no external object with the same hash is already present in storage, the object is saved; once the save succeeds, the corresponding entry is created in the `~external_<storename>` table for that store.

9. The [delete](../manipulation/delete.md) operation first deletes the foreign key reference in the target table. The external table entry and the actual external object are not deleted at this time (a soft delete).

10. The [fetch](../query/fetch.md) operation uses the hash values to find the data.
In order to prevent excessive network overhead, a special external store named `cache` can be configured.
If the `cache` is enabled, the `fetch` operation need not access `~external_<storename>` directly.
Instead, `fetch` will retrieve the cached object without downloading it directly from the "real" external store.

11. Cleanup is performed regularly when the database is in light use or off-line.

12. DataJoint never removes objects from the local `cache` folder.
The `cache` folder may simply be emptied periodically, either entirely or based on file access date (see the last sketch after this list).
If dedicated `cache` folders are maintained for each schema, a special procedure will be provided to remove all objects that are no longer listed in `~external_<storename>`.

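To make steps 3 and 5 concrete, here is a minimal sketch of the content hashing and the default subfolding layout. It illustrates the idea only; DataJoint's internal serialization and exact encoding may differ.

```python
import base64
import hashlib

# Step 3: identify the object by the SHA-256 hash of its serialized contents,
# encoded with a web-safe base-64 alphabet (illustrative, not DataJoint's exact routine).
serialized = b'...serialized blob contents...'
digest = hashlib.sha256(serialized).digest()
key = base64.urlsafe_b64encode(digest).decode().rstrip('=')

# Step 5: the default subfolding places the file under the first two and the
# next two characters of the hash.
path = f'{key[:2]}/{key[2:4]}/{key}'
print(path)  # e.g. '1G/Eq/1GEqtEU6...'
```
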
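To inspect the tracking table described in step 6, the per-store external tables are reachable through the schema object, as the Cleanup section below also shows. The store name and size threshold here are hypothetical; attribute names follow the sample table above.

```python
# Tracking table for the 'external' store of this schema.
external_table = schema.external['external']

# Largest tracked objects; 'hash' and 'size' are attributes of ~external_<storename>.
hashes, sizes = (external_table & 'size > 1e9').fetch('hash', 'size')
```
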
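For step 12, the `cache` folder can be pruned with ordinary file-system tools. The following is a minimal sketch (not a DataJoint function) that removes cached files not accessed within a chosen number of days; the path and retention window are assumptions.

```python
import os
import time

CACHE_DIR = '/net/djcache'  # hypothetical cache folder from dj.config['cache']
MAX_AGE_DAYS = 30           # hypothetical retention window

cutoff = time.time() - MAX_AGE_DAYS * 24 * 3600
for dirpath, _, filenames in os.walk(CACHE_DIR):
    for name in filenames:
        path = os.path.join(dirpath, name)
        # Remove files whose last access time is older than the cutoff.
        if os.stat(path).st_atime < cutoff:
            os.remove(path)
```
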
Data removal from external storage is separated from the delete operations to ensure that data are not lost in race conditions between inserts and deletes of the same objects, especially in cases of transactional processing or in processes that are likely to get terminated.
The cleanup steps are performed in a separate process when the risks of race conditions are minimal.
The process performing the cleanup must be isolated to prevent interruptions that would result in loss of data integrity.

## Configuration

The following steps must be performed to enable external storage:

1. Assign external location settings for each store as shown in the [Step 1](#principles-of-operation) example above. Use `dj.config` for configuration.

- `protocol` [`s3`, `file`] Specifies whether `s3` or `file` external storage is desired.
- `endpoint` [`s3`] Specifies the remote endpoint to the external data for all schemas, as well as the target port.
- `bucket` [`s3`] Specifies the appropriate `s3` bucket.
- `location` [`s3`, `file`] Specifies the subdirectory of the bucket or root directory in which to store data. External objects are thus stored remotely with the following path structure: `<bucket (if applicable)>/<location>/<schema_name>/<subfolding_strategy>/<object>`.
- `access_key` [`s3`] Specifies the access key credentials for accessing the external location.
- `secret_key` [`s3`] Specifies the secret key credentials for accessing the external location.
- `secure` [`s3`] Optional specification to establish a secure external storage connection with TLS (aka SSL, HTTPS). Defaults to `False`.

A minimal file-protocol sketch of these settings appears after these steps.

2. Optionally, for each schema specify the `cache` folder for the local fetch cache.

This is done by saving the path in the `cache` key of the DataJoint configuration dictionary:

```python
dj.config['cache'] = '/temp/dj-cache'
```

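As referenced in step 1 above, here is a minimal sketch of these settings for a file-protocol store only, together with persisting them to the local configuration file. The store name and paths are hypothetical; `dj.config.save_local()` writes `dj_local_conf.json` in the current directory.

```python
import datajoint as dj

dj.config['stores'] = {
    'external': dict(  # hypothetical store name
        protocol='file',
        location='/net/djblobs/myschema'),
}
dj.config['cache'] = '/temp/dj-cache'

# Persist the settings so future sessions pick them up automatically.
dj.config.save_local()
```
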
## Cleanup

Deletion of records containing externally stored blobs is a soft delete, which only removes the records from the database.
To clean up the external tracking table or the actual external files, a separate process is provided as follows.

To remove only the tracking entries in the external table, call `delete` on the `~external_<storename>` table for the external configuration with the argument `delete_external_files=False`.

Note: Currently, cleanup operations on a schema's external table are not 100% transaction safe and so must be run when there is no write activity occurring in tables which use a given schema / external store pairing.

```python
schema.external['external_raw'].delete(delete_external_files=False)
```

To remove the tracking entries as well as the underlying files, call `delete` on the external table for the external configuration with the argument `delete_external_files=True`.

```python
schema.external['external_raw'].delete(delete_external_files=True)
```

Note: Setting `delete_external_files=True` will always attempt to delete the underlying data file, and so should not typically be used with the `filepath` datatype.

## Migration between DataJoint v0.11 and v0.12

Note: Please read carefully if you have used external storage in DataJoint v0.11!

The initial implementation of external storage was reworked for DataJoint v0.12. These changes are backward-incompatible with DataJoint v0.11, so care should be taken when upgrading. This section outlines some details of the change and a general process for upgrading to a format compatible with DataJoint v0.12 when a schema rebuild is not desired.

The primary changes to the external data implementation are:

- The external object tracking mechanism was modified. Tracking tables were extended for additional external datatypes and split into per-store tables to improve database performance in schemas with many external objects.

- The external storage format was modified to use a nested subfolder structure (`folding`) to improve performance and interoperability with some filesystems that have limitations or performance problems when storing large numbers of files in single directories.

Depending on the circumstances, the simplest way to migrate data to v0.12 may be to drop and repopulate the affected schemas. This will construct the schema and storage structure in the v0.12 format and avoid the need for database migration. When recreating the schemas is not possible or not preferred, the following process should be followed to upgrade to DataJoint v0.12:

245+
246+
1. Stop write activity to all schemas using external storage.
247+
248+
2. Perform a full backup of your database(s).
249+
250+
3. Upgrade your DataJoint installation to v0.12
251+
252+
4. Adjust your external storage configuration (in `datajoint.config`)
253+
to the new v0.12 configuration format (see above).
254+
255+
5. Migrate external tracking tables for each schema to use the new format. For
256+
instance in Python:
257+
258+
```python
259+
import datajoint.migrate as migrate
260+
db_schema_name='schema_1'
261+
external_store='raw'
262+
migrate.migrate_dj011_external_blob_storage_to_dj012(db_schema_name, external_store)
263+
```
6. Verify pipeline functionality after this process has completed. For instance, in Python:

```python
x = myschema.TableWithExternal.fetch('external_field', limit=1)[0]
```

Note: This migration function is provided on a best-effort basis and will convert the external tracking tables into a format compatible with DataJoint v0.12. While we have attempted to ensure the correctness of the process, not all use cases have been heavily tested. Please be sure to fully back up your data and be prepared to investigate problems with the migration, should they occur.

Please note:

- The migration only migrates the tracking table format and does not modify the backing file structure to support `folding`. The DataJoint v0.12 logic is able to work with this format, but to take advantage of the new backend storage, a manual adjustment of the tracking table and files, or a full rebuild of the schema, should be performed.

- Additional care should be taken after the upgrade to ensure that all clients are using v0.12. Legacy clients may incorrectly create data in the old format, which would then need to be combined or otherwise reconciled with the data in v0.12 format. You might wish to take the opportunity to version-pin your installations so that future changes requiring controlled upgrades can be coordinated on a system-wide basis.
