@@ -69,34 +69,37 @@ Object storage is configured in `datajoint.json` using the existing settings sys
     "database.host": "localhost",
     "database.user": "datajoint",
 
+    "object_storage.project_name": "my_project",
     "object_storage.protocol": "s3",
     "object_storage.endpoint": "s3.amazonaws.com",
     "object_storage.bucket": "my-bucket",
     "object_storage.location": "my_project",
-    "object_storage.partition_pattern": "subject{subject_id}/session{session_id}"
+    "object_storage.partition_pattern": "{subject_id}/{session_id}"
 }
 ```
 
 For local filesystem storage:
 
 ```json
 {
+    "object_storage.project_name": "my_project",
     "object_storage.protocol": "file",
     "object_storage.location": "/data/my_project",
-    "object_storage.partition_pattern": "subject{subject_id}/session{session_id}"
+    "object_storage.partition_pattern": "{subject_id}/{session_id}"
 }
 ```
 
 ### Settings Schema
 
 | Setting | Type | Required | Description |
 |---------|------|----------|-------------|
+| `object_storage.project_name` | string | Yes | Unique project identifier (must match store metadata) |
 | `object_storage.protocol` | string | Yes | Storage backend: `file`, `s3`, `gcs`, `azure` |
 | `object_storage.location` | string | Yes | Base path or bucket prefix |
 | `object_storage.bucket` | string | For cloud | Bucket name (S3, GCS, Azure) |
 | `object_storage.endpoint` | string | For S3 | S3 endpoint URL |
 | `object_storage.partition_pattern` | string | No | Path pattern with `{attribute}` placeholders |
-| `object_storage.hash_length` | int | No | Random suffix length for filenames (default: 8, range: 4-16) |
+| `object_storage.token_length` | int | No | Random suffix length for filenames (default: 8, range: 4-16) |
 | `object_storage.access_key` | string | For cloud | Access key (can use secrets file) |
 | `object_storage.secret_key` | string | For cloud | Secret key (can use secrets file) |
 
@@ -139,6 +142,90 @@ s3://my-bucket/my_project/subject123/session45/schema_name/objects/Recording-raw
 
 If no partition pattern is specified, files are organized directly under `{location}/{schema}/objects/`.
 
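To make the partitioning rule concrete, here is a minimal sketch of how the object directory could be assembled. The function name `object_root` and its parameters are hypothetical, and placing the partition between the location and the schema is inferred from the example path in the hunk header, not from any confirmed implementation.

```python
def object_root(location, schema, partition_pattern=None, key=None):
    """Return the directory under which a schema's objects would be stored."""
    parts = [location.rstrip("/")]
    if partition_pattern:
        # Fill each {attribute} placeholder from the record's key values.
        parts.append(partition_pattern.format(**(key or {})))
    parts += [schema, "objects"]
    return "/".join(parts)


key = {"subject_id": 123, "session_id": 45}
print(object_root("my_project", "schema_name", "{subject_id}/{session_id}", key))
# my_project/123/45/schema_name/objects
print(object_root("my_project", "schema_name"))
# my_project/schema_name/objects
```
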
+## Store Metadata (`dj-store-meta.json`)
+
+Each object store contains a metadata file at its root that identifies the store and enables verification by DataJoint clients.
+
+### Location
+
+```
+{location}/dj-store-meta.json
+```
+
+For cloud storage:
+```
+s3://bucket/my_project/dj-store-meta.json
+```
+
+### Content
+
+```json
+{
+    "project_name": "my_project",
+    "created": "2025-01-15T10:30:00Z",
+    "format_version": "1.0",
+    "datajoint_version": "0.15.0",
+    "schemas": ["schema1", "schema2"]
+}
+```
+
+### Schema
+
+| Field | Type | Required | Description |
+|-------|------|----------|-------------|
+| `project_name` | string | Yes | Unique project identifier |
+| `created` | string | Yes | ISO 8601 timestamp of store creation |
+| `format_version` | string | Yes | Store format version for compatibility |
+| `datajoint_version` | string | Yes | DataJoint version that created the store |
+| `schemas` | array | No | List of schemas using this store (updated on schema creation) |
+
+### Store Initialization
+
+The store metadata file is created when the first `file` attribute is used:
+
+```
+┌─────────────────────────────────────────────────────────┐
+│ 1. Client attempts first file operation                 │
+├─────────────────────────────────────────────────────────┤
+│ 2. Check if dj-store-meta.json exists                   │
+│    ├─ If exists: verify project_name matches            │
+│    └─ If not: create with current project_name          │
+├─────────────────────────────────────────────────────────┤
+│ 3. On mismatch: raise DataJointError                    │
+└─────────────────────────────────────────────────────────┘
+```
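
The flow above can be sketched for the local `file` protocol as follows. This is an illustration under stated assumptions, not the implementation in this PR: `verify_or_create_store` is a hypothetical helper, `DataJointError` is redefined locally so the block runs on its own, and the `format_version` and `datajoint_version` values are copied from the example content as placeholders.

```python
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


class DataJointError(Exception):
    """Local stand-in for DataJoint's error class (assumption for this sketch)."""


def verify_or_create_store(location, project_name):
    """Create dj-store-meta.json if absent; otherwise check project_name."""
    meta_path = Path(location) / "dj-store-meta.json"
    if not meta_path.exists():
        meta = {
            "project_name": project_name,
            "created": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
            "format_version": "1.0",        # placeholder from the example
            "datajoint_version": "0.15.0",  # placeholder from the example
            "schemas": [],
        }
        meta_path.write_text(json.dumps(meta, indent=2))
        return meta

    meta = json.loads(meta_path.read_text())
    stored = meta["project_name"]
    if stored != project_name:
        raise DataJointError(
            "Object store project name mismatch.\n"
            f'  Client configured: "{project_name}"\n'
            f'  Store metadata: "{stored}"\n'
            "  Ensure all clients use the same object_storage.project_name setting."
        )
    return meta


with tempfile.TemporaryDirectory() as root:
    verify_or_create_store(root, "project_b")      # first use creates the file
    try:
        verify_or_create_store(root, "project_a")  # a different name is rejected
    except DataJointError as err:
        print(err)
```

Note that the exists-then-create sequence in this sketch is not atomic; two clients initializing the same store at once would need some form of coordination that the sketch does not attempt.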
+
+### Client Verification
+
+All DataJoint clients must use **identical `project_name`** settings to ensure store-database cohesion:
+
+1. **On connect**: Client reads `dj-store-meta.json` from store
+2. **Verify**: `project_name` in client settings matches store metadata
+3. **On mismatch**: Raise `DataJointError` with descriptive message
+
+```python
+# Example error
+DataJointError: Object store project name mismatch.
+  Client configured: "project_a"
+  Store metadata: "project_b"
+  Ensure all clients use the same object_storage.project_name setting.
+```
+
+### Schema Registration
+
+When a schema first uses the `file` type, it is added to the `schemas` list in the metadata:
+
+```python
+# After creating Recording table with file attribute in my_schema
+# dj-store-meta.json is updated:
+{
+    "project_name": "my_project",
+    "schemas": ["my_schema"]  # my_schema added
+}
+```
+
+This provides a record of which schemas have data in the store.
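
As a rough sketch of the registration step, the update can be treated as an idempotent append to the `schemas` list. The helper name `register_schema` is hypothetical, and reading and writing the JSON file is left out for brevity.

```python
def register_schema(meta, schema):
    """Return a copy of the store metadata with `schema` listed exactly once."""
    schemas = list(meta.get("schemas", []))  # `schemas` is optional in the metadata
    if schema not in schemas:
        schemas.append(schema)
    return {**meta, "schemas": schemas}


meta = {"project_name": "my_project"}
meta = register_schema(meta, "my_schema")
meta = register_schema(meta, "my_schema")  # registering again changes nothing
print(meta)
# {'project_name': 'my_project', 'schemas': ['my_schema']}
```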
+
 ## Syntax
 
 ```python
@@ -211,7 +298,7 @@ Storage paths are **deterministically constructed** from record metadata, enabli
 5. **Table name** - the table class name
 6. **Primary key encoding** - remaining PK attributes and values
 7. **Field name** - the attribute name
-8. **Suffixed filename** - original name with random hash suffix
+8. **Suffixed filename** - original name with random token suffix
 
 ### Path Template
 
@@ -310,7 +397,7 @@ description=a1b2c3d4_abc123 # long string truncated + hash
 
 ### Filename Collision Avoidance
 
-To prevent filename collisions, each stored file receives a **random hash suffix** appended to its basename:
+To prevent filename collisions, each stored file receives a **random token suffix** appended to its basename:
 
 ```
 original: recording.dat
@@ -320,10 +407,10 @@ original: image.analysis.tiff
 stored: image.analysis_pL9nR4wE.tiff
 ```
 
-#### Hash Suffix Specification
+#### Token Suffix Specification
 
 - **Alphabet**: URL-safe and filename-safe Base64 characters: `A-Z`, `a-z`, `0-9`, `-`, `_`
-- **Length**: Configurable via `object_storage.hash_length` (default: 8, range: 4-16)
+- **Length**: Configurable via `object_storage.token_length` (default: 8, range: 4-16)
 - **Generation**: Cryptographically random using `secrets.token_urlsafe()`
 
 At 8 characters with 64 possible values per character: 64^8 = 281 trillion combinations.
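
A minimal sketch of the suffixing rule follows, assuming the token is inserted before the final extension as in the examples above. The helper name `add_token_suffix` is hypothetical. Because `secrets.token_urlsafe(n)` takes a byte count and returns roughly 1.3 characters per byte, the sketch truncates the result to the configured length.

```python
import os
import secrets


def add_token_suffix(filename, token_length=8):
    """Append a random URL-safe token to the basename, keeping the extension."""
    if not 4 <= token_length <= 16:
        raise ValueError("token_length must be between 4 and 16")
    # token_length random bytes always encode to at least token_length characters.
    token = secrets.token_urlsafe(token_length)[:token_length]
    stem, ext = os.path.splitext(filename)
    return f"{stem}_{token}{ext}"


print(add_token_suffix("image.analysis.tiff"))  # e.g. image.analysis_pL9nR4wE.tiff
print(64 ** 8)
# 281474976710656
```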
@@ -511,12 +598,13 @@ class ObjectStorageSettings(BaseSettings):
         validate_assignment=True,
     )
 
+    project_name: str | None = None  # Must match store metadata
     protocol: Literal["file", "s3", "gcs", "azure"] | None = None
     location: str | None = None
     bucket: str | None = None
     endpoint: str | None = None
     partition_pattern: str | None = None
-    hash_length: int = Field(default=8, ge=4, le=16)
+    token_length: int = Field(default=8, ge=4, le=16)
     access_key: str | None = None
     secret_key: SecretStr | None = None
 ```