Commit 052a40b

Add access control patterns section to spec
- Document why single backend is important for integrity
- Add Access Control Patterns section explaining prefix-based policies
- Show how schema/table-level access maps to IAM policies
- Add row-level access via signed URLs to Future Extensions
1 parent 54460ed commit 052a40b

1 file changed: +37 -0 lines changed

docs/src/design/tables/object-type-spec.md

Lines changed: 37 additions & 0 deletions
@@ -54,6 +54,42 @@ This is fundamentally different from **external references**, where DataJoint me
Each DataJoint pipeline has **one** associated storage backend configured in `datajoint.json`. DataJoint fully controls the path structure within this backend.

**Why single backend?** The object store is a logical extension of the schema, so its integrity must be verifiable as a unit. With a single backend:
- Schema completeness can be verified with one listing operation
- Orphan detection is straightforward
- Migration requires only config changes, not mass URL updates in the database
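
The first two points can be sketched as a set comparison between one recursive listing of the backend and the object paths recorded in the database. This is a minimal illustration, not DataJoint's implementation: `audit_store`, `AuditResult`, and the sample paths are hypothetical, and the listing is assumed to come from a single recursive call on the project prefix (for example an `fsspec` `find()`).

```python
from typing import Iterable, NamedTuple, Set


class AuditResult(NamedTuple):
    orphans: Set[str]  # present in the store, but not referenced by any row
    missing: Set[str]  # referenced by a row, but absent from the store


def audit_store(listed_paths: Iterable[str], referenced_paths: Iterable[str]) -> AuditResult:
    """Compare one backend listing against the paths the database expects."""
    listed = set(listed_paths)
    referenced = set(referenced_paths)
    return AuditResult(orphans=listed - referenced, missing=referenced - listed)


# Hypothetical data: `listed` stands in for one recursive listing of the
# project prefix, `referenced` for the object paths stored in the database.
listed = [
    "my_project/lab/Recording/objects/id=1/raw.bin",
    "my_project/lab/Recording/objects/id=2/raw.bin",
    "my_project/lab/Recording/objects/id=9/raw.bin",
]
referenced = [
    "my_project/lab/Recording/objects/id=1/raw.bin",
    "my_project/lab/Recording/objects/id=2/raw.bin",
    "my_project/lab/Recording/objects/id=3/raw.bin",
]

result = audit_store(listed, referenced)
print("orphans:", sorted(result.orphans))
print("missing:", sorted(result.missing))
```

With several backends, the same check would need one listing per backend plus a way to tell which backend each row points to, which is the complication the single-backend rule avoids.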

### Access Control Patterns

The deterministic path structure (`project/schema/Table/objects/pk=val/...`) enables **prefix-based access control policies** on the storage backend.

**Supported access control levels:**

| Level | Implementation | Example Policy Prefix |
|-------|---------------|----------------------|
| Project-level | IAM/bucket policy | `my-bucket/my_project/*` |
| Schema-level | IAM/bucket policy | `my-bucket/my_project/lab_internal/*` |
| Table-level | IAM/bucket policy | `my-bucket/my_project/schema/SensitiveTable/*` |
| Row-level | Per-object ACL or signed URLs | Future enhancement |
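
As a rough sketch of how the project, schema, and table levels could translate into an AWS IAM policy document, the snippet below builds a prefix and wraps it in a read-only statement. The helper names `object_prefix` and `read_only_policy` are hypothetical and not part of DataJoint; only the policy document layout (`Version`, `Statement`, `Effect`, `Action`, `Resource`) follows the standard IAM JSON format.

```python
import json


def object_prefix(bucket: str, project: str, schema: str = "", table: str = "") -> str:
    """Build the policy prefix for a project, schema, or table (hypothetical helper)."""
    if table and not schema:
        raise ValueError("a table-level prefix requires a schema")
    parts = [bucket, project] + [p for p in (schema, table) if p]
    return "/".join(parts) + "/*"


def read_only_policy(prefix: str) -> dict:
    """Return an IAM policy document allowing object reads under `prefix`."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{prefix}"],
            }
        ],
    }


schema_prefix = object_prefix("my-bucket", "my_project", "lab_internal")
table_prefix = object_prefix("my-bucket", "my_project", "schema", "SensitiveTable")
policy = read_only_policy(schema_prefix)
print(json.dumps(policy, indent=2))
```

Listing objects under a prefix usually needs a separate `s3:ListBucket` statement on the bucket ARN with an `s3:prefix` condition; that is left out here to keep the sketch short.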

**Example: Private and public data in one bucket**

Rather than using separate buckets, use prefix-based policies:

```
s3://my-bucket/my_project/
├── internal_schema/          ← restricted IAM policy
│   └── ProcessingResults/
│       └── objects/...
└── publications/             ← public bucket policy
    └── PublishedDatasets/
        └── objects/...
```

This achieves the same access separation as multiple buckets while maintaining schema integrity in a single backend.
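
For the public half of this layout, a bucket policy along the following lines is one plausible shape. It is a sketch under assumptions: the bucket and prefix names come from the example tree, the `Sid` is made up, and the account would also have to allow public bucket policies (S3 Block Public Access) for it to take effect.

```python
import json

BUCKET = "my-bucket"
PUBLIC_PREFIX = "my_project/publications/"  # public schema prefix from the example tree

# Resource-based (bucket) policy: anonymous read access, limited to one prefix.
public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadPublications",  # illustrative statement id
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            "Resource": f"arn:aws:s3:::{BUCKET}/{PUBLIC_PREFIX}*",
        }
    ],
}

print(json.dumps(public_read_policy, indent=2))
```

Nothing under `internal_schema/` matches the `Resource` pattern, so access to that prefix stays governed by the restricted IAM policies.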

**Row-level access control** (access to objects for specific primary key values) is not directly supported by object store policies. Future versions may address this via DataJoint-generated signed URLs that project database permissions onto object access.
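
To make the signed-URL idea concrete, here is a generic HMAC-based sketch of what "signing" a per-object URL means: the issuer signs the object path together with an expiry time, and the verifier recomputes the signature. Everything in it is hypothetical (`sign_url`, `verify_url`, the `expires` and `sig` query parameters, the secret, and the host name); it is neither a DataJoint API nor the presigned URL scheme used by S3.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"demo-secret"  # hypothetical signing key held by the URL issuer


def sign_url(base_url: str, object_path: str, expires_at: int) -> str:
    """Return a URL for `object_path` that is valid until `expires_at` (Unix time)."""
    message = f"{object_path}\n{expires_at}".encode()
    signature = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    query = urlencode({"expires": expires_at, "sig": signature})
    return f"{base_url}/{object_path}?{query}"


def verify_url(url: str, now: int) -> bool:
    """Check the signature and expiry of a URL produced by `sign_url`."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires_at = int(params["expires"][0])
        signature = params["sig"][0]
    except (KeyError, ValueError):
        return False
    object_path = parts.path.lstrip("/")
    message = f"{object_path}\n{expires_at}".encode()
    expected = hmac.new(SECRET, message, hashlib.sha256).hexdigest()
    # Constant-time comparison, then the expiry check.
    return hmac.compare_digest(signature, expected) and now <= expires_at


# Issue a URL for one row's object, valid for five minutes.
now = int(time.time())
url = sign_url(
    "https://store.example.org",
    "my_project/schema/Table/objects/pk=1/data.bin",
    expires_at=now + 300,
)
print(verify_url(url, now))                            # valid URL
print(verify_url(url.replace("pk=1", "pk=2"), now))    # path tampered with
```

In such a design the database permission check would happen before `sign_url` is called, so the URL only carries a decision that was already made for that primary key.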

### Supported Backends

DataJoint uses **[`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/)** to ensure compatibility across multiple storage backends:
@@ -1337,3 +1373,4 @@ arr = da.from_zarr(obj_ref.store, component='spikes')
- [ ] Checksum verification on fetch
- [ ] Cache layer for frequently accessed files
- [ ] Parallel upload/download for large folders
- [ ] Row-level object access control via signed URLs (project DB permissions onto object access)
