- Document why single backend is important for integrity
- Add Access Control Patterns section explaining prefix-based policies
- Show how schema/table-level access maps to IAM policies
- Add row-level access via signed URLs to Future Extensions

File: `docs/src/design/tables/object-type-spec.md` (37 additions, 0 deletions)
@@ -54,6 +54,42 @@ This is fundamentally different from **external references**, where DataJoint me
Each DataJoint pipeline has **one** associated storage backend configured in `datajoint.json`. DataJoint fully controls the path structure within this backend.

**Why single backend?** The object store is a logical extension of the schema; its integrity must be verifiable as a unit. With a single backend:
- Schema completeness can be verified with one listing operation
- Orphan detection is straightforward
- Migration requires only config changes, not mass URL updates in the database

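The completeness and orphan checks above reduce to two set differences between the paths the database expects and a single listing of the backend. The following is a minimal sketch, not DataJoint's implementation: `audit_object_store` and the sample paths are hypothetical, and in practice the listing might come from a recursive call such as `fsspec`'s `find` on the project prefix.

```python
from typing import Iterable, Set, Tuple


def audit_object_store(
    db_paths: Iterable[str], store_paths: Iterable[str]
) -> Tuple[Set[str], Set[str]]:
    """Compare paths recorded in the database with one backend listing.

    Returns (missing, orphans):
      missing: referenced by the database but absent from the store
      orphans: present in the store but not referenced by the database
    """
    expected = set(db_paths)
    actual = set(store_paths)
    return expected - actual, actual - expected


# Hypothetical paths following the project/schema/Table/objects/pk=val layout.
db_paths = [
    "my_project/lab/Session/objects/session_id=1/data.zarr",
    "my_project/lab/Session/objects/session_id=2/data.zarr",
]
store_listing = [
    "my_project/lab/Session/objects/session_id=1/data.zarr",
    "my_project/lab/Session/objects/session_id=9/data.zarr",
]

missing, orphans = audit_object_store(db_paths, store_listing)
print("missing:", sorted(missing))
print("orphans:", sorted(orphans))
```

With several backends, the same check would need one listing per backend plus a record of which backend each row points to, which is the complexity the single-backend rule avoids.
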
### Access Control Patterns

The deterministic path structure (`project/schema/Table/objects/pk=val/...`) enables **prefix-based access control policies** on the storage backend.

**Supported access control levels:**

| Level | Implementation | Example Policy Prefix |
|-------|----------------|-----------------------|
| Row-level | Per-object ACL or signed URLs | Future enhancement |

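
To make the mapping from access level to storage policy concrete, the sketch below derives a key prefix from the path structure and wraps it in a read-only AWS IAM policy document. It assumes an S3 backend; `policy_prefix`, `read_only_policy`, and the bucket, project, and schema names are hypothetical and not part of DataJoint.

```python
import json
from typing import Optional


def policy_prefix(
    project: str, schema: Optional[str] = None, table: Optional[str] = None
) -> str:
    """Build the key prefix for project-, schema-, or table-level access."""
    if table is not None and schema is None:
        raise ValueError("table-level access requires a schema")
    parts = [p for p in (project, schema, table) if p is not None]
    return "/".join(parts) + "/"


def read_only_policy(bucket: str, prefix: str) -> dict:
    """Return an IAM policy allowing reads and listing under one prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Read any object whose key starts with the prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject"],
                "Resource": [f"arn:aws:s3:::{bucket}/{prefix}*"],
            },
            {
                # Listing is granted on the bucket, restricted by key prefix.
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [f"arn:aws:s3:::{bucket}"],
                "Condition": {"StringLike": {"s3:prefix": [f"{prefix}*"]}},
            },
        ],
    }


schema_prefix = policy_prefix("my_project", "internal_schema")
print(json.dumps(read_only_policy("my-bucket", schema_prefix), indent=2))
```

Passing a table name as the third argument narrows the same policy to a single table's objects.
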

**Example: Private and public data in one bucket**

Rather than using separate buckets, use prefix-based policies:

```
s3://my-bucket/my_project/
├── internal_schema/          ← restricted IAM policy
│   └── ProcessingResults/
│       └── objects/...
└── publications/             ← public bucket policy
    └── PublishedDatasets/
        └── objects/...
```
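
The public half of this layout could be expressed as an S3 bucket policy that grants anonymous read access only under the `publications/` prefix. This is a sketch under the assumption of an S3 backend; `public_read_statement` and the `Sid` value are hypothetical, and the bucket's Block Public Access settings would also have to permit public policies.

```python
import json


def public_read_statement(bucket: str, prefix: str) -> dict:
    """Bucket-policy statement allowing anonymous reads under one prefix."""
    return {
        "Sid": "PublicReadPublications",
        "Effect": "Allow",
        "Principal": "*",  # anyone, including unauthenticated requests
        "Action": "s3:GetObject",
        "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
    }


bucket_policy = {
    "Version": "2012-10-17",
    "Statement": [public_read_statement("my-bucket", "my_project/publications/")],
}
print(json.dumps(bucket_policy, indent=2))
```

Objects under `internal_schema/` are not matched by this statement, so they remain reachable only through whatever IAM policies are attached to internal users.
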

This achieves the same access separation as multiple buckets while maintaining schema integrity in a single backend.

**Row-level access control** (access to objects for specific primary key values) is not directly supported by object store policies. Future versions may address this via DataJoint-generated signed URLs that project database permissions onto object access.

### Supported Backends
DataJoint uses **[`fsspec`](https://filesystem-spec.readthedocs.io/en/latest/)** to ensure compatibility across multiple storage backends: