Add foreign-key-only primary key constraint to spec
Auto-populated tables must have primary keys composed entirely of
foreign key references. This ensures 1:1 job correspondence and
enables proper referential integrity for the jobs table.
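
As a rough illustration of the constraint this commit describes, the sketch below checks a DataJoint-style definition string and reports whether every primary key line (everything above the `---` divider) is a foreign key reference (`->`). The helper names `primary_key_lines` and `has_fk_only_primary_key` are hypothetical and are not part of DataJoint or of this commit; how the constraint is actually enforced is not shown in the diff.

```python
def primary_key_lines(definition):
    """Return the non-empty, non-comment lines above the `---` divider.

    Assumption: a simplified reading of the definition syntax, where `#`
    starts a comment line and `---` separates primary from secondary attributes.
    """
    lines = []
    for raw in definition.splitlines():
        line = raw.strip()
        if line.startswith("---"):
            break  # everything after the divider is a secondary attribute
        if line and not line.startswith("#"):
            lines.append(line)
    return lines


def has_fk_only_primary_key(definition):
    """True if the primary key consists solely of `->` foreign key references."""
    pk = primary_key_lines(definition)
    return bool(pk) and all(line.startswith("->") for line in pk)


# Definitions taken from the examples in the spec diff
valid = """
-> Image
---
filtered_image : <djblob>
"""

invalid = """
-> Recording
analysis_method : varchar(32)   # NOT ALLOWED - adds to primary key
---
result : float
"""

print(has_fk_only_primary_key(valid))    # True
print(has_fk_only_primary_key(invalid))  # False
```

A string-level check like this is only meant to make the rule concrete; a real implementation would more plausibly inspect the declared table heading than re-parse the definition text.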
docs/src/design/autopopulate-2.0-spec.md
Lines changed: 89 additions & 5 deletions
@@ -27,11 +27,54 @@ The existing `~jobs` table has significant limitations:
 ### Core Design Principles

-1. **Per-table jobs**: Each computed table gets its own hidden jobs table
-2. **Native primary keys**: Jobs table uses the same primary key structure as its parent table (no hashes)
-3. **Referential integrity**: Jobs are foreign-key linked to parent tables with cascading deletes
-4. **Rich status tracking**: Extended status values for full lifecycle visibility
-5. **Automatic refresh**: `populate()` automatically refreshes the jobs queue
+1. **Foreign-key-only primary keys**: Auto-populated tables cannot introduce new primary key attributes; their primary key must comprise only foreign key references
+2. **Per-table jobs**: Each computed table gets its own hidden jobs table
+3. **Native primary keys**: Jobs table uses the same primary key structure as its parent table (no hashes)
+4. **Referential integrity**: Jobs are foreign-key linked to parent tables with cascading deletes
+5. **Rich status tracking**: Extended status values for full lifecycle visibility
+6. **Automatic refresh**: `populate()` automatically refreshes the jobs queue
+
+### Primary Key Constraint
+
+**Auto-populated tables (`dj.Imported` and `dj.Computed`) must have primary keys composed entirely of foreign key references.**
+
+This constraint ensures:
+- **1:1 key_source mapping**: Each entry in `key_source` corresponds to exactly one potential job
+- **Deterministic job identity**: A job's identity is fully determined by its parent records
+- **Simplified jobs table**: The jobs table can directly reference the same parents as the computed table
+
+```python
+# VALID: Primary key is entirely foreign keys
+@schema
+class FilteredImage(dj.Computed):
+    definition = """
+    -> Image
+    ---
+    filtered_image : <djblob>
+    """
+
+# VALID: Multiple foreign keys in primary key
+@schema
+class Comparison(dj.Computed):
+    definition = """
+    -> Image.proj(image_a='image_id')
+    -> Image.proj(image_b='image_id')
+    ---
+    similarity : float
+    """
+
+# INVALID: Additional primary key attribute not allowed
+@schema
+class Analysis(dj.Computed):
+    definition = """
+    -> Recording
+    analysis_method : varchar(32)   # NOT ALLOWED - adds to primary key
+    ---
+    result : float
+    """
+```
+
+**Migration note**: Existing tables that violate this constraint will continue to work but cannot use the new jobs system. A deprecation warning will be issued.

 ## Architecture

@@ -525,3 +568,44 @@ The current system hashes primary keys to support arbitrary key types. The new s
 2. **Query efficiency**: Native keys can use table indexes
 3. **Foreign keys**: Hash-based keys cannot participate in foreign key relationships
 4. **Simplicity**: No need for hash computation and comparison
+
+### Why Require Foreign-Key-Only Primary Keys?
+
+Restricting auto-populated tables to foreign-key-only primary keys provides:
+
+1. **1:1 job correspondence**: Each `key_source` entry maps to exactly one job, eliminating ambiguity about what constitutes a "job"
+2. **Proper referential integrity**: The jobs table can reference the same parent tables, enabling cascading deletes
+3. **Eliminates key_source complexity**: No need for custom `key_source` definitions to enumerate non-foreign-key combinations
+4. **Clearer data model**: The computation graph is fully determined by table dependencies
+5. **Simpler populate logic**: No need to handle partial key matching or key enumeration
+
+**What if I need multiple outputs per parent?**
+
+Use a part table pattern instead:
+
+```python
+# Instead of adding analysis_method to primary key: