Finalize staged_insert1 API for direct object storage writes
- Use dedicated staged_insert1 method instead of co-opting insert1
- Add StagedInsert class with rec dict, store(), and open() methods
- Document rationale for separate method (explicit, backward compatible, type safe)
- Add examples for Zarr and multiple object fields
- Note that staged inserts are limited to insert1 (no multi-row)
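
The backward-compatibility rationale named in the commit message can be illustrated with a small sketch. `LegacyTable` below is a hypothetical stand-in (not the library's table class) whose `insert1` returns `None`, as the changed document says the existing method does. Using that return value in a `with` statement fails, and by then the row has already been written, which is one reason a separate entry point is the safer design.

```python
class LegacyTable:
    """Hypothetical stand-in for a table whose insert1 returns None."""

    def __init__(self):
        self.rows = []

    def insert1(self, row):
        # Inserts immediately and returns None, like a plain insert call.
        self.rows.append(dict(row))
        return None


def try_with_insert1(table, row):
    """Attempt to use insert1 as a context manager and report what happens."""
    try:
        with table.insert1(row):
            pass
    except (TypeError, AttributeError) as err:
        # None does not implement __enter__/__exit__; the exact exception
        # type depends on the Python version.
        return type(err).__name__
    return "ok"


table = LegacyTable()
outcome = try_with_insert1(table, {"subject_id": 123})
print(outcome, table.rows)
```

The insert runs before the `with` statement fails, so there is no point at which a rollback could be attached to the existing call without changing its return type.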
File: docs/src/design/tables/file-type-spec.md
Lines changed: 97 additions & 33 deletions
@@ -480,56 +480,99 @@ The file/folder is copied to storage **before** the database insert is attempted
 
 ### Staged Insert (Direct Write Mode)
 
-For large objects like Zarr arrays, copying from local storage is inefficient. **Staged insert** allows writing directly to the destination:
+For large objects like Zarr arrays, copying from local storage is inefficient. **Staged insert** allows writing directly to the destination.
+
+#### Why a Separate Method?
+
+Staged insert uses a dedicated `staged_insert1` method rather than co-opting `insert1` because:
+
+1. **Explicit over implicit** - Staged inserts have fundamentally different semantics (file creation happens during context, commit on exit). A separate method makes this explicit.
+2. **Backward compatibility** - `insert1` returns `None` and doesn't support context manager protocol. Changing this could break existing code.
+3. **Clear error handling** - The context manager semantics (success = commit, exception = rollback) are obvious with `staged_insert1`.
+4. **Type safety** - The staged context exposes `.store()` for object fields. A dedicated method can return a properly-typed `StagedInsert` object.
+
+**Staged inserts are limited to `insert1`** (one row at a time). Multi-row inserts are not supported for staged operations.
+
+#### Basic Usage
 
 ```python
-# Stage an object for direct writing
-with Recording.stage_object(
-    {"subject_id": 123, "session_id": 45},
-    "raw_data",
-    "my_array.zarr"
-) as staged:
-    # Write directly to object storage (no local copy)
-    import zarr
-    z = zarr.open(staged.store, mode='w', shape=(10000, 10000), dtype='f4')
+# Stage an insert with direct object storage writes
+with Recording.staged_insert1 as staged:
+    # Set primary key values
+    staged.rec['subject_id'] = 123
+    staged.rec['session_id'] = 45
+
+    # Create object storage directly using store()
+    z = zarr.open(staged.store('raw_data', 'my_array.zarr'), mode='w', shape=(10000, 10000), dtype='f4')
     z[:] = compute_large_array()
 
+    # Assign the created object to the record
+    staged.rec['raw_data'] = z
+
 # On successful exit: metadata computed, record inserted
 # On exception: storage cleaned up, no record inserted
 ```
 
-#### StagedObject Interface
+#### StagedInsert Interface
 
 ```python
-@dataclass
-class StagedObject:
-    """Handle for staged write operations."""
+class StagedInsert:
+    """Context manager for staged insert operations."""
 
-    path: str  # Reserved storage path
-    full_path: str  # Full URI (e.g., 's3://bucket/path')
-    fs: fsspec.AbstractFileSystem  # fsspec filesystem
-    store: fsspec.FSMap  # FSMap for Zarr/xarray
+    rec: dict[str, Any]  # Record dict for setting attribute values
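
The commit-on-success, clean-up-on-exception behavior described in the diff can be sketched with a toy context manager. This is a minimal illustration under stated assumptions, not the library's implementation: `ToyStagedInsert`, the in-memory `rows` list standing in for the table, and the temporary directory standing in for object storage are all hypothetical. Only `rec` and the `store(field, name)` call shape come from the diff; `open()` is omitted because its signature is not shown there.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Any


class ToyStagedInsert:
    """Toy stand-in for a staged insert; not the real implementation."""

    def __init__(self, table_rows: list, root: Path) -> None:
        self.rec: dict[str, Any] = {}   # record being assembled
        self._rows = table_rows          # stands in for the database table
        self._root = root                # stands in for object storage
        self._staged: list[Path] = []    # locations reserved in this context

    def store(self, field: str, name: str) -> Path:
        """Reserve a location for `field` and return it for direct writes."""
        path = self._root / field / name
        path.mkdir(parents=True, exist_ok=True)
        self._staged.append(path)
        return path

    def __enter__(self) -> "ToyStagedInsert":
        return self

    def __exit__(self, exc_type, exc, tb) -> bool:
        if exc_type is None:
            # Success: commit the record.
            self._rows.append(dict(self.rec))
        else:
            # Failure: remove everything written during the context.
            for path in self._staged:
                shutil.rmtree(path, ignore_errors=True)
        return False  # never swallow the exception


rows: list = []
root = Path(tempfile.mkdtemp())

# Success path: the record is inserted and the written object is kept.
with ToyStagedInsert(rows, root) as staged:
    staged.rec["subject_id"] = 123
    target = staged.store("raw_data", "my_array.zarr")
    (target / "chunk0").write_bytes(b"\x00" * 16)
    staged.rec["raw_data"] = str(target)

# Failure path: storage is cleaned up and no record is inserted.
try:
    with ToyStagedInsert(rows, root) as staged:
        staged.rec["subject_id"] = 124
        failed_target = staged.store("raw_data", "other.zarr")
        raise RuntimeError("simulated write failure")
except RuntimeError:
    pass
```

Returning `False` from `__exit__` lets the original exception propagate to the caller after cleanup, which matches the "exception = rollback" wording without hiding the error.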