-
Notifications
You must be signed in to change notification settings - Fork 7
Labels
bugSomething isn't workingSomething isn't workingdocumentationImprovements or additions to documentationImprovements or additions to documentationenhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomershelp wantedExtra attention is neededExtra attention is neededquestionFurther information is requestedFurther information is requested
Milestone
Description
Summary
Tighten semantics and performance for time-travel and valid-time queries. Specifically:
- Make
FOR SYSTEM_TIME AS OF …andVALID PERIOD [from,to)provably correct against row version bounds. - Add temporal index pruning and predicate pushdown so scans avoid non-overlapping versions.
- Introduce snapshot tokens to guarantee reproducible results for “as of TX/Timestamp” queries.
- Harden tuple (record) parsing around open/closed interval edges and null/±∞ bounds.
Goals
- Correct evaluation of overlap between
(tx_from, tx_to]and(valid_from, valid_to]vs. query cut(s). - Optimizer can prune segments/rowgroups that cannot match temporal predicates.
- Stable SnapshotHandle (logical tx id + wallclock + epoch) that pins MVCC horizon.
- Clear rules for inclusivity:
FOR SYSTEM_TIME AS OF tusestx_from ≤ t ≤ tx_to(ortx_to=∞);VALID PERIOD [a,b)uses half-open. - Observable speed-ups on temporal filters in
tests/microbenchmarks.
Non-Goals
- Full lineage GUI—CLI/SQL
lineage_explain()only. - Multi-shard causal snapshots (future work).
Design
1) Canonical Temporal Overlap
Implement helpers (header-only temporal.h):
struct Interval { Timestamp lo; Timestamp hi; enum Bound {Open, Closed}; Bound lo_b, hi_b; };
inline bool overlaps(const Interval& row, const Interval& q);
inline bool contains_point(const Interval& row, Timestamp t);- Represent “infinite” as
kMaxTsandkMinTs. - Centralize inclusivity so executor and optimizer share logic.
2) Storage: Temporal Min/Max on Chunks
Augment row-page/columnar metadata with:
chunk.tx_min, chunk.tx_max
chunk.valid_min, chunk.valid_max
- Filled during append/compaction.
- Exposed to optimizer as TemporalStats.
3) Optimizer: Temporal Pruning & Pushdown
- Translate
FOR SYSTEM_TIME AS OF ttocontains_point(tx_interval,t). - Translate
VALID PERIOD [a,b)tooverlaps(valid_interval,[a,b)). - Add a pruning rule: if
overlaps(chunk.interval, query.interval)==false→ skip. - Push comparisons down to scan so the executor short-circuits early.
4) Snapshot Reproducibility
- Add
CREATE SNAPSHOT(internal) generatingSnapshotHandle { lsn, clock_ts, epoch }. SET SNAPSHOT '<handle>'pins MVCC visibility + disables “fuzzy” reads.FOR SYSTEM_TIME AS OF TX <id>resolved via handle→LSN table when available.
5) Edge-Case Parsing Hardening
- Normalize tuples with missing
*_toto∞. - Guarantee
from ≤ toinvariant; flip or error with diagnostics if violated. - Unit tests for boundary conditions: exact equality on
lo/hi, open vs closed.
6) Observability
- Tracing spans:
temporal.prune,temporal.overlap_check,snapshot.pin. - Metrics:
temporal_pruned_chunks_total,temporal_scan_rows_filtered_total.
Acceptance Criteria
- ✅
FOR SYSTEM_TIME AS OFandVALID PERIODpass new spec tests for boundaries, null/∞, and equal endpoints. - ✅ Temporal pruning reduces scanned chunks by ≥80% on synthetic skewed history (added benchmark).
- ✅
SET SNAPSHOTguarantees identical results across repeated runs while active. - ✅ No regressions on non-temporal queries (CI).
Tasks
Engine / Executor
- Implement
Interval+ overlap/contain helpers. - Wire helpers into scan node temporal predicate evaluation.
Storage
- Persist
tx_min/max,valid_min/maxin chunk/segment metadata. - Populate during append/compaction; backfill on load if absent.
Optimizer
- Rule: derive temporal constraints from SQL → logical predicates.
- Rule: chunk pruning using
TemporalStats. - Pushdown: attach fast path to scan node.
Snapshotting
-
SnapshotHandlestruct + serialization. -
SET SNAPSHOT/RESET SNAPSHOTplumbing to MVCC visibility. - Map
AS OF TXto handle when present; fall back to current mapping.
Tests
- Unit: interval math (open/closed, ∞, equality).
- Integration:
FOR SYSTEM_TIME AS OF&VALID PERIODedge cases. - Benchmark: history depth vs. pruned chunks and wall time.
- Fuzz: random intervals, ensure no mismatches vs. reference evaluator.
Docs
- Update
README.md(“Bitemporal Time Travel & Lineage”) with inclusivity table. - Add
docs/temporal-semantic-tests.mdand optimizer note on pruning.
Risks & Mitigations
- Incorrect inclusivity semantics → lock them in docs + golden tests.
- Stats staleness after compaction → rebuild on compaction; keep conservative.
- Snapshot leaks → RAII guard that auto-resets at transaction end.
Reactions are currently unavailable
Metadata
Metadata
Assignees
Labels
bugSomething isn't workingSomething isn't workingdocumentationImprovements or additions to documentationImprovements or additions to documentationenhancementNew feature or requestNew feature or requestgood first issueGood for newcomersGood for newcomershelp wantedExtra attention is neededExtra attention is neededquestionFurther information is requestedFurther information is requested
Projects
Status
Backlog