Support time travel queries #179

Description

@sfc-gh-okalaci

from @sfc-gh-mslot

Since all data and metadata is immutable, Iceberg snapshots remain queryable until explicitly deleted. In principle, this allows querying the table in a past state (time travel).

We could enable time travel through a dummy filter like `WHERE as_of('2024-01-01 00:00') AND ...`, which would trigger planner/executor hooks to use an older snapshot.

Time travel also is a form of restoring from backup.

from @sfc-gh-abozkurt

Should we support branching and tagging as well? There are two ways to time-travel query an Iceberg table: `AS OF <branch | tag | snapshot-id>` or `AS OF <timestamp>`.

from @sfc-gh-mslot

Since we already support schema changes, we should probably consider that past snapshots/branches will have different schemas and cannot be queried via a filter.

It seems preferable to introduce something like:
```sql
create table branch ()
using iceberg
with (branch_from = 'iceberg_table');

create table past_branch ()
using iceberg
with (branch_from = 'iceberg_table', as_of = '2024-11-01 03:00:00');
```

which are wired to the same metadata JSON, probably via a new catalog table that maps between PostgreSQL tables and branch names.


from @sfc-gh-okalaci 

> which are wired to the same metadata JSON, probably via a new catalog table that maps between PostgreSQL tables and branch names.

I think this makes sense (and is very smart 💯). It probably saves us from a lot of trouble. For example, we could keep the relationId -> currentSnapshot mapping in the code (e.g., GetCurrentSnapshot), so the majority of the code flow would not change.

Still, GetCurrentSnapshot should be updated so that, for tables originating from a branch, we find the snapshot associated with the branch in `refs`, such as:

```json
  "refs": {
    "test_branch": {
      "snapshot-id": 3638654959223475932,
      "type": "branch"
    },
    "main": {
      "snapshot-id": 2774914780655428277,
      "type": "branch"
    }
  }
```

And where we currently use `current-schema-id` from metadata.json, we should now find the `schema-id` of the given snapshot instead:

```json
"snapshots": [
    {
      "sequence-number": 1,
      "snapshot-id": 3638654959223475932,
      "timestamp-ms": 1732190894178,
      "summary": {....
      "manifest-list": "s3://testbucketcdw/spark_test/public/sample2/metadata/snap-3638654959223475932-1-91fa9309-7fa0-4a1b-aa9f-35942d756766.avro",
      "schema-id": 0
    },
```
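As a rough sketch of that lookup (Python over the parsed metadata.json dict; the function names and the sample schema-id for `main` are illustrative, not the extension's actual API):

```python
# Sketch: resolve the current snapshot and its schema for a table that
# originates from a branch, using the parsed metadata.json contents.
# Function names here are hypothetical illustrations.

def resolve_branch_snapshot(metadata, branch_name):
    """Find the snapshot-id that the given branch ref points to."""
    ref = metadata["refs"][branch_name]
    assert ref["type"] == "branch", f"{branch_name} is not a branch"
    return ref["snapshot-id"]

def schema_id_for_snapshot(metadata, snapshot_id):
    """Look up the schema-id recorded on the snapshot itself,
    instead of relying on the table-level current-schema-id."""
    for snapshot in metadata["snapshots"]:
        if snapshot["snapshot-id"] == snapshot_id:
            return snapshot["schema-id"]
    raise KeyError(f"snapshot {snapshot_id} not found")

# Sample data mirroring the excerpts above (main's schema-id is made up).
metadata = {
    "refs": {
        "test_branch": {"snapshot-id": 3638654959223475932, "type": "branch"},
        "main": {"snapshot-id": 2774914780655428277, "type": "branch"},
    },
    "snapshots": [
        {"snapshot-id": 3638654959223475932, "schema-id": 0},
        {"snapshot-id": 2774914780655428277, "schema-id": 1},
    ],
}

snap = resolve_branch_snapshot(metadata, "test_branch")
print(snap, schema_id_for_snapshot(metadata, snap))
```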

Some things to be careful about:

  • Snapshot expiration / deletion queue: We currently add files to the deletion queue when a file is referenced by expired snapshots but not by any non-expired snapshot. Now, we should probably expand the definition of expired/non-expired snapshots and follow this algorithm: https://iceberg.apache.org/spec/#snapshot-retention-policy

  • Locking: Now that multiple tables modify the same metadata.json, we should be careful with locking. The `SELECT ... FOR UPDATE` we already take while accessing iceberg_tables might be good enough, but let's still be careful about concurrent DML/DDL/SELECTs, etc.

  • `iceberg_files()` UDF?
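The retention idea from the linked spec section could be sketched along these lines (a deliberately simplified illustration: real retention walks `parent-snapshot-id` ancestry and reads per-ref `max-snapshot-age-ms` / `min-snapshots-to-keep` settings; here a flat timestamp-ordered history and explicit parameters stand in for that):

```python
# Simplified sketch of snapshot retention: a snapshot survives
# expiration if a ref points at it, or if it sits in a branch's
# recent history (newer than max_age_ms, or among the min_to_keep
# most recent snapshots at or before the branch head).
def retained_snapshots(snapshots, refs, now_ms, max_age_ms, min_to_keep):
    by_id = {s["snapshot-id"]: s for s in snapshots}
    keep = set()
    for ref in refs.values():
        keep.add(ref["snapshot-id"])          # refs always pin their target
        if ref["type"] != "branch":
            continue                          # tags keep only their snapshot
        head_ts = by_id[ref["snapshot-id"]]["timestamp-ms"]
        # stand-in for ancestry: snapshots at or before the head, newest first
        history = sorted(
            (s for s in snapshots if s["timestamp-ms"] <= head_ts),
            key=lambda s: s["timestamp-ms"], reverse=True)
        for i, s in enumerate(history):
            if i < min_to_keep or now_ms - s["timestamp-ms"] <= max_age_ms:
                keep.add(s["snapshot-id"])
    return keep

# Illustrative sample data: three snapshots, one branch at the newest.
snapshots = [
    {"snapshot-id": 1, "timestamp-ms": 1000},
    {"snapshot-id": 2, "timestamp-ms": 2000},
    {"snapshot-id": 3, "timestamp-ms": 3000},
]
refs = {"main": {"snapshot-id": 3, "type": "branch"}}
print(sorted(retained_snapshots(snapshots, refs, 10000, 5000, 1)))
```

Files referenced only by snapshots outside the returned set would then become candidates for the deletion queue.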
