Design dump  #6

@rickyyx

Data structures and high level design

Catalog

  • Catalog will have a layout_version column for each table; each transaction queries it before calling the SqlTable API

SqlTable

  • SqlTable will now hold multiple DataTables; a new DataTable is created each time the schema is updated.
  • Those DataTables will be ordered by schema version.
  • The SqlTable API will now take an additional layout_version field that is used to find the corresponding DataTable.

DataTable

  • DataTable will have mappings and reverse mappings between col_id and col_oid, so that columns can be aligned across schema versions
  • DataTable will hold a reference to its SqlTable so that SlotIterator knows which DataTable to iterate through next when it reaches a table boundary (since tuples might now live in another DataTable)
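
As a rough illustration of the col_id / col_oid mappings, here is a minimal sketch. It assumes col_oid is the catalog-level column identifier that stays stable across versions and col_id is the physical position inside one DataTable's layout; the class shape, `add_column`, and `translate` are hypothetical names, not the actual API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class DataTable:
    """Hypothetical stand-in for one DataTable (one layout version)."""
    layout_version: int
    # col_oid (assumed stable across versions) -> col_id (position in this layout)
    oid_to_id: Dict[int, int] = field(default_factory=dict)
    # reverse mapping: col_id -> col_oid
    id_to_oid: Dict[int, int] = field(default_factory=dict)

    def add_column(self, col_oid: int) -> int:
        """Register a column and return the col_id assigned in this layout."""
        col_id = len(self.oid_to_id)
        self.oid_to_id[col_oid] = col_id
        self.id_to_oid[col_id] = col_oid
        return col_id


def translate(col_id: int, src: DataTable, dst: DataTable) -> Optional[int]:
    """Map a col_id in src to the col_id of the same column in dst.

    Returns None when the column does not exist in dst (e.g. it was dropped).
    """
    col_oid = src.id_to_oid[col_id]
    return dst.oid_to_id.get(col_oid)


# Version 0 has columns 100, 101, 102; version 1 drops 101 and adds 103.
v0 = DataTable(0)
for oid in (100, 101, 102):
    v0.add_column(oid)
v1 = DataTable(1)
for oid in (100, 102, 103):
    v1.add_column(oid)
```

Going through col_oid in both directions is what lets a reader align a tuple stored in one version with the layout expected by another.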

Update the schema

  • Since we only support safe schema updates, e.g., Add / Drop Column, a schema-change txn will not conflict with a non-schema-change txn.
  • Conflicts between concurrent schema-change txns (write-write) will be detected and synchronized at the catalog table.
  • Updating the schema only involves adding a new DataTable to the SqlTable, together with the corresponding metadata (e.g., col_id maps)
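
The schema update described above could be sketched as follows. This is only an illustration under assumptions: `update_schema` and `data_table` are hypothetical method names, a schema is reduced to a list of col_oids, and conflict detection at the catalog is left out entirely.

```python
class DataTable:
    """Hypothetical DataTable: one layout version plus its tuples."""

    def __init__(self, layout_version, col_oids):
        self.layout_version = layout_version
        self.col_oids = list(col_oids)
        self.rows = []  # each row is a dict of col_oid -> value


class SqlTable:
    """Hypothetical SqlTable holding one DataTable per layout version."""

    def __init__(self, col_oids):
        # The list index doubles as the layout_version, so the
        # DataTables stay ordered by schema version.
        self.tables = [DataTable(0, col_oids)]

    def update_schema(self, add=(), drop=()):
        """Append a DataTable for the new schema; existing tuples are untouched."""
        latest = self.tables[-1]
        cols = [c for c in latest.col_oids if c not in drop] + list(add)
        new_table = DataTable(latest.layout_version + 1, cols)
        self.tables.append(new_table)
        return new_table.layout_version

    def data_table(self, layout_version):
        """Find the DataTable for the layout_version passed through the API."""
        return self.tables[layout_version]


table = SqlTable([1, 2])
new_version = table.update_schema(add=[3], drop=[2])
```

Note that nothing is copied or rewritten at schema-change time; old tuples stay in the old DataTable until they are updated or migrated.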

Reading a tuple from SqlTable

  • If the tuple is in the expected layout_version, simply select the tuple and return it.
  • If the tuple is not in the expected layout_version, find the DataTable that holds it from the TupleSlot, read the columns that are in the intersection of the old and new schemas, and fill in default values for the remaining columns.
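
A minimal sketch of the two read paths, assuming the TupleSlot records which layout version holds the tuple and that rows are keyed by col_oid. `select`, `schemas`, and `defaults` are hypothetical names used only for illustration.

```python
from collections import namedtuple

# Hypothetical slot: identifies the DataTable (by layout version) and the offset in it.
TupleSlot = namedtuple("TupleSlot", ["layout_version", "offset"])


def select(tables, slot, expected_version, schemas, defaults):
    """Read the tuple at `slot` as seen by a txn using `expected_version`.

    tables:   layout_version -> list of rows (each row: col_oid -> value)
    schemas:  layout_version -> list of col_oids in that version
    defaults: col_oid -> default value for columns missing from the stored tuple
    """
    stored = tables[slot.layout_version][slot.offset]
    if slot.layout_version == expected_version:
        # Fast path: the tuple already has the expected layout.
        return dict(stored)
    result = {}
    for oid in schemas[expected_version]:
        if oid in stored:
            # Column exists in both schemas: copy the stored value.
            result[oid] = stored[oid]
        else:
            # Column was added after the tuple was written: use the default.
            result[oid] = defaults.get(oid)
    return result


tables = {0: [{1: "a", 2: "b"}], 1: []}
schemas = {0: [1, 2], 1: [1, 3]}
slot = TupleSlot(layout_version=0, offset=0)
row_v1 = select(tables, slot, 1, schemas, {3: 0})
row_v0 = select(tables, slot, 0, schemas, {})
```

Columns dropped in the expected version (col_oid 2 here) simply never appear in the result, because the loop only visits columns of the expected schema.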

Updating a tuple

  • If the tuple is in the current layout_version, this is the normal case: update it in place.
  • If the tuple is in an older layout_version, we first logically delete the tuple from the old DataTable and then insert it into the current DataTable. The tuple in the old DataTable can still be accessed by a concurrent read transaction.
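
The delete-then-insert path could look roughly like the sketch below. It is a simplification under stated assumptions: visibility is modeled with a single delete timestamp per tuple rather than real version chains, the in-place branch skips versioning altogether, and `update`, `visible`, and `deleted_at` are hypothetical names.

```python
from collections import namedtuple

TupleSlot = namedtuple("TupleSlot", ["layout_version", "offset"])


class DataTable:
    """Hypothetical DataTable with a simplified logical-delete marker."""

    def __init__(self, layout_version):
        self.layout_version = layout_version
        self.rows = []        # offset -> {col_oid: value}
        self.deleted_at = []  # offset -> timestamp of logical delete, or None

    def insert(self, row):
        self.rows.append(dict(row))
        self.deleted_at.append(None)
        return TupleSlot(self.layout_version, len(self.rows) - 1)

    def visible(self, slot, read_ts):
        """A reader that started before the delete still sees the tuple."""
        deleted = self.deleted_at[slot.offset]
        return deleted is None or read_ts < deleted


def update(tables, slot, changes, current_version, columns, defaults, commit_ts):
    """Apply `changes` (col_oid -> value) and return the tuple's slot afterwards."""
    src = tables[slot.layout_version]
    if slot.layout_version == current_version:
        src.rows[slot.offset].update(changes)  # normal case
        return slot
    old = src.rows[slot.offset]
    # Rebuild the tuple in the current schema, defaulting newly added columns.
    new_row = {oid: old[oid] if oid in old else defaults.get(oid) for oid in columns}
    new_row.update(changes)
    src.deleted_at[slot.offset] = commit_ts           # logical delete in old table
    return tables[current_version].insert(new_row)    # insert into current table


tables = {0: DataTable(0), 1: DataTable(1)}
old_slot = tables[0].insert({1: "a", 2: "b"})
new_slot = update(tables, old_slot, {3: 9}, 1, [1, 3], {3: 0}, commit_ts=10)
```

One consequence worth noting: the tuple's slot changes when it moves between DataTables, so anything that stores TupleSlots (indexes, for example) would need to learn about the new slot.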

Background migration

  • When to migrate:
    • Same as GC: once all running txns have a layout_version larger than the DataTable's?
  • How to migrate:
    • A background thread deletes and re-inserts one tuple at a time?
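
If the GC-style condition above is adopted, the check itself is small. The sketch below is only one possible reading of that open question; `can_migrate` and its arguments are hypothetical.

```python
def can_migrate(table_version, active_txn_versions):
    """Return True when no running txn can still expect this DataTable's layout.

    table_version:       layout_version of the DataTable to migrate
    active_txn_versions: layout_version observed by each running txn
    """
    return all(v > table_version for v in active_txn_versions)
```

With no running txns the condition holds trivially, so an idle system would be free to migrate every DataTable older than the current one.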

Edge cases sanity check

  1. What if an Update and a Read run concurrently on a tuple (a1) in the old schema? Steps in time order:
     T2: BEGIN
     T1: BEGIN
     T2: DELETE a1 in old
     T1: READ in old -> chase version pointer -> a1
     T2: INSERT a2 in new -> ok
