Vinyl Disk Layout
Roman Tsisyk edited this page Mar 28, 2017
Tarantool 1.7.4 has the following disk layout:
├── <wal_dir>
|   ├── 00000000000000000000.xlog
|   ├── 00000000000000000047.xlog
|   ├── 00000000000000000050.xlog
|   ├── <wal_lsn>.xlog
|   ├── 00000000000000000000.xctl
|   ├── 00000000000000000050.xctl
|   ├── <checkpoint_lsn>.xctl
├── <memtx_dir>
|   ├── 00000000000000000000.snap
|   ├── 00000000000000000050.snap
|   └── <checkpoint_lsn>.snap
├── <vinyl_dir>
|   └── 512 <!-- space_id
|       ├── 0 <!-- primary key
|       |   ├── 00000000000000000000.index
|       |   ├── 00000000000000000000.run
|       |   ├── 00000000000000000055.index
|       |   ├── 00000000000000000055.run
|       |   ├── <dump_lsn>.index
|       |   └── <dump_lsn>.run
|       ├── 1 <!-- secondary index
|       |   ├── 00000000000000000000.index
|       |   ├── 00000000000000000000.run
|       |   ├── 00000000000000000032.index
|       |   ├── 00000000000000000032.run
|       |   ├── <dump_lsn>.index
|       |   └── <dump_lsn>.run
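As a rough illustration of the layout above, the sketch below builds the path of a Vinyl run file from a space id, an index id and a dump LSN. The helper name `vinyl_run_path` is hypothetical, and the 20-digit zero padding is an assumption inferred from the file names in the listing.

```python
import posixpath

# Assumption: inferred from names such as 00000000000000000055.run in the listing.
LSN_WIDTH = 20


def vinyl_run_path(vinyl_dir, space_id, index_id, dump_lsn, ext="run"):
    """Return <vinyl_dir>/<space_id>/<index_id>/<dump_lsn>.<ext> (hypothetical helper)."""
    if ext not in ("run", "index"):
        raise ValueError("the listing shows only .run and .index files per index")
    if dump_lsn < 0:
        raise ValueError("LSN must be non-negative")
    name = f"{dump_lsn:0{LSN_WIDTH}d}.{ext}"
    return posixpath.join(vinyl_dir, str(space_id), str(index_id), name)
```

For example, `vinyl_run_path("vinyl", 512, 0, 55)` points at the run dumped at LSN 55 for the primary key of space 512.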
- .xlog - write-ahead log (common to all storage engines).
- .snap - consistent snapshot of all tuples from all Memtx spaces.
- .run - consistent snapshot of all tuples from a Vinyl range, like an SST in LevelDB terminology. Contains tuples ordered by the key definition and grouped into pages.
- .index - contains the index of all pages in the corresponding .run file and general information about this run.
- .xctl - physical journal of all operations on .run and .index files. .xlog, .snap, and .index files will be recorded in this journal in future versions of Tarantool.
- .xctlsnap - consistent snapshot of the .xctl journal.
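The naming scheme can be read back as well. The following is a minimal sketch, not Tarantool code: it splits a file name into its LSN and a short description of the file kind. The `classify` function and the `FILE_KINDS` table are hypothetical, the descriptions paraphrase the list above, and only the extensions that appear in the directory listing are covered.

```python
import re

# Descriptions paraphrase the list above; the table itself is illustrative.
FILE_KINDS = {
    "xlog": "write-ahead log",
    "snap": "Memtx snapshot",
    "run": "Vinyl run",
    "index": "Vinyl page index",
    "xctl": "journal of .run/.index operations",
}

# Assumption: names are a 20-digit decimal LSN followed by the extension.
_NAME_RE = re.compile(r"^([0-9]{20})\.([a-z]+)$")


def classify(filename):
    """Return (lsn, kind) for a recognised file name, or None otherwise."""
    match = _NAME_RE.match(filename)
    if match is None or match.group(2) not in FILE_KINDS:
        return None
    return int(match.group(1)), FILE_KINDS[match.group(2)]
```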
Current format:
INDEX
0.13
Server: 39887eac-7447-4d74-bd54-485484b9887a
VClock: {}
<FIXHEADER>
<run_info>
<page_info>
...
<page_info>
<EOF>
Proposed format:
INDEX
0.13
Version: 1.7.4
Server: 39887eac-7447-4d74-bd54-485484b9887a
<FIXHEADER>
<run_info>
<page_info>
...
<page_info>
<EOF>
Changes:
- Add Version: 1.7.4;
- Remove VClock:;
- Move <run_info> into a separate <FIXHEADER>.
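To make the proposed text header concrete, here is a minimal parsing sketch. It assumes the header is newline-separated text: a file type line, a format version line, then `Key: value` lines. Treating the first line that is not `Key: value` as the end of the header is an assumption of this sketch, and `parse_meta` is a hypothetical name.

```python
def parse_meta(text):
    """Parse the text header of a file into its type, format and fields."""
    lines = text.split("\n")
    if len(lines) < 2:
        raise ValueError("meta header needs a file type and a format version")
    meta = {"filetype": lines[0], "format": lines[1], "fields": {}}
    for line in lines[2:]:
        key, sep, value = line.partition(": ")
        if not sep:
            # Assumption: the key/value block ends at the first other line.
            break
        meta["fields"][key] = value
    return meta


header = (
    "INDEX\n"
    "0.13\n"
    "Version: 1.7.4\n"
    "Server: 39887eac-7447-4d74-bd54-485484b9887a\n"
    "\n"
)
meta = parse_meta(header)
```

With the proposed format, `meta["fields"]` carries `Version` and `Server` and no longer has a `VClock` entry.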
run_info is an xrow that contains general information about a Vinyl run.
Current format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = IPROTO_REPLACE
  - IPROTO_LSN: unsigned = run_info->min_lsn
- xrow body: map
  - IPROTO_TUPLE: array
    - 0: map:
      - VY_RUN_MIN_LSN: unsigned = run_info->min_lsn
      - VY_RUN_MAX_LSN: unsigned = run_info->max_lsn
      - VY_RUN_PAGE_COUNT: unsigned = run_info->cou
      - VY_RUN_BLOOM: map
        - VY_RUN_BLOOM_TABLE_SIZE: unsigned
        - VY_RUN_BLOOM_HASH_COUNT: unsigned
        - VY_RUN_BLOOM_VERSION: unsigned
        - VY_RUN_BLOOM_TABLE: raw
Proposed format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = VY_INDEX_RUN_INFO = 100
- xrow body: map
  - VY_RUN_MIN_LSN = 1: unsigned = run_info->min_lsn
  - VY_RUN_MAX_LSN = 2: unsigned = run_info->max_lsn
  - VY_RUN_PAGE_COUNT = 3: unsigned = run_info->count
  - VY_RUN_BLOOM = 4: array
    - 0: unsigned = bloom->table_size
    - 1: unsigned = bloom->hash_count
    - 2: raw = raw bloom filter table in big-endian format
Changes:
- Remove the "a map in a map in an array in a map" overengineering;
- Re-enumerate xrow body keys;
- Convert VY_RUN_BLOOM into an array.
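The flattened run_info layout proposed above can be sketched as plain Python dictionaries. The key numbers and the request type come from the proposed format. The value `0x00` for `IPROTO_REQUEST_TYPE` and the helper name `make_run_info` are assumptions of this sketch, and MsgPack serialization is deliberately left out.

```python
VY_INDEX_RUN_INFO = 100  # request type from the proposed format

# Body keys as numbered in the proposed format.
VY_RUN_MIN_LSN = 1
VY_RUN_MAX_LSN = 2
VY_RUN_PAGE_COUNT = 3
VY_RUN_BLOOM = 4

IPROTO_REQUEST_TYPE = 0x00  # assumption: the usual IPROTO header key


def make_run_info(min_lsn, max_lsn, page_count,
                  bloom_table_size, bloom_hash_count, bloom_table):
    """Build (header, body) maps for a run_info row; hypothetical helper."""
    if min_lsn > max_lsn:
        raise ValueError("min_lsn must not exceed max_lsn")
    header = {IPROTO_REQUEST_TYPE: VY_INDEX_RUN_INFO}
    body = {
        VY_RUN_MIN_LSN: min_lsn,
        VY_RUN_MAX_LSN: max_lsn,
        VY_RUN_PAGE_COUNT: page_count,
        # The bloom filter is an array: table size, hash count, raw table.
        VY_RUN_BLOOM: [bloom_table_size, bloom_hash_count, bytes(bloom_table)],
    }
    return header, body
```

Compared with the current format, every field sits one lookup away from the body map, and the bloom filter needs no per-field keys.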
page_info is an xrow that contains information about a page in a .run file.
Current format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = IPROTO_REPLACE
- xrow body: map
  - IPROTO_TUPLE: array
    - 0: unsigned = page_info->offset;
    - 1: unsigned = page_info->size;
    - 2: map:
      - VY_PAGE_REQUEST_COUNT: unsigned = page_info->request_count
      - VY_PAGE_MIN_KEY: array
      - VY_PAGE_DATA_SIZE: unsigned
      - VY_PAGE_ROW_INDEX_OFFSET: unsigned
Proposed format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = VY_INDEX_PAGE_INFO = 101
- xrow body: map
  - VY_PAGE_OFFSET: unsigned = page_info->offset;
  - VY_PAGE_SIZE: unsigned = page_info->size;
  - VY_PAGE_UNPACKED_SIZE: unsigned = page_info->unpacked_size;
  - VY_PAGE_REQUEST_COUNT: unsigned = page_info->request_count
  - VY_PAGE_MIN_KEY: array
  - VY_PAGE_DATA_SIZE: unsigned
  - VY_PAGE_INDEX_OFFSET: unsigned <!-- an offset to row index, see below
Changes:
- Remove the "a map in a map in an array in a map" overengineering;
- Re-enumerate xrow body keys;
- Rename VY_PAGE_ROW_INDEX_OFFSET into VY_PAGE_INDEX_OFFSET.
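One way a reader might use the page_info records is to locate the page that can contain a given key. The sketch below is an assumption about usage, not a description of Vinyl's reader: it supposes the records are kept sorted by their minimal key and uses Python tuple comparison in place of the real key definition. The dictionary field names mirror the proposed keys; `find_page` is hypothetical.

```python
from bisect import bisect_right


def find_page(page_infos, key):
    """Return the last page whose min_key <= key, or None if key sorts first.

    page_infos must be sorted by "min_key" (assumption of this sketch).
    """
    min_keys = [page["min_key"] for page in page_infos]
    pos = bisect_right(min_keys, key) - 1
    return page_infos[pos] if pos >= 0 else None


pages = [
    {"offset": 0, "size": 100, "min_key": (1,)},
    {"offset": 100, "size": 80, "min_key": (10,)},
    {"offset": 180, "size": 120, "min_key": (20,)},
]
```

Once a page is chosen, its `offset` and `size` say which byte range of the .run file to read.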
Current format:
RUN
0.13
Server: 39887eac-7447-4d74-bd54-485484b9887a
VClock: {}
<FIXHEADER> <!-- a page
<stmt>
..
<stmt>
<page_index>
...
<FIXHEADER>
<stmt>
..
<stmt>
<page_index>
<EOF>
Proposed format:
RUN
0.13
Version: 1.7.4
Server: 39887eac-7447-4d74-bd54-485484b9887a
<FIXHEADER> <!-- a page
<stmt>
..
<stmt>
<page_index>
...
<FIXHEADER>
<stmt>
..
<stmt>
<page_index>
<EOF>
Changes:
- Add Version: 1.7.4;
- Remove VClock: {}.
stmt is an xrow that contains a single database operation in a format similar to the WAL.
Current format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = IPROTO_REPLACE|IPROTO_UPSERT|IPROTO_DELETE
  - IPROTO_LSN: stmt->lsn
- xrow body: map
  - IPROTO_SPACE_ID: unsigned = key_def->space_id;
  - IPROTO_INDEX_ID: unsigned = key_def->id;
  - IPROTO_TUPLE: array -- REPLACE or UPSERT
  - IPROTO_KEY: array -- DELETE only
  - IPROTO_OPS: array -- UPSERT only
Proposed format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = IPROTO_REPLACE|IPROTO_UPSERT|IPROTO_DELETE
  - IPROTO_LSN: stmt->lsn
- xrow body: map
  - IPROTO_TUPLE: array -- for REPLACE or UPSERT
  - IPROTO_KEY: array -- for DELETE only
  - IPROTO_OPS: array -- for UPSERT only
Changes:
- Remove IPROTO_SPACE_ID and IPROTO_INDEX_ID to save space.
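The per-request-type body fields in the proposed stmt format can be summarised in a small sketch. It uses symbolic string names instead of numeric IPROTO codes to avoid asserting any key values, and `make_stmt` is a hypothetical helper, not part of Tarantool.

```python
REPLACE, UPSERT, DELETE = "REPLACE", "UPSERT", "DELETE"


def make_stmt(request_type, lsn, tuple_=None, key=None, ops=None):
    """Build (header, body) for a stmt row in the proposed layout."""
    header = {"IPROTO_REQUEST_TYPE": request_type, "IPROTO_LSN": lsn}
    if request_type == REPLACE:
        body = {"IPROTO_TUPLE": tuple_}
    elif request_type == UPSERT:
        body = {"IPROTO_TUPLE": tuple_, "IPROTO_OPS": ops}
    elif request_type == DELETE:
        body = {"IPROTO_KEY": key}
    else:
        raise ValueError(f"unsupported request type: {request_type!r}")
    missing = [name for name, value in body.items() if value is None]
    if missing:
        raise ValueError(f"{request_type} requires {', '.join(missing)}")
    # No IPROTO_SPACE_ID / IPROTO_INDEX_ID: the proposed format drops them.
    return header, body
```

The space and index ids are not repeated in every statement; in the directory layout they are already part of the path to the .run file.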
page_index is an xrow that contains offsets for the current Vinyl page.
Current format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = IPROTO_REPLACE
- xrow body: map
  - IPROTO_TUPLE: array
    - 0: raw = row index in big endian
Proposed format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = VY_RUN_PAGE_INDEX = 102
- xrow body: raw = row index in big endian
Changes:
- Remove the "a raw in an array in a map" overengineering.
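A raw big-endian row index can be packed and unpacked with a few lines. This sketch assumes each offset is an unsigned 32-bit integer; the text above only says "row index in big endian", so the width is an assumption, and both function names are hypothetical.

```python
import struct

OFFSET_SIZE = 4  # assumption: unsigned 32-bit offsets


def encode_row_index(offsets):
    """Pack row offsets into a big-endian byte string."""
    return struct.pack(f">{len(offsets)}I", *offsets)


def decode_row_index(raw):
    """Unpack a big-endian byte string into a list of row offsets."""
    if len(raw) % OFFSET_SIZE:
        raise ValueError("row index length is not a multiple of the offset size")
    count = len(raw) // OFFSET_SIZE
    return list(struct.unpack(f">{count}I", raw))
```

With a fixed-width index, the position of the N-th statement in a page is a constant-time lookup rather than a scan over the preceding rows.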
Current format:
VYMETA
0.13
Server: 39887eac-7447-4d74-bd54-485484b9887a
<FIXHEADER>
<xctl_request>
...
<FIXHEADER>
<xctl_request>
<EOF>
Proposed format:
XCTL
0.13
Version: 1.7.4
Server: 39887eac-7447-4d74-bd54-485484b9887a
<FIXHEADER>
<xctl_request>
...
<FIXHEADER>
<xctl_request>
<EOF>
Changes:
- Add Version: 1.7.4 instead of v13;
- Remove VClock: {};
- Rename VYMETA to XCTL.
Current format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = IPROTO_INSERT
- xrow body: map
  - IPROTO_TUPLE: array
    - 0: unsigned = record->type;
    - 1: map:
      - VY_LOG_KEY_INDEX_ID: unsigned = record->index_id
      - VY_LOG_KEY_RANGE_ID: unsigned = record->range_id
      - VY_LOG_KEY_RUN_ID: unsigned = record->run_id
      - VY_LOG_KEY_RANGE_BEGIN: tuple
      - VY_LOG_KEY_RANGE_END: tuple
Proposed format:
- xrow header: map
  - IPROTO_REQUEST_TYPE: unsigned = record->type
- xrow body: map
  - VY_XCTL_PATH: unsigned = record->path <!-- a relative path to the file
  - VY_XCTL_RUN_ID: unsigned = record->run_id
  - VY_XCTL_SPACE_ID: unsigned = record->space_id
  - VY_XCTL_INDEX_ID: unsigned = record->index_id
  - VY_XCTL_RANGE_ID: unsigned = record->range_id
  - VY_XCTL_RANGE_BEGIN: array
  - VY_XCTL_RANGE_END: array
Changes:
- Remove the "a map in an array in a map" overengineering;
- Re-enumerate xrow body keys;
- Rename VY_LOG_KEY to VY_XCTL.
- Assign numbers for all VY_XXX keys:
  - Can VY_XXX keys intersect with IPROTO_XXX keys?
- Think how to use .mlog for Memtx and WAL;
- Think how to re-design or remove page_index;
- Patch xlog module to support RUN, INDEX, MLOG, MSNAP files;
- Add a test case in the same way as xlog/upgrade.test.lua.