cowtowncoder edited this page Aug 29, 2012 · 9 revisions

BDB Data Format

General

All values are stored in big-endian format, that is, starting with the most significant bytes (and bits).

Variable-length integers are only used for lengths, and thus only support non-negative integers (removing the need for ZigZag encoding). Each byte carries 7 data bits; the sign bit (the most significant bit of the byte) is used to denote the last byte.
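
As a rough illustration of the variable-length integer rules above, here is a minimal Python sketch. It assumes that the sign bit is *set* on the last byte and clear on all preceding bytes, and that 7-bit groups are written most significant first; the text above only says the sign bit "denotes" the last byte, so the polarity is an assumption. The function names `encode_vint` and `decode_vint` are hypothetical and do not come from the actual implementation.

```python
from typing import Tuple


def encode_vint(value: int) -> bytes:
    """Encode a non-negative integer, 7 data bits per byte, big-endian.

    Assumption: the sign bit (0x80) is set only on the last byte.
    """
    if value < 0:
        raise ValueError("only non-negative integers are supported")
    # Collect 7-bit groups, least significant first.
    groups = [value & 0x7F]
    value >>= 7
    while value:
        groups.append(value & 0x7F)
        value >>= 7
    groups.reverse()      # most significant group goes first
    groups[-1] |= 0x80    # mark the last byte (assumed polarity)
    return bytes(groups)


def decode_vint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one integer starting at offset.

    Returns (value, offset just past the last byte read).
    Raises IndexError if the data ends before a last-byte marker is seen.
    """
    value = 0
    while True:
        b = data[offset]
        offset += 1
        value = (value << 7) | (b & 0x7F)
        if b & 0x80:      # sign bit set: this was the last byte
            return value, offset
```

Under these assumptions, lengths up to 127 fit in a single byte, and `decode_vint(encode_vint(n))` returns `n` together with the number of bytes consumed.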

BDB Entries

Entry metadata is stored in "raw" format as follows:

First section: fixed data

  • 0-7: long; last-modified timestamp (used for secondary index)
  • 8-11: Status section
      • 8: Version number; hard-coded to 0x11 for the current version, reserved for future compatibility needs
      • 9: Entry status, with allowed values of:
          • 0: active entry
          • 1: soft-deleted entry ("tombstone")
      • 10: Compression method, with allowed values of:
          • 0: no compression ("identity")
          • 1: LZF (https://github.com/ning/compress)
          • 2: GZIP (or to be precise, "deflate")
      • 11: 8-bit unsigned length of external storage path; or 0 for inlined storage
  • 12-15: int; 32-bit Murmur3/32 hash code calculated over uncompressed content; 0 means "not available" (a computed hash value of 0 must be "masked", i.e. converted to value 1)
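
The 16-byte fixed section described above can be read with a single big-endian unpack. The following Python sketch is illustrative only: the names `EntryHeader`, `parse_header` and `mask_hash` are hypothetical, and treating the timestamp and hash as signed values (matching Java's `long` and `int`) is an assumption rather than something the layout above states.

```python
import struct
from typing import NamedTuple

# Big-endian: 8-byte long, four unsigned bytes, 4-byte int = 16 bytes.
# Signedness of the long and int fields is an assumption.
HEADER = struct.Struct(">qBBBBi")

STATUS_ACTIVE, STATUS_TOMBSTONE = 0, 1
COMP_NONE, COMP_LZF, COMP_GZIP = 0, 1, 2


class EntryHeader(NamedTuple):
    last_modified: int     # bytes 0-7
    version: int           # byte 8, expected 0x11
    status: int            # byte 9
    compression: int       # byte 10
    ext_path_length: int   # byte 11, 0 means inlined storage
    content_hash: int      # bytes 12-15, 0 means "not available"


def parse_header(raw: bytes) -> EntryHeader:
    """Parse the fixed first section from the start of raw."""
    if len(raw) < HEADER.size:
        raise ValueError("need at least %d bytes" % HEADER.size)
    return EntryHeader(*HEADER.unpack_from(raw, 0))


def mask_hash(computed: int) -> int:
    """Map a computed hash of 0 to 1, since 0 is reserved for 'not available'."""
    return 1 if computed == 0 else computed
```

A caller would typically check `version == 0x11` before trusting the remaining fields, and use `ext_path_length` to decide whether the content is inlined or stored externally.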

Second section: variable length

Third section: optional
