Object Design And Schema

snej edited this page Dec 30, 2011 · 12 revisions

These are the primary classes and SQL tables used by TouchDB. I’m describing them in language-neutral terms since I expect there to be multiple implementations.

Database

The heart of TouchDB is the Database class. On disk (or flash), it consists of a SQLite database file (and in the future, an associated directory containing attachments). In memory it has:

  • a database connection handle
  • a set of View objects representing rows in the Views table
  • a set of Replicator objects representing active replication tasks

Table: “docs”

This table stores document ID strings so they can be represented more compactly as foreign keys in the “revs” table.

| Column | Type | Description |
| --- | --- | --- |
| doc_id | integer | Primary key |
| docid | text | Document ID string |

Table: “revs”

Each row in this table is a revision of a document. It’s equivalent to CouchDB’s “by-sequence” B-tree.

| Column | Type | Description |
| --- | --- | --- |
| sequence | integer | Sequence number (this is the primary key; it is set to auto-increment without reusing any values) |
| doc_id | integer | Document ID (foreign key) |
| revid | text | Revision ID string |
| parent | integer | Parent revision’s sequence number, or null if no parent (foreign key) |
| current | boolean | Is this a current (leaf) revision? |
| deleted | boolean | Does this revision represent a deletion? |
| json | blob | Document contents in UTF-8 encoded JSON |

Note: To save space, the JSON does not include the `_id`, `_rev`, `_deleted` or `_attachments` properties; those are added when the JSON is returned from the API.
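
The relationship between “docs”, “revs” and the stripped JSON can be sketched with Python’s built-in `sqlite3` module. This is a minimal illustration, not TouchDB’s actual code: the DDL constraints and the helper names `put_revision` and `get_revision` are assumptions, and the underscore-prefixed property names follow CouchDB’s convention.

```python
import json
import sqlite3

# Hypothetical DDL derived from the column tables above; constraints are assumptions.
SCHEMA = """
CREATE TABLE docs (
    doc_id INTEGER PRIMARY KEY,
    docid  TEXT UNIQUE NOT NULL
);
CREATE TABLE revs (
    sequence INTEGER PRIMARY KEY AUTOINCREMENT,
    doc_id   INTEGER NOT NULL REFERENCES docs(doc_id),
    revid    TEXT NOT NULL,
    parent   INTEGER REFERENCES revs(sequence),
    current  BOOLEAN,
    deleted  BOOLEAN DEFAULT 0,
    json     BLOB
);
"""

# Properties that are stored in columns instead of in the JSON body.
SPECIAL_KEYS = ("_id", "_rev", "_deleted", "_attachments")


def put_revision(db, docid, revid, properties, parent=None, deleted=False):
    """Store one revision and return its sequence number."""
    # Map the document ID string to a compact integer key.
    db.execute("INSERT OR IGNORE INTO docs (docid) VALUES (?)", (docid,))
    (doc_id,) = db.execute(
        "SELECT doc_id FROM docs WHERE docid = ?", (docid,)
    ).fetchone()
    if parent is not None:
        # The parent stops being a leaf once it has a child.
        db.execute("UPDATE revs SET current = 0 WHERE sequence = ?", (parent,))
    body = {k: v for k, v in properties.items() if k not in SPECIAL_KEYS}
    cursor = db.execute(
        "INSERT INTO revs (doc_id, revid, parent, current, deleted, json) "
        "VALUES (?, ?, ?, 1, ?, ?)",
        (doc_id, revid, parent, int(deleted), json.dumps(body).encode("utf-8")),
    )
    return cursor.lastrowid


def get_revision(db, sequence):
    """Load a revision and re-add the properties that were stripped on save."""
    docid, revid, deleted, blob = db.execute(
        "SELECT docs.docid, revs.revid, revs.deleted, revs.json "
        "FROM revs JOIN docs USING (doc_id) WHERE revs.sequence = ?",
        (sequence,),
    ).fetchone()
    doc = json.loads(blob.decode("utf-8"))
    doc["_id"] = docid
    doc["_rev"] = revid
    if deleted:
        doc["_deleted"] = True
    return doc
```

Calling `put_revision` twice for the same document, passing the first call’s return value as `parent`, leaves the first row with `current = 0` and the second as the leaf.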

Table: “attachments”

Tracks attachments of revisions and their keys in the content-addressable attachment store.

| Column | Type | Description |
| --- | --- | --- |
| sequence | integer | Revision that owns this attachment (foreign key) |
| filename | text | Filename of the attachment |
| key | blob | Contents’ key in attachment store (SHA-1 digest of contents) |
| type | text | MIME type |
| length | integer | Content length in bytes |
| revpos | integer | Generation number (numeric revision prefix) where this attachment was added or changed |

Every ‘revs’ row has associated ‘attachments’ rows for every attachment it contains, not just for attachments added or modified in that revision. This does mean a lot of duplicate ‘attachments’ rows, but it makes attachment lookup faster, and compaction easier.
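
As a rough sketch of what this implies, the Python snippet below derives an attachment key from the SHA-1 digest of the contents and copies a parent revision’s attachment rows forward to a child revision. The DDL constraints and the function names `attachment_key`, `add_attachment` and `copy_attachments` are hypothetical, chosen only for illustration.

```python
import hashlib
import sqlite3

# Hypothetical DDL derived from the column table above; constraints are assumptions.
SCHEMA = """
CREATE TABLE attachments (
    sequence INTEGER NOT NULL,
    filename TEXT NOT NULL,
    key      BLOB NOT NULL,
    type     TEXT,
    length   INTEGER NOT NULL,
    revpos   INTEGER DEFAULT 0
);
"""


def attachment_key(contents):
    """Content-addressable key: the 20-byte SHA-1 digest of the contents."""
    return hashlib.sha1(contents).digest()


def add_attachment(db, sequence, filename, contents, mime_type, revpos):
    """Record an attachment that was added or changed in revision `sequence`."""
    db.execute(
        "INSERT INTO attachments (sequence, filename, key, type, length, revpos) "
        "VALUES (?, ?, ?, ?, ?, ?)",
        (sequence, filename, attachment_key(contents), mime_type, len(contents), revpos),
    )


def copy_attachments(db, from_sequence, to_sequence):
    """Duplicate the parent's rows so the child revision lists every attachment."""
    db.execute(
        "INSERT INTO attachments (sequence, filename, key, type, length, revpos) "
        "SELECT ?, filename, key, type, length, revpos "
        "FROM attachments WHERE sequence = ?",
        (to_sequence, from_sequence),
    )
```

Because the copied rows keep the same `key`, the duplicated rows all point at a single stored copy of the contents, and `revpos` still records the generation in which the attachment last changed.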

Table: “views”

Each row in this table is a view definition.

| Column | Type | Description |
| --- | --- | --- |
| view_id | integer | Primary key |
| name | text | Name of view (unique) |
| version | text | Version ID of view definition function; must be changed if the function’s semantics change |
| lastsequence | integer | The last sequence number in “revs” that has been indexed by this view (foreign key) |

Unlike in CouchDB, view definitions are not stored in the database as source code. They are native functions, represented by function pointers or their equivalent. The client must register each function with its named view when the database is opened.
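
A minimal Python sketch of such a registration step follows. The `ViewRegistry` class, its `register` method and the `(doc, emit)` map-function signature are hypothetical. Discarding the view’s emitted rows and resetting `lastsequence` when the version string changes is an assumption about how a changed version would be handled, not something this page specifies.

```python
import sqlite3

# Hypothetical DDL derived from the column tables on this page; constraints are assumptions.
SCHEMA = """
CREATE TABLE views (
    view_id      INTEGER PRIMARY KEY,
    name         TEXT UNIQUE NOT NULL,
    version      TEXT,
    lastsequence INTEGER DEFAULT 0
);
CREATE TABLE maps (
    view_id  INTEGER NOT NULL,
    sequence INTEGER NOT NULL,
    key      TEXT NOT NULL,
    value    TEXT
);
"""


class ViewRegistry:
    """Keeps native map functions in memory, keyed by view name."""

    def __init__(self, db):
        self.db = db
        self.map_functions = {}  # transient: never written to the database

    def register(self, name, map_function, version):
        row = self.db.execute(
            "SELECT view_id, version FROM views WHERE name = ?", (name,)
        ).fetchone()
        if row is None:
            # First time this view is seen: create its row.
            self.db.execute(
                "INSERT INTO views (name, version, lastsequence) VALUES (?, ?, 0)",
                (name, version),
            )
        elif row[1] != version:
            # Assumption: a new version invalidates everything indexed so far.
            view_id = row[0]
            self.db.execute("DELETE FROM maps WHERE view_id = ?", (view_id,))
            self.db.execute(
                "UPDATE views SET version = ?, lastsequence = 0 WHERE view_id = ?",
                (version, view_id),
            )
        self.map_functions[name] = map_function
```

Registering the same name with an unchanged version only refreshes the in-memory function pointer, so reopening the database does not force a reindex.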

Table: “maps”

Each row in this table is a key/value pair emitted by a view’s map function.

| Column | Type | Description |
| --- | --- | --- |
| view_id | integer | View that emitted this row (foreign key) |
| sequence | integer | Revision that emitted this row (foreign key) |
| key | text | JSON-encoded emitted key |
| value | text | JSON-encoded emitted value |

Table: “replicators”

Stores persistent state of replications to/from other databases. The Replicator class uses this.

| Column | Type | Description |
| --- | --- | --- |
| remote | text | URL of remote database |
| push | boolean | Is this a “push” replication, i.e. is ‘remote’ the destination? |
| last_sequence | text | Last sequence processed from the source database (which may or may not be local) |
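
A small Python sketch of how a replicator might save and restore its checkpoint in this table is shown below. The `UNIQUE (remote, push)` constraint and the helper names `save_checkpoint` and `load_checkpoint` are assumptions made for the example. The sequence is stored as text, as in the column table above, so the sketch does not assume that the source’s sequence values are integers.

```python
import sqlite3

# Hypothetical DDL; the uniqueness constraint is an assumption.
SCHEMA = """
CREATE TABLE replicators (
    remote        TEXT NOT NULL,
    push          BOOLEAN,
    last_sequence TEXT,
    UNIQUE (remote, push)
);
"""


def save_checkpoint(db, remote, push, last_sequence):
    """Remember the last sequence processed for one (remote, direction) pair."""
    db.execute(
        "INSERT OR REPLACE INTO replicators (remote, push, last_sequence) "
        "VALUES (?, ?, ?)",
        (remote, int(push), str(last_sequence)),
    )


def load_checkpoint(db, remote, push):
    """Return the saved sequence as a string, or None if nothing was saved."""
    row = db.execute(
        "SELECT last_sequence FROM replicators WHERE remote = ? AND push = ?",
        (remote, int(push)),
    ).fetchone()
    return row[0] if row else None
```

Push and pull replications with the same remote URL keep separate checkpoints because `push` is part of the lookup.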

View

The View class is closely tied to the Database. It’s just broken out to give each view a place to store transient data (most importantly the map function pointer) and to make the API and implementation a bit clearer. Each View instance is associated with a row in the “views” table.

Instead of keeping a separate B-tree index for every view, TouchDB has a single “maps” table. It contains a row for every key/value pair that was emitted by a map function of any view. Intermediate results from the reduce function are not stored, though (at least not yet).

Before a query, the View object compares its saved lastsequence value (from the “views” table) against the highest sequence number in the “revs” table. If they don’t match, it needs to bring the index up to date. To do this it first deletes map rows emitted by obsolete revisions (ones that appear as ‘parent’ values in revs added since lastsequence). Then it iterates over every rev added since lastsequence, calls the map function on it, and adds any emitted key/value pairs to “maps”. Finally it updates its lastsequence.
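
The update steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not TouchDB’s implementation: the DDL constraints, the function name `update_index` and the `(doc, emit)` map-function signature are hypothetical, and restricting the map step to current, non-deleted revisions is an assumption that keeps revisions superseded within the same batch out of the index.

```python
import json
import sqlite3

# Hypothetical DDL based on the column tables in this page; constraints are assumptions.
SCHEMA = """
CREATE TABLE revs (
    sequence INTEGER PRIMARY KEY AUTOINCREMENT,
    doc_id   INTEGER NOT NULL,
    revid    TEXT NOT NULL,
    parent   INTEGER,
    current  BOOLEAN,
    deleted  BOOLEAN DEFAULT 0,
    json     BLOB
);
CREATE TABLE views (
    view_id      INTEGER PRIMARY KEY,
    name         TEXT UNIQUE NOT NULL,
    version      TEXT,
    lastsequence INTEGER DEFAULT 0
);
CREATE TABLE maps (
    view_id  INTEGER NOT NULL,
    sequence INTEGER NOT NULL,
    key      TEXT NOT NULL,
    value    TEXT
);
"""


def update_index(db, view_id, map_function):
    """Bring one view's rows in 'maps' up to date; return the number of rows added."""
    (last,) = db.execute(
        "SELECT lastsequence FROM views WHERE view_id = ?", (view_id,)
    ).fetchone()
    (highest,) = db.execute("SELECT COALESCE(MAX(sequence), 0) FROM revs").fetchone()
    if last == highest:
        return 0  # the index is already current

    # Step 1: drop rows emitted by revisions that have since been superseded.
    db.execute(
        "DELETE FROM maps WHERE view_id = ? AND sequence IN "
        "(SELECT parent FROM revs WHERE sequence > ? AND parent IS NOT NULL)",
        (view_id, last),
    )

    # Step 2: run the map function over the new revisions.
    # Assumption: only current, non-deleted revisions are mapped.
    new_revs = db.execute(
        "SELECT sequence, json FROM revs "
        "WHERE sequence > ? AND current = 1 AND deleted = 0 ORDER BY sequence",
        (last,),
    ).fetchall()
    added = 0
    for sequence, blob in new_revs:
        emitted = []
        map_function(json.loads(blob), lambda key, value: emitted.append((key, value)))
        for key, value in emitted:
            db.execute(
                "INSERT INTO maps (view_id, sequence, key, value) VALUES (?, ?, ?, ?)",
                (view_id, sequence, json.dumps(key), json.dumps(value)),
            )
            added += 1

    # Step 3: remember how far the index has progressed.
    db.execute("UPDATE views SET lastsequence = ? WHERE view_id = ?", (highest, view_id))
    return added
```

A map function here is any callable such as `lambda doc, emit: emit(doc["title"], None)`; keys and values are JSON-encoded before they are written, matching the “maps” column descriptions.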

Server

The Server object is fairly simple. It’s generally a singleton created at app launch time, which lets the app open / close / create / delete databases. It has:

  • A reference to its root directory (which contains the database files)
  • A dictionary mapping names to database objects.
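
A bare-bones Python sketch of such a Server is given below. The method names, the `.touchdb` file extension and the use of a raw `sqlite3` connection in place of a full Database object are all assumptions made to keep the example short.

```python
import os
import sqlite3


class Server:
    """Owns a root directory and hands out one database object per name."""

    def __init__(self, root_directory):
        self.root_directory = root_directory
        self.databases = {}  # name -> open database object
        os.makedirs(root_directory, exist_ok=True)

    def _path(self, name):
        # Hypothetical naming scheme: one SQLite file per database.
        return os.path.join(self.root_directory, name + ".touchdb")

    def open_database(self, name, create=True):
        """Return the database called `name`, opening it at most once."""
        if name in self.databases:
            return self.databases[name]
        path = self._path(name)
        if not create and not os.path.exists(path):
            return None
        db = sqlite3.connect(path)
        self.databases[name] = db
        return db

    def close_database(self, name):
        db = self.databases.pop(name, None)
        if db is not None:
            db.close()

    def delete_database(self, name):
        """Close the database and remove its file; return True if a file was deleted."""
        self.close_database(name)
        path = self._path(name)
        if os.path.exists(path):
            os.remove(path)
            return True
        return False
```

Keeping the name-to-object dictionary in the Server means two callers asking for the same name share one connection rather than opening the file twice.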

Replicator

Replicator is an abstract class representing an active replication. Its concrete subclasses are Pusher and Puller. Its properties are:

  • the local database object
  • the remote database URL
  • a flag indicating whether the replication is continuous
  • the last revision sequence number/ID transferred (persisted in the “replicators” table)
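
In Python, the class relationship and the four properties might be sketched like this. The attribute names and the `is_push` flag are hypothetical, and the actual transfer logic is deliberately left out.

```python
from abc import ABC, abstractmethod


class Replicator(ABC):
    """Abstract base: holds the state shared by push and pull replications."""

    def __init__(self, database, remote_url, continuous=False, last_sequence=None):
        self.database = database            # the local database object
        self.remote_url = remote_url        # the remote database URL
        self.continuous = continuous        # keep running after catching up?
        self.last_sequence = last_sequence  # persisted in the "replicators" table

    @property
    @abstractmethod
    def is_push(self):
        """True if the remote database is the destination."""


class Pusher(Replicator):
    is_push = True   # local -> remote


class Puller(Replicator):
    is_push = False  # remote -> local
```

Because `is_push` is abstract, `Replicator` itself cannot be instantiated; only `Pusher` and `Puller` can.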