Understanding the Data Model
Interested in adding a new table to our schema? Check out this reference PR: https://github.com/internetarchive/openlibrary/pull/7928/files
The bookshelves core model demonstrates how to use a database connection on the backend to query for data.

    from openlibrary.core import db

    oldb = db.get_db()  # i.e. web.database(**web.config.db_parameters)
    query = "SELECT count(*) from bookshelves_books"
    oldb.query(query)

From within routers/controllers, it's much more common to use the web.ctx.site object to fetch individual or multiple records.

    import web  # Infogami attaches `site` to web.ctx during a request

    # Fetch a single record by key
    doc = web.ctx.site.get("/works/OL5285479W")

    # Fetch several records in one call
    keys = ["/works/OL5285479W", "/works/OL257943W", "/works/OL27448W"]
    docs = web.ctx.site.get_many(keys)

Open Library is built using a wiki engine called Infogami, which sits on top of the web.py Python micro-web framework (comparable to Flask). Web.py uses a variable called web.ctx to maintain the application context during an HTTP request. Web.py also maintains a PostgreSQL database connection using web.db. Infogami extends web.db by offering a system called Infobase, which behaves like an ORM (database wrapper) to define arbitrary data types such as works, editions, and authors.
At its core, Infobase relies on two tables, things and data:

- The things table assigns every object in the system an ID, a type, and a reference to its data in the data table.
- The data table is a large catalog of JSON data that can be accessed by querying and joining with the things table.
Infogami injects a utility called site into web.py's web.ctx variable (see web.py ctx documentation), which maintains information and connections specific to the current client. The web.ctx.site utility handles queries and joins, allowing you to request any key from the things table, fetch its corresponding data, and leverage the models and functions defined for that thing's type.
Every Infogami page on Open Library (anything with a URL) has an associated type. Each type contains a schema that defines which fields can be used and their formats. These schemas generate view and edit templates, which can be further customized as needed. Infogami provides a generic way to create new types through its wiki interface.
Aside from the tables listed in the Open Library Feature Tables section, Open Library essentially has only two database tables. By default, they provide basic functionality through Infogami.
The thing table defines types such as editions, works, authors, users, and languages. It also tracks instances of things by their identifiers, registering their IDs in the table.
Entries in a sample thing table
| id | key | type | latest_revision | created | last_modified |
|---|---|---|---|---|---|
| 2 | /type/key | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
| 3 | /type/string | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
| 4 | /type/text | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
| 5 | /type/int | 1 | 1 | 2013-03-20 10:27:01.322813 | 2013-03-20 10:27:01.322813 |
The data table maps each thing, identified by its thing_id and revision, to its associated JSON data.
Entry in a sample data table
| thing_id | revision | data |
|---|---|---|
| 1 | 1 | {"created": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "last_modified": {"type": "/type/datetime", "value": "2013-03-20T10:27:01.223351"}, "latest_revision": 1, "key": "/type/type", "type": {"key": "/type/type"}, "id": 1, "revision": 1} |
Read more about Infogami and types at: https://openlibrary.org/dev/docs/infogami
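
To make the relationship between the two sample tables above more concrete, here is a minimal sketch that models them in an in-memory SQLite database and resolves a key to its latest JSON document. It is an illustration only, not how Infobase is actually implemented: the schema keeps just a few of the columns shown in the samples, the `get_doc` helper is hypothetical, and the key `/works/OL1W` and its titles are made-up test data.

```python
import json
import sqlite3

# Simplified stand-ins for the thing and data tables (assumed schema,
# reduced to the columns needed for this illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE thing (
        id INTEGER PRIMARY KEY,
        "key" TEXT UNIQUE,
        type INTEGER,
        latest_revision INTEGER
    );
    CREATE TABLE data (
        thing_id INTEGER REFERENCES thing(id),
        revision INTEGER,
        data TEXT,
        PRIMARY KEY (thing_id, revision)
    );
""")

# One made-up record with two revisions of its JSON document.
conn.execute(
    'INSERT INTO thing (id, "key", type, latest_revision) VALUES (?, ?, ?, ?)',
    (10, "/works/OL1W", 1, 2),
)
conn.executemany(
    "INSERT INTO data (thing_id, revision, data) VALUES (?, ?, ?)",
    [
        (10, 1, json.dumps({"key": "/works/OL1W", "title": "Draft title", "revision": 1})),
        (10, 2, json.dumps({"key": "/works/OL1W", "title": "Final title", "revision": 2})),
    ],
)


def get_doc(conn, key):
    """Hypothetical helper: return the latest JSON document for a key, or None."""
    row = conn.execute(
        """
        SELECT d.data
        FROM thing AS t
        JOIN data AS d
          ON d.thing_id = t.id AND d.revision = t.latest_revision
        WHERE t."key" = ?
        """,
        (key,),
    ).fetchone()
    return json.loads(row[0]) if row else None


latest = get_doc(conn, "/works/OL1W")
missing = get_doc(conn, "/works/DOES_NOT_EXIST")
```

The join on `latest_revision` is what selects the current document; older revisions stay in the data table and could be fetched by passing an explicit revision number instead.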
Open Library has a number of additional tables that are used to support a variety of features. The DDL for these tables can be found in schema.sql.


The bookshelves and bookshelves_books tables store the books on patrons' "Want to Read", "Currently Reading", and "Already Read" reading log shelves. The bookshelves_books table contains most of this data, with bookshelves serving as a lookup table for shelf names.
bookshelves.py provides functions which interact with the reading log tables.
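
As a rough illustration of the lookup-table relationship described above, the sketch below joins the two tables in an in-memory SQLite database to count one patron's books per shelf. The column names (`username`, `work_id`, `bookshelf_id`), the `shelf_counts` helper, and the sample rows are assumptions made for this example; consult schema.sql and bookshelves.py for the real definitions.

```python
import sqlite3

# Assumed, simplified versions of the reading log tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE bookshelves (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE bookshelves_books (
        username TEXT,
        work_id INTEGER,
        bookshelf_id INTEGER REFERENCES bookshelves(id),
        PRIMARY KEY (username, work_id)
    );
    INSERT INTO bookshelves (id, name) VALUES
        (1, 'Want to Read'), (2, 'Currently Reading'), (3, 'Already Read');
    INSERT INTO bookshelves_books (username, work_id, bookshelf_id) VALUES
        ('alice', 100, 1), ('alice', 101, 3), ('bob', 100, 3);
""")


def shelf_counts(conn, username):
    """Hypothetical helper: map each shelf name to the patron's book count."""
    rows = conn.execute(
        """
        SELECT s.name, COUNT(b.work_id)
        FROM bookshelves AS s
        LEFT JOIN bookshelves_books AS b
          ON b.bookshelf_id = s.id AND b.username = ?
        GROUP BY s.id, s.name
        ORDER BY s.id
        """,
        (username,),
    ).fetchall()
    return dict(rows)


alice_counts = shelf_counts(conn, "alice")
```

The `LEFT JOIN` keeps shelves with no books in the result, so an empty shelf is reported with a count of zero rather than being omitted.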

The yearly_reading_goals table stores the target number of books a patron commits to reading in a given year. Functions that interact with this table are in yearly_reading_goals.py.

A patron can track the last date they finished any book on their "Already Read" shelf. The bookshelves_events table stores these dates and may later be used to track additional dates (such as when they started reading, or start and finish dates of re-reads).
Related code can be found in bookshelves_events.py.

Patrons can provide structured reviews by attaching pre-defined tags to a work. These are stored in the observations table.
The code that interacts with this table, along with the definitions for the tags, is found in observations.py.

A patron can add private notes to any work. The booknotes table stores these notes. booknotes.py contains the code that interacts with this table.

Patrons can submit star ratings for works. The ratings table stores these ratings. See ratings.py for related code.
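
For a sense of how such a table might be used, the sketch below stores one rating per patron per work in an in-memory SQLite database and summarizes the ratings for a work. Everything here is assumed for illustration: the column names, the 1 to 5 star range, and the `set_rating` and `rating_summary` helpers are hypothetical and are not taken from ratings.py.

```python
import sqlite3

# Assumed, simplified ratings table: one row per (patron, work).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE ratings (
        username TEXT,
        work_id INTEGER,
        rating INTEGER CHECK (rating BETWEEN 1 AND 5),
        PRIMARY KEY (username, work_id)
    );
""")


def set_rating(conn, username, work_id, rating):
    """Hypothetical helper: add a rating, replacing the patron's earlier one."""
    conn.execute(
        "INSERT OR REPLACE INTO ratings (username, work_id, rating) VALUES (?, ?, ?)",
        (username, work_id, rating),
    )


def rating_summary(conn, work_id):
    """Hypothetical helper: average rating and number of ratings for a work."""
    average, count = conn.execute(
        "SELECT AVG(rating), COUNT(*) FROM ratings WHERE work_id = ?",
        (work_id,),
    ).fetchone()
    return {"average": average, "count": count}


set_rating(conn, "alice", 100, 3)
set_rating(conn, "alice", 100, 5)  # replaces alice's earlier rating
set_rating(conn, "bob", 100, 4)

summary = rating_summary(conn, 100)
unrated = rating_summary(conn, 999)
```

Keying the table on the patron and work together means a patron changing their mind overwrites the old row instead of adding a second vote.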

This table holds librarian requests, which populate the librarian request table at https://openlibrary.org/merges. Code that interacts directly with this table is in edits.py.