Commit d1afeca

Added section of Snapshot Isolation
1 parent d77cd38 commit d1afeca

1 file changed

+19
-5
lines changed

README.md

Lines changed: 19 additions & 5 deletions
@@ -11,10 +11,11 @@ If you are reading this and taking the effort to understand these papers, we wou
1111
3. [Classic System Design](#system-design)
1212
4. [Columnar Databases](#column)
1313
5. [Data-Parallel Computation](#data-parallel)
14-
6. [Consensus and Consistency](#consensus)
15-
7. [Trends (Cloud Computing, Warehouse-scale Computing, New Hardware)](#trends)
16-
8. [Miscellaneous](#misc)
17-
9. [External Reading Lists](#external)
14+
6. [Snapshot Isolation](#si)
15+
7. [Consensus and Consistency](#consensus)
16+
8. [Trends (Cloud Computing, Warehouse-scale Computing, New Hardware)](#trends)
17+
9. [Miscellaneous](#misc)
18+
10. [External Reading Lists](#external)
1819

1920

2021
## <a name='basic-and-algo'> Basics and Algorithms
@@ -53,7 +54,6 @@ If you are reading this and taking the effort to understand these papers, we wou
5354
* [Chord: A Scalable Peer-to-peer Lookup Service for Internet Applications](http://www.cs.berkeley.edu/~rxin/db-papers/Chord-DHT.pdf) (2001) and [Dynamo: Amazon’s Highly Available Key-value Store](http://www.cs.berkeley.edu/~rxin/db-papers/Dynamo.pdf) (2007): Chord was born in the days when distributed hash tables were a hot research topic. It does one thing, and does it really well: looking up the location of a key in a completely distributed (peer-to-peer) setting using consistent hashing. The Dynamo paper explains how to build a distributed key-value store using Chord. Note that some design decisions change from Chord to Dynamo, e.g. finger table O(logN) vs O(N), because in Dynamo's case Amazon has more control over the nodes in a data center, while Chord assumes peer-to-peer nodes in wide area networks.
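
The consistent-hashing lookup mentioned above can be illustrated with a minimal sketch. It assumes every caller knows the full node list (closer to the O(N) routing state attributed to Dynamo than to Chord's O(logN) finger tables), and the names `ring_hash`, `HashRing`, and the node labels are hypothetical, not taken from either paper.

```python
import bisect
import hashlib


def ring_hash(value: str, bits: int = 32) -> int:
    """Map a string to a position on a ring of size 2**bits.

    SHA-1 truncated to `bits` bits is an illustrative choice.
    """
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class HashRing:
    """Toy consistent-hashing ring with global knowledge of all nodes."""

    def __init__(self, nodes):
        # Sort nodes by their position on the ring.
        self._points = sorted((ring_hash(n), n) for n in nodes)
        self._keys = [position for position, _ in self._points]

    def lookup(self, key: str) -> str:
        """Return the key's successor: the first node at or after its position."""
        idx = bisect.bisect_left(self._keys, ring_hash(key))
        # Wrap around to the first node when the key hashes past the last one.
        return self._points[idx % len(self._points)][1]


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.lookup("user:42")
```

The property worth noticing is that removing a node only reassigns the keys that node owned; every other key keeps its owner.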
5455

5556

56-
5757
## <a name='column'> Columnar Databases
5858

5959
Columnar storage and column-oriented query engines are critical to analytical workloads, e.g. OLAP. It's been almost 15 years since the idea first came out (the MonetDB paper in 1999), and almost every commercial warehouse database has a columnar engine by now.
@@ -76,6 +76,20 @@ Columnar storage and column-oriented query engine are critical to analytical wor
7676
* [Spanner](http://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf) (2012): Spanner is "a scalable, multi-version, globally distributed, and synchronously replicated database". The linchpin that allows all this functionality is the TrueTime API which lets Spanner order events between nodes without having them communicate. [There is some speculation that the TrueTime API is very similar to a vector clock but each node has to store less data](http://www.cse.buffalo.edu/~demirbas/publications/augmentedTime.pdf). Sadly, a paper on TrueTime is promised, but hasn't yet been released.
7777

7878

79+
## <a name='si'> Snapshot Isolation
80+
81+
* [A Critique of ANSI SQL Isolation Levels](http://research.microsoft.com/pubs/69541/tr-95-51.pdf) (1995): Defines isolation levels in terms of phenomena, and shows that these and the ANSI SQL definitions fail to characterize several popular isolation levels. It also defines an important multiversion isolation type: *Snapshot Isolation (SI)*.
82+
83+
* [A Read-Only Transaction Anomaly Under Snapshot Isolation](http://www.sigmod.org/publications/sigmod-record/0409/2.ROAnomONeil.pdf) (2004): Disproves the assumption that under Snapshot Isolation, read-only transactions always execute serializably provided the concurrent update transactions are serializable.
84+
85+
* [Serializable Isolation for Snapshot Databases (SSI)](https://courses.cs.washington.edu/courses/cse444/08au/544M/READING-LIST/fekete-sigmod2008.pdf) (2008) and [revised 2009 (ESSI)](http://dl.acm.org/citation.cfm?doid=1620585.1620587): Describes a concurrency control algorithm that detects and prevents Snapshot Isolation anomalies at run-time, thus providing serializable isolation. Both papers are included for comparison; the second is more comprehensive, adds protection against additional phenomena, and could be regarded as *Enhanced Serializable Snapshot Isolation (ESSI)*.
86+
87+
* [Precisely Serializable Snapshot Isolation (PSSI)](http://www.cs.umb.edu/~eoneil/PSSI_ICDE11_Numbered.pdf) (2011): Defines an algorithm for precisely detecting Snapshot Isolation anomalies, resulting in fewer false-positive aborts than ESSI. Discusses implementation of the algorithm in MySQL's InnoDB.
88+
89+
* [Serializable Isolation in PostgreSQL](http://drkp.net/papers/ssi-vldb12.pdf) (2012):
90+
Discusses the trade-offs between SSI, ESSI and PSSI, and the approach taken to implement SSI in PostgreSQL.
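
To make the kind of anomaly these papers address concrete, here is a minimal sketch of write skew under Snapshot Isolation. It is a toy in-memory model, not the algorithm from any of the papers above: `SnapshotStore`, `Txn`, and the on-call scenario are hypothetical names chosen for illustration, and the only conflict check is the first-committer-wins rule on overlapping write sets.

```python
class SnapshotStore:
    """Toy multiversion store: each transaction reads the state as of its start."""

    def __init__(self, initial):
        self.data = dict(initial)
        self.commit_seq = 0
        # key -> commit sequence number of the last committed writer
        self.last_write = {key: 0 for key in initial}

    def begin(self):
        return Txn(self)


class Txn:
    def __init__(self, store):
        self.store = store
        self.start_seq = store.commit_seq
        self.snapshot = dict(store.data)  # private snapshot taken at start
        self.writes = {}

    def read(self, key):
        # A transaction sees its own writes, otherwise its snapshot.
        return self.writes.get(key, self.snapshot[key])

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # First-committer-wins: abort if a concurrent transaction already
        # committed a write to any key this transaction also wrote.
        for key in self.writes:
            if self.store.last_write.get(key, 0) > self.start_seq:
                return False
        self.store.commit_seq += 1
        for key, value in self.writes.items():
            self.store.data[key] = value
            self.store.last_write[key] = self.store.commit_seq
        return True


# Assumed invariant for the example: at least one person stays on call.
store = SnapshotStore({"alice_on_call": True, "bob_on_call": True})
t1, t2 = store.begin(), store.begin()

# Each transaction checks the invariant against its own snapshot.
if t1.read("bob_on_call"):
    t1.write("alice_on_call", False)
if t2.read("alice_on_call"):
    t2.write("bob_on_call", False)

# The write sets are disjoint, so both commits pass the conflict check.
t1_ok = t1.commit()
t2_ok = t2.commit()
invariant_holds = store.data["alice_on_call"] or store.data["bob_on_call"]
```

Both transactions commit, yet the final state has nobody on call, which no serial execution of the two could produce. Detecting this requires tracking read-write dependencies between concurrent transactions, not only write-write conflicts.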
91+
92+
7993
## <a name='consensus'> Consensus and Consistency
8094

8195
* [Paxos Made Simple](http://www.cs.berkeley.edu/~rxin/db-papers/Paxos.pdf) (2001): Paxos is a fault-tolerant distributed consensus protocol. It forms the basis of a wide variety of distributed systems. The idea is simple, but notoriously difficult to understand (perhaps due to the way the original Paxos paper was written).
