A GHC-Haskell implementation of Dotted Version Vectors (DVV), a data structure for tracking causality and resolving conflicts in distributed systems updating a single piece of data.
This library is inspired by the canonical Erlang DVVSet example implementation.
In distributed systems, multiple actors can concurrently update a piece of data. Traditional version vectors can track which updates happened after others, but they struggle to represent multiple concurrent "siblings" when conflicts occur.
Dotted Version Vectors solve this by combining:
- History (Context): A summary of all events seen by the system.
- Dots: Discrete events (actor + sequence number) that updated specific values
- Siblings: A set of values that are concurrent (i.e., none of them has "seen" the others).
- Dot: A pair
Actor x Countrepresenting a single write. - History (Version Vector): A mapping from actors to their latest sequence numbers.
Map Actor Count - DVV: A structure containing a history and a set of active siblings.
Map Actor Count x Map Dot Value
DVV provides a mechanism for causal ordering. When an actor writes a value, it creates a "Dot" – a unique identifier for that specific version of the data. Usually the "server", i.e. the thing doing the tracking of state is the "actor".
graph TD
A[DVV State] --> B(Causal History)
A --> C(Active Siblings)
B --> B1[Actor A: 5]
B --> B2[Actor B: 3]
C --> C1["Dot(A, 6): Value 'X'"]
C --> C2["Dot(B, 4): Value 'Y'"]
- Event Creation: When actor
Aupdates a value, it increments its counter in the DVV's history. The new value is stored associated with the new Dot(A, next_counter). - Causality: If version
V2is created knowing about versionV1(e.g.,V2was created usingV1as context),V2supersedesV1. - Synchronization: When two DVVs are merged (
sync), the resulting history is the point-wise maximum of both. Any sibling from one DVV is retained in the merged result only if it hasn't been superseded by the other DVV's history.
You can start with an empty DVV or a singleton, i.e. no causal history, without a value or with the first value (count implicitly = 1):
import Data.DVV
import qualified Data.HashMap.Strict as Map
-- An empty DVV without any causal history
empty :: DVV String Int
empty = EmptyDVV
-- A singleton DVV (initial write with no prior causal history)
initial :: DVV String Int
initial = SingletonDVV "actor1" 42Use event to record new updates. You can optionally provide a VersionVector as context to indicate which versions the new update supersedes.
-- Recording a new event on top of existing state
-- This will increment actor1's counter.
v1 = event initial Nothing "actor1" 43
-- Providing context to prune siblings
-- If 'v1' had multiple siblings, passing its 'context' here would replace them
v2 = event v1 (Just (context v1)) "actor2" 44Merge two DVVs to resolve their histories and collect concurrent siblings.
let d1 = SingletonDVV "A" 10
d2 = SingletonDVV "B" 20
merged = sync d1 d2
-- 'merged' now contains both values as siblings because they are concurrent.
-- values merged == [10, 20]When a sync results in multiple siblings (a conflict), you can reconcile them.
-- Pick the maximum value from siblings, reconcile just takes a deterministic "decider" function over a pair of `concurrent values`.
let resolved = reconcile max "resolver-actor" mergedIf your values have timestamps or you just want a deterministic winner:
-- Assuming values are (Timestamp, ActualValue)
let lwwResolved = lww (\(t1, _) (t2, _) -> compare t1 t2) "actor" conflictDvvDVVs implement a Partial Order to represent the causal relationship between versions. This is exposed via the PartialOrd typeclass (from lattices or Algebra.PartialOrd).
leq(Less than or Equal):A <= Bmeans thatAis a causal ancestor of (or equal to)B. In other words,B"knows" everythingAknows.- Implementation: For every actor in
A's history,B's history must have a counter greater than or equal toA's.
- Implementation: For every actor in
- Strict Inequality:
A < BmeansA <= BANDA /= B.Ahappened strictly beforeB. - Concurrent (Incomparable): If neither
A <= BnorB <= Aholds, thenAandBare concurrent (||). This indicates a conflict that needs to be resolved.
The sync operation computes the Least Upper Bound (LUB) of two DVVs, effectively merging their histories and preserving all concurrent values (siblings) until they are reconciled.
This library uses HashMap internally for efficiency and requires actor IDs to be instances of Hashable.
- Singleton Optimization:
SingletonDVVis a specialized constructor for the common case of a single value with no complex history, reducing memory overhead. - Pruning: Use the
prunefunction to truncate old causal history if the number of actors grows too large, though this should be used with caution as it affects future synchronization accuracy.
This project is licensed under the MIT License - see the LICENSE file for details.