Skip to content
This repository was archived by the owner on Oct 14, 2020. It is now read-only.

Automatic Conflict Resolution

satabin edited this page Mar 22, 2013 · 11 revisions

Introduction

This feature is experimental! Do not hesitate to report any problem you encounter.

When one wants to save a document in a couchdb instance, the revision identifier of the document to save is used to determine whether this is the most recent version. If the revision from the document to save defers from the one stored in the database instance, a conflict happens and the document is not save into the database.

Usually one has to resolve the conflict by hand and retry to save the document, hoping that nobody else changed the document during resolving. Repeat this until saving works.

The conflict resolution process may be tedious and implement it in each client application is quite repetitive.

Sohva introduces a way to manage conflict automatically so that client code does not have to do. The only thing to do is to choose how the conflict are resolved and how many attempts to store are made. By default, Sohva only tries once before throwing a conflict exception, but this is configurable. This configuration can be individual to each database even inside the same client. This allows for quite fine-grained conflict management with different settings for different databases

val couch = new CouchClient
val db = couch.database("my_database", credit = 5, strategy = MyStrategy)

Setting the credit to 0 or less gives infinite try credits to this database.

Several strategies are available by default:

  • BarneyStinsonStrategy – This strategy is based on the simple rule New is always better. If a different revision of the document was saved in the database, Sohva will simply overwrite it with the new document.
  • StructuralMergeStrategy – This strategy implements a simple structural merge of the patches (see the rules below). Basically, it tries to be equivalent to what would have been done if you had modified the revision that was saved before yours.

It is easy to add new strategies by implementing gnieh.sohva.strategy.Strategy

Structural Merge Strategy Rules

In the following table BD stands for the patch from base document to the one in the database, BC stands for the patch from the base document to the one one wants to save, and DC is the computed patch from the document in the database to the one one wants to save. The rules for the structural merge strategy are:

`BD` `BC` `DC`
Not Changed Not Changed Not Changed
Not Changed Changed Changed
Not Changed Deleted Deleted
Changed Not Changed Changed
Changed Changed Changed (`BC`)
Changed Deleted Deleted
Deleted Not Changed Deleted
Deleted Changed Deleted
Deleted Deleted Deleted
Added Not Changed Added
Added Added Added (`BC`)
Not Changed Added Added

The remainder of this page is more technical and explains how the three way merge strategy works. If you just want to use Sohva, you do not need to read it...


Diff Algorithm and Patch Format

The first thing to do, is to define how to compute a diff between two documents. Any generic diff algorithm could make it, but we are in a specific case for several reasons:

  • we know we are dealing with json values
  • we assume that the global structure of the documents won't change drastically between two updates (if this is not the case, your scala code looks probably weird)

Assuming both points mentioned above are true, we can now define how the diff is computed. A Diff object is a json patch object, which basically consists of a list of operations to sequentially apply to the Json value.

This patch is computed as follows:

  • if the values are both json objects, the algorithm recursively computes the diff for each field
    • if fields are equal, then it's trivial, go to the next field
    • if fields have different values, compute the diff of these values
    • if a field was deleted, a remove operation is added to the patch
    • if a field was added, an add operation is added to the patch
  • if the values are both arrays, the difference between them is computed by using a patience based algorithm. Find the longest common subsequence and then add/remove/replace the values that are different
  • if the values are different, add a replace operation to the patch

Resolver Algorithm

At the beginning the conflict resolver is given some retry credit which is the amount of times to retry to save to save before failing.

So, there was a potential conflict to resolve:

  1. Retrieve the base document from the couchdb instance (the one with the revision known by the client)
  2. Retrieve the most recent document from the couchdb instance
  3. Apply the strategy to the three document base, last and current

This process is repeated until either

  • the document could be saved
  • the credit is completely used, in which case a ConflictException is raised

Clone this wiki locally