Guide: Replication

8. Replication

Here we are: this is the feature that probably induced you to want to use TouchDB. You have progressed well through the info-dump, Grasshopper, and are ready to learn the secret techniques of document syncing.

Replication is conceptually simple — "Take everything that's changed in database A and copy it over to database B", but it comes with a sometimes-confusing variety of options:

Push vs. pull. This is really just a matter of whether A or B is the remote database: in a push the local database is A and the remote is B, and in a pull it's the other way around.
Continuous vs. one-shot. A one-shot replication proceeds until all the current changes have been copied, then finishes. A continuous replication keeps the connection open, idling in the background and watching for more changes; as soon as any happen, it copies them. (TouchDB's replicator is aware of connectivity changes, so if the device goes offline the replicator will watch for the server to become reachable again, and then reconnect.)
Persistent vs. non-persistent. Non-persistent replications, even continuous ones, are forgotten after the app quits. Persistent replications are remembered in a special _replicator database. This is most useful for continuous replications: by making them persistent, you ensure they will always be ready and watching for changes, every time your app launches.
Filters. Sometimes you only want particular documents to be replicated, or you want particular documents to be ignored. To do this you can define a filter function. The function simply takes a document's contents and returns true if it should be replicated.

Creating A Replication

Replications are represented in CouchCocoa by objects, of class CouchReplication (for non-persistent ones) or CouchPersistentReplication. We'll focus on the persistent kind first, as they're more commonly used. You get them by asking the local CouchDatabase, calling replicationFromDatabaseAtURL: or replicationToDatabaseAtURL:. Or you can get a bulk discount by calling replicateWithURL: to set up a bidirectional replication:

NSArray* repls = [self.database replicateWithURL: newRemoteURL exclusively: YES];
self.pull = [repls objectAtIndex: 0];
self.push = [repls objectAtIndex: 1];

The exclusively: YES option will seek out and remove any pre-existing replications with other remote URLs. This is useful if you only sync with one server at a time and just want to change the address of that server.

It's not strictly necessary to keep references to the replication objects, but you'll need them if you want to monitor their progress.

Monitoring Replication Progress

A replication object has several properties you can observe to track its progress. The most useful are:

.completed — the number of documents copied so far in the current batch
.total — the total number of documents to be copied
.error — will be set to an NSError if the replication fails
.mode — an enumeration that tells you whether the replication is stopped, offline, idle or active. (Offline means the server is unreachable over the network. Idle means the replication is continuous but there is currently nothing left to be copied.)

Generally you can get away with just observing .completed:

[self.pull addObserver: self forKeyPath: @"completed" options: 0 context: NULL];
[self.push addObserver: self forKeyPath: @"completed" options: 0 context: NULL];

Your observation method might look like this:

- (void) observeValueForKeyPath:(NSString *)keyPath ofObject:(id)object 
                         change:(NSDictionary *)change context:(void *)context
{
    if (object == pull || object == push) {
        unsigned completed = pull.completed + push.completed;
        unsigned total = pull.total + push.total;
        if (total > 0 && completed < total) {
            [self showProgressView];
            [progressView setProgress: (completed / (float)total)];
        } else {
            [self hideProgressView];
        }
    }
}

Here progressView is a UIProgressView that shows a bar-graph of the current progress. The progress view is only shown while replication is active, i.e. when total is nonzero and completed has not yet caught up with it.

Don't expect the progress indicator to be completely accurate! It may jump around because the .total property changes as the replicator figures out how many documents need to be copied. And it may not advance smoothly, because some documents may take much longer to transfer than others if they have large attachments. But in practice it seems accurate enough to give the user an idea of what's going on.

One-Shot Replications

In some cases you don't want a persistent replication; maybe you just want to pull (or push) once. Or maybe you want to control exactly when replication happens instead of letting TouchDB schedule it. In that case you can create non-persistent replications:

CouchReplication* pull = [self.database pullFromDatabaseAtURL: remoteURL];

or

CouchReplication* push = [self.database pushToDatabaseAtURL: remoteURL];

Note that these create a different class of object, CouchReplication as opposed to CouchPersistentReplication. By historical accident these classes are unrelated (one isn't a superclass of the other), but they have almost the same API.

Filtered Replications

It's pretty common to want to replicate only a subset of documents, especially when pulling from a huge cloud database down to a limited mobile device. For this purpose, TouchDB (like CouchDB) supports user-defined filter functions in replications. A filter function is registered with a name; it takes a document's contents as a parameter and simply returns true or false to indicate whether it should be replicated.

Filtered Pull

Filter functions are run on the source database. In a pull, that would be the remote CouchDB server, so that server must have the appropriate filter function. CouchDB filters are written in JavaScript (usually) and stored in design documents. The API documentation on the CouchDB wiki has details. This does mean that if you don't have admin access to the server, you are restricted to the set of already existing filter functions.

To use an existing remote filter function in a pull replication, set the replication's filter property to the filter's full name, which is the design document name, a slash, and then the filter name:

pull.filter = @"grocery/sharedItems";
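
For reference, here is a minimal sketch of what such a server-side filter might look like. CouchDB calls a filter function with the document and the request object, and stores the function's source as a string under the filters key of a design document. The function body (passing only documents whose "shared" property is true) and the way the design document is assembled are illustrative assumptions, not the contents of any real server's design document:

```javascript
// Hypothetical filter: pass only documents explicitly marked as shared.
// CouchDB invokes a filter as filter(doc, req) and replicates the document
// only when the function returns true.
function sharedItems(doc, req) {
  return doc.shared === true;
}

// Sketch of a design document carrying the filter. Function source is stored
// as a string, so the function is serialized with toString(). The names
// "grocery" and "sharedItems" match the full filter name "grocery/sharedItems".
var designDoc = {
  _id: "_design/grocery",
  filters: {
    sharedItems: sharedItems.toString()
  }
};

// Print the JSON that would be uploaded to the server.
console.log(JSON.stringify(designDoc, null, 2));
```

Uploading a design document like this requires write access to the server's design documents, which is why you are limited to existing filters if you don't have admin access.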

Filtered Push

During a push, on the other hand, the filter function runs locally in TouchDB. As with map/reduce functions, the filter function is nominally associated with a design document, but has to be specified at runtime as a native block pointer. Here's an example of defining a filter function that passes only documents with a "shared" property with a value of true:

CouchDesignDocument* design = [database designDocumentWithName: @"grocery"];
[design defineFilterNamed: @"sharedItems"
                    block: ^(NSDictionary* doc) {
                        // NSNumber's accessor is boolValue; a missing "shared" key yields NO.
                        return [[doc objectForKey: @"shared"] boolValue];
                    }];

This function can then be plugged into a push replication using its full name:

push.filter = @"grocery/sharedItems";

Parameterized Filters

Filter functions can be made more general-purpose by taking parameters. For example, a filter could pass documents whose "owner" property has a particular value, allowing the user name to be specified by the replication. That way there doesn't have to be a separate filter for every user.

To specify parameters in CouchCocoa, set the filterParams property of the replication object. Its value is a dictionary that maps parameter names to values. The dictionary must be JSON-compatible, so the values can be any type allowed by JSON.

CouchDB-based filter functions access the parameters through the HTTP request object (the query property of the req argument to the function).
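
As a rough illustration, a CouchDB-side filter for the "owner" example above might read its parameter like this. The function name byOwner and the parameter name owner are made up for this sketch; the only parts taken from CouchDB itself are that the filter receives (doc, req) and that the parameters appear in req.query. Query parameters generally arrive as strings, so the comparison here is against a string:

```javascript
// Hypothetical parameterized filter: pass only documents whose "owner"
// property equals the "owner" parameter supplied by the replication.
function byOwner(doc, req) {
  var owner = req && req.query ? req.query.owner : undefined;
  if (owner === undefined || owner === null) {
    // No parameter given: replicate nothing rather than everything.
    return false;
  }
  return doc.owner === String(owner);
}

// Simulated calls, standing in for what the server does for each document.
var req = { query: { owner: "frank" } };
console.log(byOwner({ _id: "milk", owner: "frank" }, req)); // true
console.log(byOwner({ _id: "eggs", owner: "alice" }, req)); // false
```

With a filter like this on the server, the replication's filterParams dictionary would presumably carry a matching "owner" entry. Rejecting everything when the parameter is missing is a design choice for the sketch; a real filter might prefer a different default.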

Authentication

It's likely that the remote database TouchDB replicates with will require authentication, particularly for a push, since the server is unlikely to accept anonymous writes! In this case the replicator will need to log into the remote server on your behalf.

SECURITY TIP: CouchDB only supports HTTP "Basic" auth, not the more secure Digest auth. Since Basic auth sends the password in easily-readable form, it is only safe to use it over an HTTPS (SSL) connection, or over an isolated network you're confident has full security. So before configuring authentication, please make sure the remote database URL has the https: scheme.

You'll need to register login credentials for the replicator to use. There are several ways to do this, and most of them use the standard credential mechanism provided by the Foundation framework.

Hardcoded Username/Password

The simplest but least-secure way to store credentials is to use the standard syntax for embedding them in the URL of the remote database, for example

https://frank:s33kr1t@example.com/database/

This URL specifies a username frank and password s33kr1t. If you use this as the remote URL when creating a replication, TouchDB will know to use the included credentials. The drawback of course is that the password is easily readable by anything with access to your app's files. (This is less of a big deal on a sandboxed platform like iOS than it is on Android or Mac OS.)

Using The Credential Store

The better way to store credentials is in the NSURLCredentialStorage, which is a Cocoa system API that can store credentials either in memory or in the secure (encrypted) Keychain. They will then get used automatically when there's a connection to the matching server.

Here’s an example of how to register a credential for a database on Cloudant.com. First create a credential object containing the username and password, as well as an indication of the persistence with which they should be stored:

NSURLCredential* cred;
cred = [NSURLCredential credentialWithUser: @"frank"
                                  password: @"s33kr1t"
                               persistence: NSURLCredentialPersistencePermanent];

Then create a "protection space", which defines the URLs that the credential applies to:

NSURLProtectionSpace* space;
space = [[[NSURLProtectionSpace alloc] initWithHost: @"example.cloudant.com"
                                               port: 443
                                           protocol: @"https"
                                              realm: @"Cloudant Private Database"
                               authenticationMethod: NSURLAuthenticationMethodDefault]
         autorelease];

Finally, register the credential for the protection space:

[[NSURLCredentialStorage sharedCredentialStorage] setDefaultCredential: cred
                                                    forProtectionSpace: space];

This is best done right after the user has entered her name/password in your config UI. Since this example specified permanent persistence, the credential store will write the password securely to the Keychain. From then on, NSURLConnection (Cocoa's underlying HTTP client engine) will find it when it needs to authenticate with that same server.

The Keychain is a secure place to store secrets: it's encrypted with a key derived from the user's iOS passcode, and managed only by a single trusted OS process. If you don't want the password stored to disk, use NSURLCredentialPersistenceForSession instead. You'll then need to call the above code on every launch, which raises the question of where you get the password from; the alternatives are generally less secure than the Keychain.

NOTE: The OS is pretty picky about the parameters of the protection space. If they don't match exactly (including the port and the realm string), the credentials won't be used and the sync will fail with a 401 error. This is annoying to troubleshoot. In case of mysterious auth failures, double-check the spelling and port numbers in both the credential and the protection space!

What's My Realm?

If you need to figure out the actual realm string for the server, you can use curl or an equivalent HTTP client tool to examine the "WWW-Authenticate" response header for an auth failure:

$ curl -i -X POST http://example.cloudant.com/dbname/
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="Cloudant Private Database"

$ curl -i -X PUT https://example.iriscouch.com/dbname
HTTP/1.1 401 Unauthorized
WWW-Authenticate: Basic realm="administrator"

OAuth

OAuth is a complex and confusing protocol that, among other things, allows a user to use an identity from one site (such as Google or Facebook) to authenticate to another site (such as a CouchDB server) without having to trust the relying site with the user's password.

CouchDB supports OAuth version 1 (but not yet the newer OAuth 2) for client authentication, so if this has been configured in your upstream database, you can replicate with it by providing OAuth tokens:

replication.OAuth = @{ @"consumer_secret": consumerSecret,
                       @"consumer_key": consumerKey,
                       @"token_secret": tokenSecret,
                       @"token": token };

Getting these four values is somewhat tricky: it involves authenticating with the origin server (the site at which the user has an account/identity). Usually you'll use an OAuth client library, such as Google's or Facebook's, to do the hard work.

OAuth tokens expire after some time, so if you install them into a persistent replication, you'll still need to call the client library periodically to validate them, and if they're updated you'll need to update them in the replication settings.
