Robbie Hanson edited this page Nov 10, 2013 · 12 revisions

There's an extension for that.

The full text search extension for YapDatabase is built atop the FTS module for SQLite, which was contributed by Google. As you might imagine, it's pretty fast, especially when you compare it to Core Data:

I wrote a simple sample app in order to see what kind of performance gain I might see from using a separate full text index in SQLite. The app loads 1682 text files from textfiles.com (where else?), that’s about 42mb of plain text, into both Core Data model objects and an SQLite database. I then timed how long it took to find a single word using both a full text query and a core data fetch request. For one query running on the main thread on slowest device I have (a 4th gen. iPod touch) the Core Data fetch took 9.34 seconds while the SQLite query only took 1.48 seconds. (link)

With the full text search extension, you can have search up and running in minutes.

The primary header files can be found here (for YapDatabase):

And here (for YapCollectionsDatabase):

Intro to extensions

Extensions work seamlessly behind the scenes. You don't have to change the way you write objects to the database. Extensions automatically take part in the transaction architecture, and automatically update themselves as you make changes to the database.

Basically, extensions plug into the database system and are notified whenever you add, remove or update objects. The extensions then automatically update themselves (within the same atomic transaction). Further, you can add or remove extensions at any time, even while using the database.

The end result is that your extensions can come and go with ease, and you can change them around as you're developing. But the objects in your database stay the same. And, just as important, the code you have that inserts, deletes & updates objects in the database can stay the same.

For a deeper understanding of the extensions architecture, see the Extensions article.

Creating the FTS extension

The first step is deciding what you want to search. That is, deciding what you want the FTS module to index. For example, if you're storing tweets you may want to index the tweet content and the author name. If you're storing emails then you may want to index the sender, subject and content.

So you can index multiple properties if you want. Or just one. Whatever you require.

// Maybe this...
NSArray *propertiesToIndexForMySearch = @[ @"author", @"tweet" ];

// Or maybe this...
NSArray *propertiesToIndexForMySearch = @[ @"author", @"subject", @"content" ];

The next step is to create a block which will extract the properties you want to index from an object in the database. It will look something like this:

YapDatabaseFullTextSearchBlockType blockType = YapDatabaseFullTextSearchBlockTypeWithObject;
YapDatabaseFullTextSearchWithObjectBlock block = ^(NSMutableDictionary *dict, NSString *key, id object){

    if ([object isKindOfClass:[Tweet class]])
    {
        __unsafe_unretained Tweet *tweet = (Tweet *)object;

        [dict setObject:tweet.author forKey:@"author"];
        [dict setObject:tweet.tweet forKey:@"tweet"];
    }
    else
    {
        // Don't need to index this item.
        // So we simply don't add anything to the dict.
    }
};

The first parameter to the block is a mutable dictionary. This is where you'll add the proper fields you want to index. The other parameters correspond to a row in the database. So all you have to do is inspect the row, decide if you want to index anything, and if so then extract the proper information and add it to the dictionary.
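For the email case mentioned earlier, the same idea might look roughly like the sketch below. The `Email` class, its property names, the `EmailIndexingBlock` helper and the import path are all assumptions made for illustration. The dictionary keys are chosen to match the column names you'd pass when creating the extension.

```objc
#import <Foundation/Foundation.h>
#import "YapDatabaseFullTextSearch.h" // assumed import path; adjust for your project

// Hypothetical model class, defined here only so the sketch is complete.
@interface Email : NSObject
@property (nonatomic, copy) NSString *author;
@property (nonatomic, copy) NSString *subject;
@property (nonatomic, copy) NSString *content;
@end

@implementation Email
@end

// Hypothetical helper that returns an indexing block for Email objects.
static YapDatabaseFullTextSearchWithObjectBlock EmailIndexingBlock(void)
{
    return ^(NSMutableDictionary *dict, NSString *key, id object) {

        if (![object isKindOfClass:[Email class]])
        {
            // Not an email, so leave the dict empty and nothing gets indexed.
            return;
        }

        Email *email = (Email *)object;

        // -setObject:forKey: raises an exception for a nil object,
        // so only add the fields that are actually present.
        if (email.author)  [dict setObject:email.author  forKey:@"author"];
        if (email.subject) [dict setObject:email.subject forKey:@"subject"];
        if (email.content) [dict setObject:email.content forKey:@"content"];
    };
}
```

The nil checks are a defensive choice on my part. If your model guarantees non-nil properties you can drop them.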

Next we create the extension by feeding it all the configuration from above.

YapDatabaseFullTextSearch *fts =
    [[YapDatabaseFullTextSearch alloc] initWithColumnNames:propertiesToIndexForMySearch
                                                     block:block
                                                 blockType:blockType];

And lastly we need to plug the extension into the database system.

[database registerExtension:fts withName:@"fts"];
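If you'd like to know whether registration worked, you could check the result. This sketch assumes `registerExtension:withName:` returns a BOOL, which is how I remember the header, so please verify it against the version you're using. `RegisterFullTextSearch` is a hypothetical helper name.

```objc
#import <Foundation/Foundation.h>
#import "YapDatabase.h"               // assumed import paths
#import "YapDatabaseFullTextSearch.h"

// Hypothetical helper: registers the extension and reports failure.
static BOOL RegisterFullTextSearch(YapDatabase *database, YapDatabaseFullTextSearch *fts)
{
    // Assumption: the method returns YES on success and NO on failure.
    BOOL registered = [database registerExtension:fts withName:@"fts"];
    if (!registered)
    {
        NSLog(@"Unable to register the full text search extension");
    }
    return registered;
}
```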

Understanding the internals

You may be wondering how the block is used, and how the extension automatically updates itself. It's quite simple really.

Let's say you have an existing database with 1,000 tweets. You add the code to create and register the FTS extension during app launch, as part of the database setup code. The very first time this code is run, and you register the FTS extension, the extension will enumerate over the 1,000 items in your database, invoking the block for each one. Thus it creates its initial index.

If you then quit your app and relaunch, the extension knows it's up-to-date and doesn't have to do anything. That is, its index is current. If you then execute a read-write transaction, and add a new tweet, then the FTS extension automatically invokes the block for this new row. Similarly, if you delete a tweet, the FTS extension knows how to remove that row from its index (if needed).

So basically a few lines of code are all you need to create the index for your search. And the extension automatically keeps it up-to-date, without any other changes to your app.
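To make that concrete, here's a rough sketch of ordinary write code. Nothing in it mentions the FTS extension, yet the index is updated inside the same transaction. It assumes the key/value API (`setObject:forKey:` and `removeObjectForKey:`). If your version of the database uses collections, the method names will differ. `UpdateTweets` and its parameters are hypothetical.

```objc
#import <Foundation/Foundation.h>
#import "YapDatabase.h" // assumed import path

// Hypothetical helper: adds one tweet and deletes another in a single
// atomic read-write transaction.
static void UpdateTweets(YapDatabaseConnection *connection,
                         id newTweet, NSString *newKey, NSString *staleKey)
{
    [connection readWriteWithBlock:^(YapDatabaseReadWriteTransaction *transaction) {

        // The FTS extension invokes your indexing block for this row.
        [transaction setObject:newTweet forKey:newKey];

        // The FTS extension drops this row from its index (if it was indexed).
        [transaction removeObjectForKey:staleKey];
    }];
}
```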

Performing searches

Once you've created your extension, and registered it, you're ready to start searching. Here are the basics:

[connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {

    // Find matches for: board meeting

    [[transaction ext:@"fts"] enumerateKeysMatching:@"board meeting"
                                         usingBlock:^(NSString *key, BOOL *stop) {
        // ...
    }];
}];

You'll notice you access the extension within a transaction. This is in keeping with the rest of the database architecture. It also means you get "atomic searches" and other benefits of transactions, such as LongLivedReadTransactions (to simplify database access on the main thread).

In terms of search options, the extension supports everything that the SQLite FTS extension supports. Which is a LOT... token queries, token prefixes, phrases, NEAR, AND, OR, NOT ...

Rather than try to document all the possibilities here, I'll point you to the excellent examples and documentation here: SQLite FTS3 and FTS4 Extensions

Here are a few really simple examples:

[connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {

    // tweet.author matches: john
    // and row matches: board meeting

    [[transaction ext:@"fts"] enumerateKeysMatching:@"author:john board meeting"
                                         usingBlock:^(NSString *key, BOOL *stop) {
        // ...
    }];

    // tweet.author matches: john
    // tweet.tweet contains phrase: "board meeting"

    [[transaction ext:@"fts"] enumerateKeysMatching:@"author:john tweet:\"board meeting\""
                                         usingBlock:^(NSString *key, BOOL *stop) {
        // ...
    }];

    // find any tweets with the words "meeting" or "conference"

    [[transaction ext:@"fts"] enumerateKeysMatching:@"meeting OR conference"
                                         usingBlock:^(NSString *key, BOOL *stop) {
        // ...
    }];
}];
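Token prefixes and NEAR follow the same pattern. The sketch below uses the query syntax described in the SQLite FTS documentation and the "tweet" column from the earlier examples. `RunMoreSearches` is a hypothetical helper, and the import paths are assumptions. Operators such as explicit AND and NOT depend on how your SQLite was compiled, so I've left them out.

```objc
#import <Foundation/Foundation.h>
#import "YapDatabase.h"               // assumed import paths
#import "YapDatabaseFullTextSearch.h"

// Hypothetical helper showing a few more query styles.
static void RunMoreSearches(YapDatabaseConnection *connection)
{
    [connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {

        // Token prefix: matches any token that starts with "meet"
        // (meet, meeting, meetings, ...).
        [[transaction ext:@"fts"] enumerateKeysMatching:@"meet*"
                                             usingBlock:^(NSString *key, BOOL *stop) {
            NSLog(@"prefix match: %@", key);
        }];

        // NEAR: both tokens present, with at most 2 tokens between them.
        [[transaction ext:@"fts"] enumerateKeysMatching:@"board NEAR/2 meeting"
                                             usingBlock:^(NSString *key, BOOL *stop) {
            NSLog(@"near match: %@", key);
        }];

        // Column filter combined with a prefix, limited to the tweet column.
        [[transaction ext:@"fts"] enumerateKeysMatching:@"tweet:conf*"
                                             usingBlock:^(NSString *key, BOOL *stop) {
            NSLog(@"column prefix match: %@", key);
        }];
    }];
}
```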

When searching, you can choose to enumerate matches however you want. You can enumerate just the keys, or also the object, or the metadata, or the whole row...

From YapDatabaseFullTextSearchTransaction.h:

// Regular query matching

- (void)enumerateKeysMatching:(NSString *)query
                   usingBlock:(void (^)(NSString *key, BOOL *stop))block;

- (void)enumerateKeysAndMetadataMatching:(NSString *)query
                              usingBlock:(void (^)(NSString *key, id metadata, BOOL *stop))block;

- (void)enumerateKeysAndObjectsMatching:(NSString *)query
                             usingBlock:(void (^)(NSString *key, id object, BOOL *stop))block;

- (void)enumerateRowsMatching:(NSString *)query
                   usingBlock:(void (^)(NSString *key, id object, id metadata, BOOL *stop))block;

Also, snippets are supported! So if you're searching large documents, you can use the snippets feature to get fragments of surrounding text that match the search query. Very helpful to display to the user so they can easily identify the context of each search result, and quickly find exactly what they're looking for.

// Query matching + Snippets

- (void)enumerateKeysMatching:(NSString *)query
           withSnippetOptions:(YapDatabaseFullTextSearchSnippetOptions *)options
                   usingBlock:(void (^)(NSString *snippet, NSString *key, BOOL *stop))block;

- (void)enumerateKeysAndMetadataMatching:(NSString *)query
                      withSnippetOptions:(YapDatabaseFullTextSearchSnippetOptions *)options
                              usingBlock:(void (^)(NSString *snippet, NSString *key, id metadata, BOOL *stop))block;

- (void)enumerateKeysAndObjectsMatching:(NSString *)query
                     withSnippetOptions:(YapDatabaseFullTextSearchSnippetOptions *)options
                             usingBlock:(void (^)(NSString *snippet, NSString *key, id object, BOOL *stop))block;

- (void)enumerateRowsMatching:(NSString *)query
           withSnippetOptions:(YapDatabaseFullTextSearchSnippetOptions *)options
                   usingBlock:(void (^)(NSString *snippet, NSString *key, id object, id metadata, BOOL *stop))block;
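A snippet search might look something like the sketch below. The option property names (`startMatchText`, `endMatchText`, `ellipsesText`, `columnName`, `numberOfTokens`) are how I remember YapDatabaseFullTextSearchSnippetOptions.h, so treat them as assumptions and check the header. `SearchWithSnippets` is a hypothetical helper, and the "tweet" column comes from the earlier examples.

```objc
#import <Foundation/Foundation.h>
#import "YapDatabase.h"               // assumed import paths
#import "YapDatabaseFullTextSearch.h"

// Hypothetical helper: runs a query and logs a snippet for each match.
static void SearchWithSnippets(YapDatabaseConnection *connection)
{
    YapDatabaseFullTextSearchSnippetOptions *options =
        [[YapDatabaseFullTextSearchSnippetOptions alloc] init];

    // Assumed property names; verify against the header for your version.
    options.startMatchText = @"[";      // inserted before each matched term
    options.endMatchText   = @"]";      // inserted after each matched term
    options.ellipsesText   = @"...";    // marks text cut from the snippet
    options.columnName     = @"tweet";  // take the snippet from this column
    options.numberOfTokens = 10;        // approximate snippet length in tokens

    [connection readWithBlock:^(YapDatabaseReadTransaction *transaction) {

        [[transaction ext:@"fts"] enumerateKeysMatching:@"board meeting"
                                     withSnippetOptions:options
                                             usingBlock:^(NSString *snippet, NSString *key, BOOL *stop) {
            NSLog(@"%@ -> %@", key, snippet);
        }];
    }];
}
```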
