WriteOnceBucketIndexingBug

Fred Dushin edited this page Aug 6, 2015 · 2 revisions

Bug Description

RIAK-1937 tracks a bug in Riak Search (Yokozuna): objects written to Write Once buckets are not indexed properly. When a Write Once bucket is configured with a search index, puts complete unusually quickly, and the data is not immediately available for query. If AAE is turned on, the data will eventually be indexed in Solr, but only via the AAE mechanism, which is less than optimal and can cause nodes to become unresponsive or crash.

This issue was initially reported in https://github.com/basho/yokozuna/issues/512

Write Once buckets were added in Riak 2.1 to provide a fast way to write data to a Riak back-end, effectively by circumventing the Put FSM and sending write requests directly to the Vnodes in the pref-list for the entry. Unfortunately, circumventing the Put FSM also circumvented indexing, which is a bug.
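The effect of the bypass can be pictured with a small toy model. This is only an illustrative sketch in Python, not Riak code: the class ToyNode and its methods are hypothetical names, and the model assumes nothing about Riak internals beyond what is described above (the normal put path results in indexing, the Write Once fast path did not).

```python
class ToyNode:
    """Toy stand-in for a node: a key/value backend plus a search index."""

    def __init__(self):
        self.store = {}        # (bucket, key) -> value; stands in for the backend
        self.indexed = set()   # (bucket, key) pairs visible to search

    def put_normal(self, bucket, key, value):
        # Normal put path (hypothetical): the write is stored and indexed.
        self.store[(bucket, key)] = value
        self.indexed.add((bucket, key))

    def put_write_once(self, bucket, key, value):
        # Fast path as described above: the write goes straight to the backend.
        # No indexing step runs here, which models the bug.
        self.store[(bucket, key)] = value

    def search(self, bucket):
        # Return the keys in this bucket that the index knows about.
        return sorted(k for (b, k) in self.indexed if b == bucket)


node = ToyNode()
node.put_normal("fruit", "k1", "apple")
node.put_write_once("fruit_w1c", "k2", "pear")

print(node.search("fruit"))      # ['k1']
print(node.search("fruit_w1c"))  # [] : stored, but not queryable
```

In this model the Write Once object is present in the backend but absent from the index, which matches the symptom reported in the issue: the put succeeds quickly and a query finds nothing.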

Fix Description

The fix requires a change to Yokozuna and to Riak K/V.

In Yokozuna, we add the function yz_kv:index_binary/5, which takes a Bucket, a Key, and a Riak Object in binary form, as well as the same parameters as those in yz_kv:index/3. The index_binary function decodes the Riak Object using riak_object:from_binary/3 (which requires the Bucket and Key), and otherwise follows the same logic as yz_kv:index/3.

The Yokozuna changes can be found at https://github.com/basho/yokozuna/pull/529/files

Riak K/V is modified to call yz_kv:index_binary/5 in the vnode during Write Once puts.

The Riak/KV changes can be found at https://github.com/basho/riak_kv/pull/1159/files
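The shape of the fix described above can be sketched as a toy model. This is an illustrative Python sketch under stated assumptions, not the Erlang code in the pull requests: from_binary, index, index_binary, and write_once_put are hypothetical stand-ins, JSON is used as a placeholder for the real binary object encoding, and the additional parameters of yz_kv:index/3 are omitted.

```python
import json


def from_binary(bucket, key, data):
    # Hypothetical stand-in for riak_object:from_binary/3: the bucket and key
    # are needed alongside the bytes to rebuild the object.
    return {"bucket": bucket, "key": key, "value": json.loads(data.decode("utf-8"))}


def index(obj, index_store):
    # Hypothetical stand-in for yz_kv:index/3 (other parameters omitted).
    index_store[(obj["bucket"], obj["key"])] = obj["value"]


def index_binary(bucket, key, data, index_store):
    # Decode first, then delegate to the same indexing logic as index().
    obj = from_binary(bucket, key, data)
    index(obj, index_store)


def write_once_put(bucket, key, data, backend, index_store):
    # Toy vnode write path: store the encoded object, then index it.
    backend[(bucket, key)] = data
    index_binary(bucket, key, data, index_store)


backend, index_store = {}, {}
write_once_put("fruit_w1c", "k1", b'{"name": "pear"}', backend, index_store)
print(index_store[("fruit_w1c", "k1")])  # {'name': 'pear'}
```

The point of the sketch is the delegation: the Write Once path only holds the object in binary form, so the indexing entry point decodes it and then reuses the existing indexing logic rather than duplicating it.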

Test Addition

The yz_pb system test has been modified to verify that data written to Write Once buckets can be queried after being written. Without the changes described above, this part of the test fails.

Basho Bench Tests

Basho Bench has been run against:

  • 5-node cluster
  • CentOS 6 virtual machines (VMware Fusion 7.1.1, over Mac OS X 10.10.3)
  • Apple MacBook Pro (Early 2011), 2.0 GHz i7; 16 GB RAM; 500 GB OWC Mercury EXTREME Pro 6G SSD

The Basho Bench driver was run from a separate machine, using load-fruit.config as described in https://github.com/basho/yokozuna/blob/develop/docs/BENCHMARKING.md

Baseline Measurements

Baseline measurements were run against the Riak 2.1 branch (as of 8/5/15), starting with an empty Riak database, and using the load-fruit configuration described above. The throughput and latency summary is illustrated in the following charts:

Baseline Summary

Bugfix Measurements

The Riak cluster was stopped, data in the cluster was cleared out, and the riak_kv and yokozuna applications were modified with the branch versions containing the bug fixes described above. The throughput and latency summary with these changes is illustrated in the following charts:

Bugfix Summary

Discussion

TODO
