Skip to content

Commit 4229f6d

Browse files
author
Maysam Yabandeh
committed
Fix SeekForPrev bug with Partitioned Filters and Prefix (facebook#5907)
Summary: Partition Filters make use of a top-level index to find the partition that might have the bloom hash of the key. The index is with internal key format (before format version 3). Each partition contains the i) blooms of the keys in that range ii) bloom of prefixes of keys in that range, iii) the bloom of the prefix of the last key in the previous partition. When ::SeekForPrev(key), we first perform a prefix bloom test on the SST file. The partition however is identified using the full internal key, rather than the prefix key. The reason is to be compatible with the internal key format of the top-level index. This creates a corner case. Example: - SST k, Partition N: P1K1, P1K2 - SST k, top-level index: P1K2 - SST k+1, Partition 1: P2K1, P3K1 - SST k+1 top-level index: P3K1 When SeekForPrev(P1K3), it should point us to P1K2. However SST k top-level index would reject P1K3 since it is out of range. One possible fix would be to search with the prefix P1 (instead of full internal key P1K3) however the details of properly comparing prefix with full internal key might get complicated. The fix we apply in this PR is to look into the last partition anyway even if the key is out of range. Pull Request resolved: facebook#5907 Differential Revision: D17889918 Pulled By: maysamyabandeh fbshipit-source-id: 169fd7b3c71dbc08808eae5a8340611ebe5bdc1e
1 parent 73a35c6 commit 4229f6d

File tree

2 files changed

+17
-2
lines changed

2 files changed

+17
-2
lines changed

table/block_based/partitioned_filter_block.cc

Lines changed: 6 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -192,7 +192,12 @@ BlockHandle PartitionedFilterBlockReader::GetFilterPartitionHandle(
192192
index_key_includes_seq(), index_value_is_full());
193193
iter.Seek(entry);
194194
if (UNLIKELY(!iter.Valid())) {
195-
return BlockHandle(0, 0);
195+
// entry is larger than all the keys. However its prefix might still be
196+
// present in the last partition. If this is called by PrefixMayMatch this
197+
// is necessary for correct behavior. Otherwise it is unnecessary but safe.
198+
// Assuming this is an unlikely case for full key search, the performance
199+
// overhead should be negligible.
200+
iter.SeekToLast();
196201
}
197202
assert(iter.Valid());
198203
BlockHandle fltr_blk_handle = iter.value().handle;

table/block_based/partitioned_filter_block_test.cc

Lines changed: 11 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -327,7 +327,7 @@ TEST_P(PartitionedFilterBlockTest, SamePrefixInMultipleBlocks) {
327327
std::unique_ptr<PartitionedIndexBuilder> pib(NewIndexBuilder());
328328
std::unique_ptr<PartitionedFilterBlockBuilder> builder(
329329
NewBuilder(pib.get(), prefix_extractor.get()));
330-
const std::string pkeys[3] = {"p-key1", "p-key2", "p-key3"};
330+
const std::string pkeys[3] = {"p-key10", "p-key20", "p-key30"};
331331
builder->Add(pkeys[0]);
332332
CutABlock(pib.get(), pkeys[0], pkeys[1]);
333333
builder->Add(pkeys[1]);
@@ -344,6 +344,16 @@ TEST_P(PartitionedFilterBlockTest, SamePrefixInMultipleBlocks) {
344344
/*no_io=*/false, &ikey_slice, /*get_context=*/nullptr,
345345
/*lookup_context=*/nullptr));
346346
}
347+
// Non-existent keys but with the same prefix
348+
const std::string pnonkeys[4] = {"p-key9", "p-key11", "p-key21", "p-key31"};
349+
for (auto key : pnonkeys) {
350+
auto ikey = InternalKey(key, 0, ValueType::kTypeValue);
351+
const Slice ikey_slice = Slice(*ikey.rep());
352+
ASSERT_TRUE(reader->PrefixMayMatch(
353+
prefix_extractor->Transform(key), prefix_extractor.get(), kNotValid,
354+
/*no_io=*/false, &ikey_slice, /*get_context=*/nullptr,
355+
/*lookup_context=*/nullptr));
356+
}
347357
}
348358

349359
TEST_P(PartitionedFilterBlockTest, OneBlockPerKey) {

0 commit comments

Comments
 (0)