Skip to content

Commit 9c481aa

Browse files
BenHuddlestondaverigby
authored andcommitted
[BP] MB-42918: Do not replace committed item with pending at add
Currently when we do an add we can replace certain committed items with pending ones. One of these cases is a when we have an unpersisted delete. If we replace this with a pending StoredValue then we allow a subsequent get to find no committed StoredValue in the HashTable and potentially do a BgFetch. This BgFetch may return the alive document back to the HashTable/client which it should not do. Fix this by creating a new StoredValue if we would have replace a committed one with a prepared one. Change-Id: Ide30894867ec9710ab0d334a97acd5c6c8d009ff Reviewed-on: http://review.couchbase.org/c/kv_engine/+/142093 Well-Formed: Build Bot <[email protected]> Reviewed-by: Dave Rigby <[email protected]> Tested-by: Build Bot <[email protected]>
1 parent ec21fca commit 9c481aa

File tree

2 files changed

+71
-7
lines changed

2 files changed

+71
-7
lines changed

engines/ep/src/vbucket.cc

Lines changed: 15 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -2992,12 +2992,8 @@ void VBucket::deletedOnDiskCbk(const Item& queuedItem, bool deleted) {
29922992
* be done to ensure that the item count is accurate in the case of full
29932993
* eviction. We should only decrement the counter for committed (via
29942994
* mutation or commit) items as we only increment for these.
2995-
* Note this is done irrespective of if a StoredValue was found in the
2996-
* HashTable - the SV may have already been replaced (e.g. with
2997-
* a pending SyncWrite). We "know" the delete happened on disk thus
2998-
* should decrement the total number of items.
29992995
*/
3000-
if (queuedItem.isCommitted()) {
2996+
if (v && queuedItem.isCommitted()) {
30012997
decrNumTotalItems();
30022998
++opsDelete;
30032999
}
@@ -3412,7 +3408,16 @@ std::pair<AddStatus, boost::optional<VBNotifyCtx>> VBucket::processAdd(
34123408

34133409
std::pair<AddStatus, VBNotifyCtx> rv = {AddStatus::Success, {}};
34143410

3415-
if (v) {
3411+
// We cannot replace a committed SV with a pending one. If we were to do
3412+
// so then a delete that has not yet been persisted could be replaced
3413+
// with a prepare. A subsequent get could trigger a bg fetch that may
3414+
// return the old (not deleted) document from disk if it runs before the
3415+
// flusher. As such, we must keep the unpersisted delete and add a new
3416+
// prepare for the SyncWrite. Any get will see the unpersisted delete
3417+
// and return KEYNOENT.
3418+
auto replacingCommittedItemWithPending =
3419+
v && v == htRes.committed && itm.isPending();
3420+
if (v && !replacingCommittedItemWithPending) {
34163421
if (v->isTempInitialItem() && eviction == EvictionPolicy::Full &&
34173422
maybeKeyExists) {
34183423
// Need to figure out if an item exists on disk
@@ -3436,7 +3441,8 @@ std::pair<AddStatus, boost::optional<VBNotifyCtx>> VBucket::processAdd(
34363441
std::tie(v, std::ignore, rv.second) =
34373442
updateStoredValue(htRes.getHBL(), *v, itm, queueItmCtx);
34383443
} else {
3439-
if (itm.getBySeqno() != StoredValue::state_temp_init) {
3444+
if (itm.getBySeqno() != StoredValue::state_temp_init &&
3445+
!replacingCommittedItemWithPending) {
34403446
if (eviction == EvictionPolicy::Full && maybeKeyExists) {
34413447
return {AddStatus::AddTmpAndBgFetch, VBNotifyCtx()};
34423448
}
@@ -3456,6 +3462,8 @@ std::pair<AddStatus, boost::optional<VBNotifyCtx>> VBucket::processAdd(
34563462

34573463
if (v->isTempItem()) {
34583464
rv.first = AddStatus::BgFetch;
3465+
} else if (replacingCommittedItemWithPending) {
3466+
rv.first = AddStatus::UnDel;
34593467
}
34603468
}
34613469

engines/ep/tests/module_tests/evp_store_test.cc

Lines changed: 56 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -633,6 +633,62 @@ TEST_P(EPStoreEvictionTest, xattrExpiryOnFullyEvictedItem) {
633633
<< "The foo attribute should be gone";
634634
}
635635

636+
TEST_P(EPStoreEvictionTest, UnDelWithPrepare) {
637+
if (GetParam() == "value_only") {
638+
return;
639+
}
640+
641+
store->setVBucketState(vbid, vbucket_state_active, {{"topology", nlohmann::json::array({{"active", "replica"}})}});
642+
643+
// 1) Store, delete, and persist the item. We need to do this to ensure that
644+
// the bloom filter tells us to go to disk when we try gets if there is
645+
// no delete in the HashTable.
646+
auto key = makeStoredDocKey("key");
647+
storeAndDeleteItem(vbid, key, "value");
648+
auto vb = store->getVBucket(vbid);
649+
EXPECT_EQ(0, vb->getNumItems());
650+
651+
// 1) Store the new item and persist
652+
store_item(vbid, key, "value");
653+
flushVBucketToDiskIfPersistent(vbid, 1);
654+
EXPECT_EQ(1, vb->getNumItems());
655+
656+
// 2) Store the new item but don't persist it yet. We want to test what
657+
// happens when it's dirty.
658+
delete_item(vbid, key);
659+
660+
// 1 because we haven't persisted the delete yet
661+
EXPECT_EQ(1, vb->getNumItems());
662+
663+
// 3) Get now returns not found
664+
auto options = static_cast<get_options_t>(
665+
QUEUE_BG_FETCH | HONOR_STATES | TRACK_REFERENCE | DELETE_TEMP |
666+
HIDE_LOCKED_CAS | TRACK_STATISTICS);
667+
auto gv = getInternal(key, vbid, cookie, ForGetReplicaOp::No, options);
668+
EXPECT_EQ(ENGINE_KEY_ENOENT, gv.getStatus());
669+
670+
// 4) Add prepare
671+
auto prepare = makePendingItem(key, "value");
672+
EXPECT_EQ(ENGINE_SYNC_WRITE_PENDING, store->add(*prepare, cookie));
673+
//EXPECT_EQ(ENGINE_SYNC_WRITE_PENDING, addItem(*prepare, cookie));
674+
675+
// 1 because we haven't persisted the delete yet
676+
EXPECT_EQ(1, vb->getNumItems());
677+
678+
// 5) Check that the HashTable state is now correct
679+
{
680+
auto htRes = vb->ht.findForUpdate(key);
681+
ASSERT_TRUE(htRes.committed);
682+
EXPECT_TRUE(htRes.committed->isDeleted());
683+
EXPECT_TRUE(htRes.pending);
684+
}
685+
686+
// @TODO RDB: Rocks item counting is broken and overcounts assuming
687+
// everything is a new item
688+
flushVBucketToDiskIfPersistent(vbid, 2);
689+
EXPECT_EQ(0, vb->getNumItems());
690+
}
691+
636692
/**
637693
* Verify that when getIf is used it only fetches the metdata from disk for
638694
* the filter, and not the complete document.

0 commit comments

Comments
 (0)