Skip to content

Commit 43c3197

Browse files
BenHuddlestondaverigby
authored andcommitted
[BP] MB-42610: Do not expire prepares when committed items exist
Currently if a committed item exists then it's possible for the expiry code to attempt to expire a prepare. This obviously breaks durability and it can lead to us having two committed items in the HashTable for the same key which will throw assertions the next time we lookup that key. This can only happen in a very specific and limited set of circumstances. The vBucket in question must go down having only partially persisted a Disk snapshot. The vBucket must then flip from replica to active (a situation that would lead to data loss) then we must attempt to expire an item (via compaction or the pager). This expiration could find a prepare in the HashTable and attempt to expire it if the item has the same cas (this should only be possible when we load a completed prepare). Fix this issue by skipping expiration if we find a prepare with the same cas of our item in the HashTable. Change-Id: I1141b0090b1ff65eabf611da1b478b72bd284c78 Reviewed-on: http://review.couchbase.org/c/kv_engine/+/142089 Well-Formed: Build Bot <[email protected]> Tested-by: Build Bot <[email protected]> Reviewed-by: Dave Rigby <[email protected]>
1 parent 0841dbc commit 43c3197

File tree

2 files changed

+88
-0
lines changed

2 files changed

+88
-0
lines changed

engines/ep/src/vbucket.cc

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -2412,6 +2412,16 @@ void VBucket::deleteExpiredItem(const Item& it,
24122412
return;
24132413
}
24142414

2415+
if (v->isPending()) {
2416+
// If cas is the same (above statement) and the HashTable has
2417+
// returned a prepare then we must have loaded a logically complete
2418+
// prepare (as we remove them from the HashTable at completion) for
2419+
// some reason. The prepare should be in a maybe visible state but
2420+
// it probably isn't a good idea to assert that here. In this case
2421+
// we must do nothing as we MUST commit any maybe visible prepares.
2422+
return;
2423+
}
2424+
24152425
if (v->isTempNonExistentItem() || v->isTempDeletedItem()) {
24162426
bool deleted = deleteStoredValue(hbl, *v);
24172427
if (!deleted) {

engines/ep/tests/module_tests/evp_store_single_threaded_test.cc

Lines changed: 78 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -51,6 +51,7 @@
5151
#include "tests/module_tests/thread_gate.h"
5252
#include "tests/test_fileops.h"
5353
#include "vbucket_state.h"
54+
#include "vbucket_bgfetch_item.h"
5455

5556
#include "../couchstore/src/internal.h"
5657

@@ -4911,6 +4912,83 @@ TEST_P(STParamPersistentBucketTest,
49114912
testAbortDoesNotIncrementOpsDelete(false /*flusherDedup*/);
49124913
}
49134914

4915+
// @TODO move to EPBucketFullEvictionTest on merge forward to master
4916+
TEST_P(STParamPersistentBucketTest, ExpiryFindsPrepareWithSameCas) {
4917+
if (!fullEviction()) {
4918+
return;
4919+
}
4920+
setVBucketStateAndRunPersistTask(
4921+
vbid,
4922+
vbucket_state_active,
4923+
{{"topology", nlohmann::json::array({{"active", "replica"}})}});
4924+
4925+
// 1) Store prepare with expiry
4926+
auto key = makeStoredDocKey("a");
4927+
using namespace cb::durability;
4928+
auto pre = makePendingItem(key, "value", Requirements{Level::Majority, {}});
4929+
pre->setVBucketId(vbid);
4930+
pre->setExpTime(1);
4931+
4932+
EXPECT_EQ(ENGINE_SYNC_WRITE_PENDING, store->set(*pre, cookie));
4933+
flushVBucketToDiskIfPersistent(vbid, 1);
4934+
4935+
auto vb = store->getVBucket(vbid);
4936+
4937+
// 2) Seqno ack and commit the prepare
4938+
vb->seqnoAcknowledged(folly::SharedMutex::ReadHolder(vb->getStateLock()),
4939+
"replica",
4940+
1 /*prepareSeqno*/);
4941+
vb->processResolvedSyncWrites();
4942+
4943+
// 3) Fudge the current snapshot so that when we warmup we scan the entire
4944+
// snapshot for prepares (i.e. incomplete disk snapshot)
4945+
vb->checkpointManager->updateCurrentSnapshot(3, 3, CheckpointType::Disk);
4946+
flushVBucketToDiskIfPersistent(vbid, 1);
4947+
4948+
// 4) Restart and warmup
4949+
vb.reset();
4950+
resetEngineAndWarmup();
4951+
vb = store->getVBucket(vbid);
4952+
4953+
{
4954+
// Verify that the prepare is there and it's "MaybeVisible"
4955+
auto ret = vb->ht.findForUpdate(key);
4956+
ASSERT_TRUE(ret.pending);
4957+
ASSERT_TRUE(ret.pending->isPreparedMaybeVisible());
4958+
4959+
// And that the commit is there too
4960+
ASSERT_TRUE(ret.committed);
4961+
}
4962+
4963+
// 5) Grab the item from disk just like the compactor would
4964+
vb_bgfetch_queue_t q;
4965+
vb_bgfetch_item_ctx_t ctx;
4966+
ctx.isMetaOnly = GetMetaOnly::No;
4967+
auto diskDocKey = makeDiskDocKey("a");
4968+
q[diskDocKey] = std::move(ctx);
4969+
store->getRWUnderlying(vbid)->getMulti(vbid, q);
4970+
EXPECT_EQ(ENGINE_SUCCESS, q[diskDocKey].value.getStatus());
4971+
4972+
// 6) Callback from the "compactor" with the item to try and expire it. We
4973+
// could also pretend to be the pager here.
4974+
ASSERT_EQ(0, vb->numExpiredItems);
4975+
vb->deleteExpiredItem(*q[diskDocKey].value.item, 2, ExpireBy::Compactor);
4976+
4977+
// Item expiry cannot take place if the MaybeVisible prepare exists.
4978+
EXPECT_EQ(0, vb->numExpiredItems);
4979+
{
4980+
// Verify that the prepare is there and it's "MaybeVisible". Before the
4981+
// fix deleteExpiredItem would select and replace the prepare which is
4982+
// incorrect and causes us to have two committed items in the HashTable.
4983+
auto ret = vb->ht.findForUpdate(key);
4984+
ASSERT_TRUE(ret.pending);
4985+
ASSERT_TRUE(ret.pending->isPreparedMaybeVisible());
4986+
4987+
// And that the commit is there too
4988+
ASSERT_TRUE(ret.committed);
4989+
}
4990+
}
4991+
49144992
INSTANTIATE_TEST_CASE_P(Persistent,
49154993
STParamPersistentBucketTest,
49164994
STParameterizedBucketTest::persistentConfigValues(),

0 commit comments

Comments
 (0)