Commit ef05dad

jameseh96 and veselink1 committed
MB-54221: Make LinkedList field read for stats atomic

A TSAN failure has been seen:

    WARNING: ThreadSanitizer: data race (pid=28207)
      Write of size 8 at 0x7b4400001a00 by thread T21:
        #0 cb::NonNegativeCounter<unsigned long, cb::DefaultUnderflowPolicy>::fetch_sub(long) ../platform/include/platform/non_negative_counter.h:142 (ep_testsuite+0x5bd248)
        #1 cb::NonNegativeCounter<unsigned long, cb::DefaultUnderflowPolicy>::operator--() ../platform/include/platform/non_negative_counter.h:175 (ep_testsuite+0x9a535d)
        #2 BasicLinkedList::purgeListElem(boost::intrusive::list_iterator<boost::intrusive::mhtraits<OrderedStoredValue, boost::intrusive::list_member_hook<>, &OrderedStoredValue::seqno_hook>, false>, bool, bool) /home/couchbase/jenkins/workspace/kv_engine.threadsanitizer_master/kv_engine/engines/ep/src/linked_list.cc:412 (ep_testsuite+0xa948c1)

      Previous read of size 8 at 0x7b4400001a00 by main thread (mutexes: write M3032, write M1005282691001636776):
        #0 cb::NonNegativeCounter<unsigned long, cb::DefaultUnderflowPolicy>::load() const ../platform/include/platform/non_negative_counter.h:89 (ep_testsuite+0x5bcd35)
        #1 cb::NonNegativeCounter<unsigned long, cb::DefaultUnderflowPolicy>::operator unsigned long() const ../platform/include/platform/non_negative_counter.h:85 (ep_testsuite+0x687165)
        #2 BasicLinkedList::getNumStaleItems() const /home/couchbase/jenkins/workspace/kv_engine.threadsanitizer_master/kv_engine/engines/ep/src/linked_list.cc:307 (ep_testsuite+0xa94a7c)

This is seen because we read the non-atomic numStaleItems from EphemeralVBucket::CountVisitor without a lock, while the value could be changed from another thread.

We don't want to have to lock the list just to read the stats. Make the fields read for stats atomic to resolve the race condition. These fields are only used for stats, so they do not need to be read with any particular consistency with other values. If they did, acquiring the listWriteLock would be required.

Change-Id: Ie557b1363ffd987ef108c19e2bfc200481c6e5f1
Co-Authored-By: Vesko Karaganev <[email protected]>
Reviewed-on: https://review.couchbase.org/c/kv_engine/+/189597
Tested-by: Vesko Karaganev <[email protected]>
Reviewed-by: Dave Rigby <[email protected]>

1 parent b14acd3 commit ef05dad
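
To illustrate the fix described in the commit message, here is a minimal sketch of a counter that a writer thread can update while a stats reader loads it without taking any lock. `SketchStatCounter` and `runStatCounterDemo` are hypothetical names invented for this example; this is not the `cb::AtomicNonNegativeCounter` implementation, and the clamp-at-zero behaviour and relaxed memory ordering are assumptions chosen because the value is only used for stats.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical stand-in for an atomic non-negative counter (assumption:
// underflow clamps at zero; the real policy may differ).
class SketchStatCounter {
public:
    void increment() {
        value.fetch_add(1, std::memory_order_relaxed);
    }

    // Decrement without wrapping below zero.
    void decrement() {
        uint64_t current = value.load(std::memory_order_relaxed);
        while (current != 0 &&
               !value.compare_exchange_weak(current,
                                            current - 1,
                                            std::memory_order_relaxed)) {
            // On failure compare_exchange_weak reloads `current`; retry.
        }
    }

    // Lock-free read, suitable for stats that need no consistency with
    // any other field.
    uint64_t load() const {
        return value.load(std::memory_order_relaxed);
    }

private:
    std::atomic<uint64_t> value{0};
};

// A writer thread mutates the counter while the calling thread reads it,
// mirroring the purge-vs-stats pattern in the TSAN report. Returns the
// final counter value once the writer has finished.
inline uint64_t runStatCounterDemo(int increments, int decrements) {
    SketchStatCounter counter;
    std::thread writer([&counter, increments, decrements] {
        for (int i = 0; i < increments; ++i) {
            counter.increment();
        }
        for (int i = 0; i < decrements; ++i) {
            counter.decrement();
        }
    });
    // Stats reader: concurrent loads are well-defined because the
    // underlying value is a std::atomic.
    for (int i = 0; i < 1000; ++i) {
        (void)counter.load();
    }
    writer.join();
    return counter.load();
}
```

Each concurrent `load()` may observe any intermediate value, which is acceptable for a stat but would not be for a value that must stay consistent with the list contents.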

1 file changed: +10 -4 lines changed

engines/ep/src/linked_list.h

Lines changed: 10 additions & 4 deletions
@@ -244,8 +244,10 @@ class BasicLinkedList : public SequenceList {
      *
      * This should be non-decrementing, apart from a rollback where it will be
      * reset.
+     *
+     * Atomic as read for stats without taking the listWriteLock.
      */
-    Monotonic<seqno_t> highestPurgedDeletedSeqno;
+    AtomicMonotonic<seqno_t> highestPurgedDeletedSeqno;
 
     /**
      * Seqno of the last visible item. Accounts only committed sync-writes (ie,
@@ -265,16 +267,20 @@ class BasicLinkedList : public SequenceList {
      * Indicates the number of elements in the list that are stale (old,
      * duplicate values). Stale items are owned by the list and hence must
      * periodically clean them up.
+     *
+     * Atomic as read for stats without taking the listWriteLock.
      */
-    cb::NonNegativeCounter<uint64_t> numStaleItems;
+    cb::AtomicNonNegativeCounter<uint64_t> numStaleItems;
 
     /**
      * Indicates the number of logically deleted items in the list.
      * Since we are append-only, distributed cache supporting incremental
      * replication, we need to keep deleted items for while and periodically
-     * purge them
+     * purge them.
+     *
+     * Atomic as read for stats without taking the listWriteLock.
      */
-    cb::NonNegativeCounter<uint64_t> numDeletedItems;
+    cb::AtomicNonNegativeCounter<uint64_t> numDeletedItems;
 
     /* Used only to log debug messages */
     const Vbid vbid;
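
The diff swaps `Monotonic<seqno_t>` for `AtomicMonotonic<seqno_t>`. As a rough illustration of what an atomic, non-decrementing value involves, the sketch below uses a compare-and-swap loop so the stored value only moves forward, plus an explicit reset for the rollback case mentioned in the comment. `SketchAtomicMonotonic` and its member names are hypothetical, and silently ignoring a non-increasing store is an assumption; the real `AtomicMonotonic` may treat such a store differently.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical sketch of an atomic value that only moves forward.
class SketchAtomicMonotonic {
public:
    explicit SketchAtomicMonotonic(int64_t initial = 0) : value(initial) {
    }

    // Advance to `desired` only if it is greater than the current value.
    // Returns true if the stored value was updated.
    bool storeIfGreater(int64_t desired) {
        int64_t current = value.load(std::memory_order_relaxed);
        while (desired > current) {
            if (value.compare_exchange_weak(current,
                                            desired,
                                            std::memory_order_relaxed)) {
                return true;
            }
            // `current` now holds the latest value; the loop re-checks it.
        }
        return false;
    }

    // Unconditional store, e.g. for a rollback that resets the value.
    void reset(int64_t newValue) {
        value.store(newValue, std::memory_order_relaxed);
    }

    // Lock-free read for stats.
    int64_t load() const {
        return value.load(std::memory_order_relaxed);
    }

private:
    std::atomic<int64_t> value;
};
```

With this shape, a stats reader can call `load()` at any time without the list lock, while writers that race to advance the value never move it backwards.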
