Skip to content

Commit 9752914

Browse files
yfeldblumfacebook-github-bot
authored andcommitted
revise MicroLock not to access bytes outside of itself
Summary: The `MicroLock` tests fail under libc++ with asan-abort `stack-buffer-overflow`. Example: ``` SCARINESS: 32 (4-byte-read-stack-buffer-overflow) #0 unsigned int std::__2::__cxx_atomic_load[abi:ue170004]<unsigned int>(std::__2::__cxx_atomic_base_impl<unsigned int> const*, std::__2::memory_order) libcxx/include/c++/v1/__atomic/cxx_atomic_impl.h:375 #1 std::__2::__atomic_base<unsigned int, false>::load[abi:ue170004](std::__2::memory_order) const libcxx/include/c++/v1/__atomic/atomic_base.h:60 facebook#2 folly::MicroLockBase<1000u, 0u>::try_lock() folly/MicroLock.h:319 facebook#3 SmallLocks_MicroLockTryLock_Test::TestBody() folly/synchronization/test/SmallLocksTest.cpp:314 ``` The trouble is accessing the lock state as a 4-byte word, where the asan stack has allocated only 1 byte for the lock object - asan tracks this and aborts. It is UB for `MicroLock` to perform atomic stores or atomic read-modify-write operations on the entire 4-byte aligned word containing the `MicroLock` which may be concurrent with other accesses to other bytes in the word. The 4-byte aligned word accesses are necessary in order to support using the `MicroLock` itself as a futex, on Linux platforms. The trouble is that Linux does not provide futex operations on all sizes of integer, only on 32-bit (4-byte) integers. If Linux were to provide futex operations on all sizes of integer, we would immediately switch to 8-bit (1-byte) integers here. This diff fixes the UB and gets the tests passing under libc++ by switching to the `folly::ParkingLot`, as wrapped by `folly::atomic_notify_one` and `folly::atomic_wait` over `std::atomic<uint8_t>`, which indirects the 32-bit (4-byte) futex behind a sharded hashtable. Alternatives are possible for getting the tests passing, although these alternatives would not fix the UB. * We can disable sanitizers on even more functions, as well as switch to atomic builtins since the members of `std::atomic` are not marked as disabling the sanitizers. * We can workaround just in the unit-tests by padding the on-stack `MicroLock` instances. Reviewed By: praihan, ot, aary Differential Revision: D76612658 fbshipit-source-id: 75e9f03e63c85c30c6201b9bce5cd303c9a2b530
1 parent 3c24bb7 commit 9752914

File tree

2 files changed

+47
-102
lines changed

2 files changed

+47
-102
lines changed

third-party/folly/src/folly/MicroLock.cpp

Lines changed: 10 additions & 14 deletions
Original file line numberDiff line numberDiff line change
@@ -23,16 +23,12 @@
2323
namespace folly {
2424

2525
uint8_t MicroLockCore::lockSlowPath(
26-
uint32_t oldWord,
27-
detail::Futex<>* wordPtr,
28-
unsigned baseShift,
29-
unsigned maxSpins,
30-
unsigned maxYields) noexcept {
31-
uint32_t newWord;
26+
uint8_t oldWord, unsigned maxSpins, unsigned maxYields) noexcept {
27+
uint8_t newWord;
3228
unsigned spins = 0;
33-
uint32_t heldBit = 1 << baseShift;
34-
uint32_t waitBit = heldBit << 1;
35-
uint32_t needWaitBit = 0;
29+
uint8_t heldBit = 1;
30+
uint8_t waitBit = heldBit << 1;
31+
uint8_t needWaitBit = 0;
3632

3733
retry:
3834
if ((oldWord & heldBit) != 0) {
@@ -43,34 +39,34 @@ uint8_t MicroLockCore::lockSlowPath(
4339
// lock holder knows to FUTEX_WAKE us.
4440
newWord = oldWord | waitBit;
4541
if (newWord != oldWord) {
46-
if (!wordPtr->compare_exchange_weak(
42+
if (!atomic_ref(lock_).compare_exchange_weak(
4743
oldWord,
4844
newWord,
4945
std::memory_order_relaxed,
5046
std::memory_order_relaxed)) {
5147
goto retry;
5248
}
5349
}
54-
detail::futexWait(wordPtr, newWord, heldBit);
50+
atomic_wait(&atomic_ref(lock_).atomic(), newWord);
5551
needWaitBit = waitBit;
5652
} else if (spins > maxSpins) {
5753
// sched_yield(), but more portable
5854
std::this_thread::yield();
5955
} else {
6056
folly::asm_volatile_pause();
6157
}
62-
oldWord = wordPtr->load(std::memory_order_relaxed);
58+
oldWord = atomic_ref(lock_).load(std::memory_order_relaxed);
6359
goto retry;
6460
}
6561

6662
newWord = oldWord | heldBit | needWaitBit;
67-
if (!wordPtr->compare_exchange_weak(
63+
if (!atomic_ref(lock_).compare_exchange_weak(
6864
oldWord,
6965
newWord,
7066
std::memory_order_acquire,
7167
std::memory_order_relaxed)) {
7268
goto retry;
7369
}
74-
return decodeDataFromWord(newWord, baseShift);
70+
return decodeDataFromWord(newWord);
7571
}
7672
} // namespace folly

third-party/folly/src/folly/MicroLock.h

Lines changed: 37 additions & 88 deletions
Original file line numberDiff line numberDiff line change
@@ -21,10 +21,10 @@
2121
#include <cstdint>
2222
#include <utility>
2323

24-
#include <folly/CPortability.h>
2524
#include <folly/Optional.h>
2625
#include <folly/Portability.h>
27-
#include <folly/detail/Futex.h>
26+
#include <folly/synchronization/AtomicNotification.h>
27+
#include <folly/synchronization/AtomicRef.h>
2828

2929
namespace folly {
3030

@@ -51,12 +51,6 @@ namespace folly {
5151
* Unused bits in the lock can be used to store user data via
5252
* lockAndLoad() and unlockAndStore(), or LockGuardWithData.
5353
*
54-
* MicroLock uses a dirty trick: it actually operates on the full
55-
* 32-bit, four-byte-aligned bit of memory into which it is embedded.
56-
* It never modifies bits outside the ones it's defined to modify, but
57-
* it _accesses_ all the bits in the 32-bit memory location for
58-
* purposes of futex management.
59-
*
6054
* The MaxSpins template parameter controls the number of times we
6155
* spin trying to acquire the lock. MaxYields controls the number of
6256
* times we call sched_yield; once we've tried to acquire the lock
@@ -96,10 +90,6 @@ namespace folly {
9690
class MicroLockCore {
9791
protected:
9892
uint8_t lock_{};
99-
/**
100-
* Arithmetic shift required to get to the byte from the word.
101-
*/
102-
unsigned baseShift() const noexcept;
10393
/**
10494
* Mask for bit indicating that the flag is held.
10595
*/
@@ -109,19 +99,8 @@ class MicroLockCore {
10999
*/
110100
unsigned waitBit() const noexcept;
111101

112-
FOLLY_DISABLE_SANITIZERS static uint8_t lockSlowPath(
113-
uint32_t oldWord,
114-
detail::Futex<>* wordPtr,
115-
unsigned baseShift,
116-
unsigned maxSpins,
117-
unsigned maxYields) noexcept;
118-
119-
/**
120-
* The word (halfword on 64-bit systems) that this lock atomically operates
121-
* on. Although the atomic operations access 4 bytes, only the byte used by
122-
* the lock will be modified.
123-
*/
124-
detail::Futex<>* word() const noexcept;
102+
uint8_t lockSlowPath(
103+
uint8_t oldWord, unsigned maxSpins, unsigned maxYields) noexcept;
125104

126105
static constexpr unsigned kNumLockBits = 2;
127106
static constexpr uint8_t kLockBits =
@@ -140,40 +119,34 @@ class MicroLockCore {
140119
return static_cast<uint8_t>(data << kNumLockBits);
141120
}
142121

143-
static constexpr uint8_t decodeDataFromWord(
144-
uint32_t word, unsigned baseShift) noexcept {
145-
return static_cast<uint8_t>(
146-
static_cast<uint8_t>(word >> baseShift) >> kNumLockBits);
147-
}
148-
uint8_t decodeDataFromWord(uint32_t word) const noexcept {
149-
return decodeDataFromWord(word, baseShift());
122+
static constexpr uint8_t decodeDataFromWord(uint8_t word) noexcept {
123+
return static_cast<uint8_t>(word >> kNumLockBits);
150124
}
151-
static constexpr uint32_t encodeDataToWord(
152-
uint32_t word, unsigned shiftToByte, uint8_t value) noexcept {
153-
const uint32_t preservedBits = word & ~(kDataBits << shiftToByte);
154-
const uint32_t newBits = encodeDataToByte(value) << shiftToByte;
125+
static constexpr uint8_t encodeDataToWord(
126+
uint8_t word, uint8_t value) noexcept {
127+
const uint8_t preservedBits = word & ~(kDataBits);
128+
const uint8_t newBits = encodeDataToByte(value);
155129
return preservedBits | newBits;
156130
}
157131

158132
template <typename Func>
159-
FOLLY_DISABLE_SANITIZERS void unlockAndStoreWithModifier(
160-
Func modifier) noexcept;
133+
void unlockAndStoreWithModifier(Func modifier) noexcept;
161134

162135
public:
163136
/**
164137
* Loads the data stored in the unused bits of the lock atomically.
165138
*/
166-
FOLLY_DISABLE_SANITIZERS uint8_t
167-
load(std::memory_order order = std::memory_order_seq_cst) const noexcept {
168-
return decodeDataFromWord(word()->load(order));
139+
uint8_t load(
140+
std::memory_order order = std::memory_order_seq_cst) const noexcept {
141+
return decodeDataFromWord(atomic_ref(lock_).load(order));
169142
}
170143

171144
/**
172145
* Stores the data in the unused bits of the lock atomically. Since 2 bits are
173146
* used by the lock, the most significant 2 bits of the provided value will be
174147
* ignored.
175148
*/
176-
FOLLY_DISABLE_SANITIZERS void store(
149+
void store(
177150
uint8_t value,
178151
std::memory_order order = std::memory_order_seq_cst) noexcept;
179152

@@ -186,73 +159,51 @@ class MicroLockCore {
186159
void unlock() noexcept;
187160
};
188161

189-
inline detail::Futex<>* MicroLockCore::word() const noexcept {
190-
uintptr_t lockptr = (uintptr_t)&lock_;
191-
lockptr &= ~(sizeof(uint32_t) - 1);
192-
return (detail::Futex<>*)lockptr;
193-
}
194-
195-
inline unsigned MicroLockCore::baseShift() const noexcept {
196-
unsigned offset_bytes = (unsigned)((uintptr_t)&lock_ - (uintptr_t)word());
197-
198-
return static_cast<unsigned>(
199-
kIsLittleEndian ? CHAR_BIT * offset_bytes
200-
: CHAR_BIT * (sizeof(uint32_t) - offset_bytes - 1));
201-
}
202-
203162
inline unsigned MicroLockCore::heldBit() const noexcept {
204-
return 1U << (baseShift() + 0);
163+
return 1U << 0;
205164
}
206165

207166
inline unsigned MicroLockCore::waitBit() const noexcept {
208-
return 1U << (baseShift() + 1);
167+
return 1U << 1;
209168
}
210169

211-
FOLLY_PUSH_WARNING
212-
FOLLY_GCC_DISABLE_WARNING("-Wattributes") // inline + [[gnu::noinline]]
213170
inline void MicroLockCore::store(
214171
uint8_t value, std::memory_order order) noexcept {
215-
detail::Futex<>* wordPtr = word();
216-
217-
const auto shiftToByte = baseShift();
218-
auto oldWord = wordPtr->load(std::memory_order_relaxed);
172+
auto oldWord = atomic_ref(lock_).load(std::memory_order_relaxed);
219173
while (true) {
220-
auto newWord = encodeDataToWord(oldWord, shiftToByte, value);
221-
if (wordPtr->compare_exchange_weak(
174+
auto newWord = encodeDataToWord(oldWord, value);
175+
if (atomic_ref(lock_).compare_exchange_weak(
222176
oldWord, newWord, order, std::memory_order_relaxed)) {
223177
break;
224178
}
225179
}
226180
}
227-
FOLLY_POP_WARNING
228181

229182
template <typename Func>
230183
void MicroLockCore::unlockAndStoreWithModifier(Func modifier) noexcept {
231-
detail::Futex<>* wordPtr = word();
232-
uint32_t oldWord;
233-
uint32_t newWord;
184+
uint8_t oldWord;
185+
uint8_t newWord;
234186

235-
oldWord = wordPtr->load(std::memory_order_relaxed);
187+
oldWord = atomic_ref(lock_).load(std::memory_order_relaxed);
236188
do {
237189
assert(oldWord & heldBit());
238190
newWord = modifier(oldWord) & ~(heldBit() | waitBit());
239-
} while (!wordPtr->compare_exchange_weak(
191+
} while (!atomic_ref(lock_).compare_exchange_weak(
240192
oldWord, newWord, std::memory_order_release, std::memory_order_relaxed));
241193

242194
if (oldWord & waitBit()) {
243-
detail::futexWake(wordPtr, 1, heldBit());
195+
atomic_notify_one(&atomic_ref(lock_).atomic());
244196
}
245197
}
246198

247199
inline void MicroLockCore::unlockAndStore(uint8_t value) noexcept {
248-
unlockAndStoreWithModifier(
249-
[value, shiftToByte = baseShift()](uint32_t oldWord) {
250-
return encodeDataToWord(oldWord, shiftToByte, value);
251-
});
200+
unlockAndStoreWithModifier([value](uint8_t oldWord) {
201+
return encodeDataToWord(oldWord, value);
202+
});
252203
}
253204

254205
inline void MicroLockCore::unlock() noexcept {
255-
unlockAndStoreWithModifier([](uint32_t oldWord) { return oldWord; });
206+
unlockAndStoreWithModifier(identity);
256207
}
257208

258209
template <unsigned MaxSpins = 1000, unsigned MaxYields = 0>
@@ -264,9 +215,9 @@ class MicroLockBase : public MicroLockCore {
264215
* data, in which case reading and locking should be done in one atomic
265216
* operation.
266217
*/
267-
FOLLY_DISABLE_SANITIZERS uint8_t lockAndLoad() noexcept;
218+
uint8_t lockAndLoad() noexcept;
268219
void lock() noexcept { lockAndLoad(); }
269-
FOLLY_DISABLE_SANITIZERS bool try_lock() noexcept;
220+
bool try_lock() noexcept;
270221

271222
/**
272223
* A lock guard which allows reading and writing to the unused bits of the
@@ -315,13 +266,12 @@ bool MicroLockBase<MaxSpins, MaxYields>::try_lock() noexcept {
315266
// else has the lock (by looking at heldBit) or see our CAS succeed.
316267
// A failed CAS by itself does not indicate lock-acquire failure.
317268

318-
detail::Futex<>* wordPtr = word();
319-
uint32_t oldWord = wordPtr->load(std::memory_order_relaxed);
269+
uint8_t oldWord = atomic_ref(lock_).load(std::memory_order_relaxed);
320270
do {
321271
if (oldWord & heldBit()) {
322272
return false;
323273
}
324-
} while (!wordPtr->compare_exchange_weak(
274+
} while (!atomic_ref(lock_).compare_exchange_weak(
325275
oldWord,
326276
oldWord | heldBit(),
327277
std::memory_order_acquire,
@@ -334,11 +284,10 @@ template <unsigned MaxSpins, unsigned MaxYields>
334284
uint8_t MicroLockBase<MaxSpins, MaxYields>::lockAndLoad() noexcept {
335285
static_assert(MaxSpins + MaxYields < (unsigned)-1, "overflow");
336286

337-
detail::Futex<>* wordPtr = word();
338-
uint32_t oldWord;
339-
oldWord = wordPtr->load(std::memory_order_relaxed);
287+
uint8_t oldWord;
288+
oldWord = atomic_ref(lock_).load(std::memory_order_relaxed);
340289
if ((oldWord & heldBit()) == 0 &&
341-
wordPtr->compare_exchange_weak(
290+
atomic_ref(lock_).compare_exchange_weak(
342291
oldWord,
343292
oldWord | heldBit(),
344293
std::memory_order_acquire,
@@ -350,7 +299,7 @@ uint8_t MicroLockBase<MaxSpins, MaxYields>::lockAndLoad() noexcept {
350299
// sure its shifting produces the same result a call to waitBit would.
351300
assert(heldBit() << 1 == waitBit());
352301
// lockSlowPath emits its own memory barrier
353-
return lockSlowPath(oldWord, wordPtr, baseShift(), MaxSpins, MaxYields);
302+
return lockSlowPath(oldWord, MaxSpins, MaxYields);
354303
}
355304
}
356305

0 commit comments

Comments
 (0)