-
Notifications
You must be signed in to change notification settings - Fork 15.3k
Description
The current implementation of {std, ranges}::copy does not correctly handle vector<bool> when the underlying storage type (__storage_type) is smaller than int, e.g., unsigned char, unsigned short, uint8_t and uint16_t.
Reproducer
A minimal reproducer is as follows (Godbolt Link):
#include <limits>
#include <memory>
#include <vector>
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iostream>
template <typename T, typename Size, typename Difference>
class sized_allocator {
template <typename U, typename Sz, typename Diff>
friend class sized_allocator;
public:
using value_type = T;
using size_type = Size;
using difference_type = Difference;
using propagate_on_container_swap = std::true_type;
constexpr explicit sized_allocator(int d = 0) : data_(d) {}
template <typename U, typename Sz, typename Diff>
constexpr sized_allocator(const sized_allocator<U, Sz, Diff>& a) noexcept : data_(a.data_) {}
constexpr T* allocate(size_type n) {
if (n > max_size())
throw std::bad_array_new_length();
return std::allocator<T>().allocate(n);
}
constexpr void deallocate(T* p, size_type n) noexcept { std::allocator<T>().deallocate(p, n); }
constexpr size_type max_size() const noexcept {
return std::numeric_limits<size_type>::max() / sizeof(value_type);
}
private:
int data_;
constexpr friend bool operator==(const sized_allocator& a, const sized_allocator& b) {
return a.data_ == b.data_;
}
constexpr friend bool operator!=(const sized_allocator& a, const sized_allocator& b) {
return a.data_ != b.data_;
}
};
int main() {
using Alloc = sized_allocator<bool, std::uint8_t, std::int8_t>;
std::vector<bool, Alloc> in(8, false, Alloc(1));
std::vector<bool, Alloc> out(8, true, Alloc(1));
std::copy(in.begin(), in.begin() + 1, out.begin()); // out[0] = false
// We only assigned a single bit, but entire byte got zeroed!
for (std::size_t i = 0; i < out.size(); ++i)
std::cout << out[i];
std::cout << '\n';
}
In this example, a single bit from vector<bool> is copied and set to false using std::copy. However, due to incorrect behavior in std::copy, all bits in the whole storage word are zeroed.
Root Cause Analysis
The root cause in this specific example lies in the following problematic bit mask computation in std::copy:
| __storage_type __m = (~__storage_type(0) << __first.__ctz_) & (~__storage_type(0) >> (__clz - __dn)); |
The evaluation of ~__storage_type(0) is first integral-promoted to -1, which is subsequently left shifted and right-shifted. However, left-shifting a negative value leads to UB before C++20 [1], and right-shifting a negative value results in sign-bit extension since C++20 [2]. Due to sign-bit extension, all bits in the mask are set to 1, making the mask incorrect and leading to erroneous copying behavior.
It appears that Clang exhibits consistent incorrect behavior in std::copy even before C++20. I believe this is because Clang already performed sign-bit extension in right-shift operations even before C++20.
There are several other bitwise operations in std::copy that are similarly flawed and need to be fixed.
[1] [expr.shift]/3
Right-shift on signed integral types is an arithmetic right shift, which performs sign-extension.
[2] cppreference
For negative a, the behavior of a << b is undefined. (until C++20)