Skip to content

Commit 577ff77

Browse files
committed
Use correct numeric types for the ArrayHistogram calculations
`float` only has 23-bit long mantissa, so the maximum integer that can be safely represented is 8 388 608, which is too low for correct calculation of MFU thresholds (we can easily have more than 8 million items in vBuckets). `std::aggregate` uses the type of the last parameter, which was `int` in this case. Not sufficient to hold the sum of all bucket counts given a generic `CountType`. Change-Id: I93f9af90ecc420af5355c6f82eda9a8c837e5d90 Reviewed-on: https://review.couchbase.org/c/kv_engine/+/190045 Tested-by: Vesko Karaganev <[email protected]> Reviewed-by: Dave Rigby <[email protected]>
1 parent 9760ef8 commit 577ff77

File tree

2 files changed

+15
-3
lines changed

2 files changed

+15
-3
lines changed

engines/ep/src/array_histogram.cc

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -111,7 +111,7 @@ size_t ArrayHistogram<CountType, ArraySize, StoredType>::getValueAtPercentile(
111111
// With the bias:
112112
// (50.0 / 100.0) * 3 + 0.5
113113
// would give 2.0 samples, and would find the 50th percentile to be 20.
114-
auto threshold = CountType(((percentile / 100.0f) * totalCount) + 0.5f);
114+
auto threshold = CountType(((percentile / 100.0) * totalCount) + 0.5);
115115

116116
if (!threshold) {
117117
// the histogram is not empty, but the requested percentile
@@ -139,7 +139,7 @@ size_t ArrayHistogram<CountType, ArraySize, StoredType>::getValueAtPercentile(
139139
template <class CountType, size_t ArraySize, class StoredType>
140140
CountType ArrayHistogram<CountType, ArraySize, StoredType>::getNumberOfSamples()
141141
const {
142-
return std::accumulate(valueArray.begin(), valueArray.end(), 0);
142+
return std::accumulate(valueArray.begin(), valueArray.end(), CountType(0));
143143
}
144144

145145
template <class CountType, size_t ArraySize, class StoredType>

engines/ep/tests/module_tests/array_histogram_test.cc

Lines changed: 13 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -132,6 +132,18 @@ TEST(ArrayHistogramTest, Percentile) {
132132
SCOPED_TRACE("50%ile and 100%ile bumped to 120");
133133
expect(100, 120, 120);
134134
}
135+
136+
// Test with a large number of samples. Previously, we used floats for the
137+
// threshold calculation, and because of the 24-bit mantissa in floats,
138+
// these values would be off. We also used to use 32-bit ints for the total
139+
// number of samples. Compare implementations above those limits.
140+
hdr.add(110, std::numeric_limits<int32_t>::max());
141+
hist.add(110, std::numeric_limits<int32_t>::max());
142+
143+
{
144+
SCOPED_TRACE("50%ile bumped to 110");
145+
expect(100, 110, 120);
146+
}
135147
}
136148

137149
TEST(ArrayHistogramTest, ThreadSafety) {
@@ -160,4 +172,4 @@ TEST(ArrayHistogramTest, ThreadSafety) {
160172

161173
thread1.join();
162174
thread2.join();
163-
}
175+
}

0 commit comments

Comments
 (0)