-
Notifications
You must be signed in to change notification settings - Fork 0
Description
Summary
RandomSelectorImpl.pick() uses Kotlin’s List.binarySearch with a comparison lambda that returns -1 for values < randomValue and 0 for values ≥ randomValue. This violates the comparator contract (it never returns a positive value for >), so the binary search is free to return any index where the lambda yields 0. In practice, it often returns a near-middle index, causing severe bias (e.g., “almost always the middle item”).
Affected code
// dev.slne.surf.surfapi.core.api.random.RandomSelectorImpl
override fun pick(randomGenerator: RandomGenerator): E {
val totalWeight = cumulativeWeights.getDouble(cumulativeWeights.size - 1)
val randomValue = randomGenerator.nextDouble(totalWeight)
// BUG: comparator never returns > 0 for elements greater than randomValue
val index = cumulativeWeights.binarySearch { if (it < randomValue) -1 else 0 }
.let { if (it < 0) -(it + 1) else it }
return elements[index]
}Why this is a bug
Kotlin’s List<T>.binarySearch(comparison: (T) -> Int) expects the lambda to behave like a comparator relative to a fixed key: return negative for elements less than the key, zero for elements equal to the key, and positive for elements greater than the key. Our lambda returns 0 for both equal *and greater elements, which makes the ordering non-strict and unspecified. The implementation may therefore return an arbitrary “match” index among all elements ≥ randomValue—commonly near the middle—leading to biased selection.
This explains field behavior like “almost always gravel” in certain phases even though weights suggest otherwise.
Steps to reproduce
- Build a cumulative weight list, e.g.
[50, 70, 100](weights 50/20/30). - Call
pick()thousands of times. - Observe that the second element is selected disproportionately often (bias towards a middle index), instead of ~50%, ~20%, ~30%.
(We validated this by swapping in a simple lower-bound implementation: the bias disappears and distributions match expectations.)
Expected behavior
Elements should be selected with probability proportional to their weights (i.e., index is the first cumulative ≥ randomValue—a proper lower bound).
Fix
Use a correct lower-bound search. Two safe options:
Option A: Manual lower-bound (recommended)
override fun pick(randomGenerator: RandomGenerator): E {
val totalWeight = cumulativeWeights.getDouble(cumulativeWeights.size - 1)
val r = randomGenerator.nextDouble(totalWeight)
// lower_bound: first index with cumulativeWeights[idx] >= r
var lo = 0
var hi = cumulativeWeights.size - 1
while (lo < hi) {
val mid = (lo + hi) ushr 1
if (r <= cumulativeWeights.getDouble(mid)) {
hi = mid
} else {
lo = mid + 1
}
}
return elements[lo]
}Option B: Use proper comparator with binarySearch
If you keep the Kotlin helper, make sure the comparator distinguishes >:
val i = cumulativeWeights.binarySearch { it.compareTo(r) } // negative, zero, positive
val index = if (i < 0) -(i + 1) else i
return elements[index]Note:
cumulativeWeightsis a FastUtilDoubleList. Ensure you’re calling an overload that actually uses the provided comparator; otherwise prefer Option A.
Additional notes
fromWeightedIterableandfromIterablecorrectly build cumulative sums; the bug is strictly in how the index is found.nextDouble(totalWeight)generatesr ∈ [0, totalWeight), so there is always an index withcumulative ≥ r—no extra clamping needed.- The
FlowRandomSelectorImplis unrelated to this bug.
Suggested tests
-
Deterministic distribution test (seeded RNG):
- Weights:
[50.0, 20.0, 30.0] - Run 100k picks; assert frequencies within ±2% of expected.
- Weights:
-
Edge cases:
- Single element (always index 0).
- Highly skewed weights (e.g.,
[1e-9, 1e9]) to ensure extremes behave. - Many elements (e.g., 10k items) to validate no off-by-one at boundaries.
Environment
- Kotlin stdlib: binary search behavior per documentation (no guarantee which equal element is returned).
- FastUtil
DoubleListfor cumulative weights.
Proposed change (diff)
diff --git a/core/api/random/RandomSelectorImpl.kt b/core/api/random/RandomSelectorImpl.kt
@@
override fun pick(randomGenerator: RandomGenerator): E {
val totalWeight = cumulativeWeights.getDouble(cumulativeWeights.size - 1)
- val randomValue = randomGenerator.nextDouble(totalWeight)
-
- val index = cumulativeWeights.binarySearch { if (it < randomValue) -1 else 0 }
- .let { if (it < 0) -(it + 1) else it }
-
- return elements[index]
+ val r = randomGenerator.nextDouble(totalWeight)
+
+ // lower_bound search for first cumulative >= r
+ var lo = 0
+ var hi = cumulativeWeights.size - 1
+ while (lo < hi) {
+ val mid = (lo + hi) ushr 1
+ if (r <= cumulativeWeights.getDouble(mid)) {
+ hi = mid
+ } else {
+ lo = mid + 1
+ }
+ }
+ return elements[lo]
}