Expected behavior and actual behavior:
Re: Median definition.
See code examples below.
The Agg.percentileBy method assumes that rounding the index is enough. It is not enough when (size * percentile) still has an effectively non-zero fractional part and the value at the floor index differs from the next value.
I appreciate that not all Comparable types can be added and divided, or at least not easily. I therefore suggest adding a modified percentile method and a modified median method that accept a type-specific function whose parameters are the value at the floor index, the value at the next index, and the fractional part of the index, so that a more correct value can be returned. The function would only need to be called when the index has an effectively non-zero fractional part and the two indexed values differ. Any such new functionality should also be exposed everywhere the existing median and percentile* methods are exposed, e.g. Window.
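To make the suggestion concrete, here is a rough standalone sketch of what I have in mind. It does not use or modify jOOL. The names InterpolatedPercentile, Interpolator, LINEAR and EPSILON are hypothetical, and the fractional position is assumed to be percentile * (size - 1) on a 0-based index, which is one common convention and not necessarily the one Agg.percentileBy uses internally.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;

// Hypothetical helper, not part of jOOL: a sketch of the proposed behavior.
final class InterpolatedPercentile {

    // Tolerance below which the fractional part is treated as zero (assumed value).
    static final double EPSILON = 1e-9;

    /** Hypothetical type-specific callback combining two neighbouring values. */
    @FunctionalInterface
    interface Interpolator<T> {
        T interpolate(T lower, T upper, double fraction);
    }

    /** Linear interpolation for Double, as one possible type-specific function. */
    static final Interpolator<Double> LINEAR =
            (lower, upper, fraction) -> lower + (upper - lower) * fraction;

    static <T extends Comparable<? super T>> Optional<T> percentile(
            List<T> values, double percentile, Interpolator<T> interpolator) {
        if (percentile < 0.0 || percentile > 1.0)
            throw new IllegalArgumentException("percentile must be in [0, 1]");
        if (values.isEmpty())
            return Optional.empty();

        // Sort a copy so the caller's list is left untouched.
        List<T> sorted = new ArrayList<>(values);
        Collections.sort(sorted);

        // Assumption: fractional position on the 0-based index scale.
        double position = percentile * (sorted.size() - 1);
        int lowerIndex = (int) Math.floor(position);
        double fraction = position - lowerIndex;

        T lower = sorted.get(lowerIndex);
        // No interpolation needed when the position is effectively an exact index.
        if (fraction < EPSILON || lowerIndex + 1 >= sorted.size())
            return Optional.of(lower);

        T upper = sorted.get(lowerIndex + 1);
        // No interpolation needed when both neighbours compare as equal.
        if (lower.compareTo(upper) == 0)
            return Optional.of(lower);

        return Optional.of(interpolator.interpolate(lower, upper, fraction));
    }

    static <T extends Comparable<? super T>> Optional<T> median(
            List<T> values, Interpolator<T> interpolator) {
        return percentile(values, 0.5, interpolator);
    }
}
```

With this sketch, InterpolatedPercentile.median(list, InterpolatedPercentile.LINEAR) would give 4.0 for the seven values 1d to 7d and 4.5 for the eight values 1d to 8d. Types that cannot be averaged could pass a function that simply picks the lower or the upper value.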
Steps to reproduce the problem:
e.g. for median
Seq.of(1d,2d,3d,4d,5d,6d,7d).median().ifPresent(System.out::println);
prints 4.0, which is correct, because the exact middle value is 4.0.
Seq.of(1d,2d,3d,4d,5d,6d,7d,8d).median().ifPresent(System.out::println);
prints 4.0, which appears to be incorrect: the middle values are 4 and 5, so it should print (4 + 5) / 2 = 4.5.
Versions: