MATH-1671: Update stat.descriptive to use Commons Statistics #260
Merged
Codecov Report

@@             Coverage Diff              @@
##             master     #260      +/-   ##
============================================
+ Coverage     86.54%   87.12%   +0.57%
+ Complexity     9787       89    -9698
============================================
  Files           532      504      -28
  Lines         35516    33488    -2028
  Branches       6194     5831     -363
============================================
- Hits          30738    29175    -1563
+ Misses         3518     3192     -326
+ Partials       1260     1121     -139
1684862 to 5fd1193
Removed percentile classes from descriptive.rank:
- CentralPivotingStrategy
- KthSelector
- Median
- MedianOf3PivotingStrategy
- Percentile
- PivotingStrategy
- RandomPivotingStrategy
Refactor the SummaryStatistics default implementations to use DoubleStatistics.

Removes methods:
- getPopulationVariance
- getSecondMoment

The population variance relies on the second moment implementation, which computes a statistic related to the central second moment. It is not a standard statistic and is not supported in Commons Statistics.

Updates the min/max implementations to use Math.min/max. Previous behaviour ignored NaN values. The change now matches JDK stream behaviour.
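The NaN change to min/max can be illustrated with a small standalone sketch. It does not use Commons Math or Commons Statistics; the class and method names are hypothetical, and returning NaN for an empty array is an assumption made for the example.

```java
final class NanMinMaxSketch {
    /** Illustrates the previous behaviour: NaN values are skipped. */
    static double minIgnoringNaN(double[] values) {
        double min = Double.NaN;
        for (double v : values) {
            if (!Double.isNaN(v) && (Double.isNaN(min) || v < min)) {
                min = v;
            }
        }
        return min;
    }

    /** Illustrates the new behaviour: Math.min propagates NaN. */
    static double minPropagatingNaN(double[] values) {
        if (values.length == 0) {
            return Double.NaN; // assumption for this sketch
        }
        double min = values[0];
        for (int i = 1; i < values.length; i++) {
            min = Math.min(min, values[i]);
        }
        return min;
    }

    public static void main(String[] args) {
        double[] data = {3.0, Double.NaN, 1.0};
        System.out.println(minIgnoringNaN(data));    // 1.0
        System.out.println(minPropagatingNaN(data)); // NaN
        // The JDK stream reduction also yields NaN when any element is NaN.
        System.out.println(java.util.stream.DoubleStream.of(data).min().getAsDouble()); // NaN
    }
}
```

Callers that relied on NaN being skipped would need to filter the input before computing the statistic.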
Refactor the DescriptiveStatistics default implementations to use DoubleStatistics.

Removes method:
- getPopulationVariance

The variance implementation can be overridden if desired.

Updates the min/max implementations to use Math.min/max. Previous behaviour ignored NaN values. The change now matches JDK stream behaviour.
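For code that depended on getPopulationVariance, one possible workaround is to rescale the sample variance by (n - 1) / n. The following is a minimal sketch in plain Java; the class and method names are hypothetical and it is not the replacement API of either library.

```java
final class PopulationVarianceSketch {
    /** Two-pass sample variance (divides by n - 1). */
    static double sampleVariance(double[] values) {
        int n = values.length;
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return 0.0;
        }
        double mean = 0;
        for (double v : values) {
            mean += v;
        }
        mean /= n;
        double sumSq = 0;
        for (double v : values) {
            double d = v - mean;
            sumSq += d * d;
        }
        return sumSq / (n - 1);
    }

    /** Converts a sample variance over n values to the population variance. */
    static double populationVariance(double sampleVariance, long n) {
        if (n <= 0) {
            return Double.NaN;
        }
        return sampleVariance * (n - 1) / n;
    }

    public static void main(String[] args) {
        double[] data = {2, 4, 4, 4, 5, 5, 7, 9};
        double var = sampleVariance(data);
        System.out.println(populationVariance(var, data.length));
    }
}
```

For the data above the sum of squared deviations is 32, so the population variance is 4 up to floating-point rounding.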
Removes redundant classes.

descriptive.moment:
- FirstMoment
- FourthMoment
- GeometricMean
- Kurtosis
- SecondMoment
- Skewness
- StandardDeviation
- ThirdMoment

Mean + Variance have been changed to only implement the weighted evaluation interface.

descriptive.rank:
- Min
- Max

descriptive.summary:
- Sum
- SumOfLogs
- SumOfSquares

Product has been changed to only implement the weighted evaluation interface.

The utility class StatUtils has been updated to delegate all calls to Commons Statistics. Legacy Math exceptions have been preserved.

Removes methods to compute the variance using an existing mean:

public static double variance(double[] values, double mean, int begin, int length)
public static double variance(double[] values, double mean)
public static double populationVariance(double[] values, double mean, int begin, int length)
public static double populationVariance(double[] values, double mean)

Note: StatUtils has inconsistent documentation of what to return for an empty array. The documentation states NaN but StatUtilsTest requires otherwise:

Sum-of-squares = 0
Product = 1
Sum-of-logs = 0

This is inconsistent and has been updated to NaN for all statistics.

The class MultivariateSummaryStatistics has been updated with partial implementations of StorelessUnivariateStatistic that delegate to Commons Statistics.

Some test classes have been updated to pass the build after removal of the statistic implementations.
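Callers of the removed variance(values, mean, begin, length) overloads could keep a local helper if they still need to reuse a precomputed mean. The sketch below is one way to write it in plain Java, using a corrected two-pass sum; the class name, the NaN result for an empty range, and the 0 result for a single value are assumptions of this example, not a statement of what the removed methods did.

```java
final class VarianceWithMeanSketch {
    /**
     * Sample variance of values[begin .. begin + length - 1] given a
     * precomputed mean of that range.
     */
    static double variance(double[] values, double mean, int begin, int length) {
        if (length == 0) {
            return Double.NaN; // assumption for this sketch
        }
        if (length == 1) {
            return 0.0;
        }
        double sumSq = 0;
        double sumDev = 0;
        for (int i = begin; i < begin + length; i++) {
            double d = values[i] - mean;
            sumSq += d * d;
            sumDev += d; // zero in exact arithmetic; corrects rounding in the mean
        }
        return (sumSq - sumDev * sumDev / length) / (length - 1);
    }

    /** Whole-array convenience overload. */
    static double variance(double[] values, double mean) {
        return variance(values, mean, 0, values.length);
    }

    public static void main(String[] args) {
        System.out.println(variance(new double[] {1, 2, 3, 4}, 2.5));
        System.out.println(variance(new double[] {10, 1, 2, 3}, 2.0, 1, 3));
    }
}
```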
Fix test coverage for refactored code.
Add WeightedSum implementation.
Update user guide.
Remove sum-of-logs, geometric mean and sum-of-squares from SummaryStatistics for performance reasons.
The SemiVariance had many untested methods and the implementation was buggy. This change corrects the implementation:
- the loop bound changes from (i=start; i<length; i++) to i<start+length
- the array sub-range method now uses its start and length arguments instead of 0 and values.length

Tests have been added for sub-range evaluation and to complete code coverage for the class.
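The loop-bound fix can be seen in isolation with a simplified downside sum of squares. This is an illustrative sketch only: the class and method names are hypothetical and the code is not the SemiVariance implementation.

```java
final class SubRangeLoopSketch {
    /** Buggy bound: treats length as an end index, so the range is cut short when start > 0. */
    static double buggyDownsideSumSq(double[] values, double cutoff, int start, int length) {
        double sum = 0;
        for (int i = start; i < length; i++) {
            if (values[i] < cutoff) {
                double d = values[i] - cutoff;
                sum += d * d;
            }
        }
        return sum;
    }

    /** Corrected bound: visits exactly length elements beginning at start. */
    static double fixedDownsideSumSq(double[] values, double cutoff, int start, int length) {
        double sum = 0;
        for (int i = start; i < start + length; i++) {
            if (values[i] < cutoff) {
                double d = values[i] - cutoff;
                sum += d * d;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] data = {1, 2, 3, 4, 5};
        // Sub-range {2, 3, 4} with cutoff 5: squared deviations 9 + 4 + 1.
        System.out.println(buggyDownsideSumSq(data, 5.0, 1, 3)); // 13.0 (misses index 3)
        System.out.println(fixedDownsideSumSq(data, 5.0, 1, 3)); // 14.0
    }
}
```

With start = 0 the two loops agree, which may be why the defect went unnoticed until sub-range tests were added.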
5fd1193 to bb2ec67