Dev/new stats #662

Open
py-cyber wants to merge 49 commits into Desbordante:main from py-cyber:dev/new-stats

py-cyber (Contributor) commented Jan 2, 2026

Summary

This PR adds six new statistics to the DataStats class. They help users characterize the spread, ordering, distribution shape, and category mix of their data.

New Statistics

  1. Interquartile Range: the spread of the middle 50% of numeric data (third quartile minus first quartile)
  2. Coefficient of Variation: the standard deviation divided by the mean, i.e. how much the data varies relative to its average
  3. Monotonicity: tells whether the data always goes up, always goes down, always stays the same, or none of these
  4. Jarque-Bera Statistic: a test statistic, built from sample skewness and kurtosis, for whether numeric data is consistent with a normal distribution
  5. Entropy: measures how mixed categorical data is
  6. Gini Coefficient: measures inequality in categorical data
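
For reviewers who want to sanity-check expected values, the six statistics can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the code in this PR: the function names and the monotonicity labels are hypothetical, quartiles use linear interpolation, the standard deviation and moments are population (biased) versions, monotonicity is treated as non-strict, entropy uses base 2, and "Gini" is taken to be the Gini impurity of the category frequencies. The actual DataStats implementation may make different choices on each of these points.

```python
import math
from collections import Counter
from statistics import mean, pstdev, quantiles


def interquartile_range(xs):
    # Q3 - Q1, with quartiles computed by linear interpolation (assumption).
    q1, _, q3 = quantiles(xs, n=4, method="inclusive")
    return q3 - q1


def coefficient_of_variation(xs):
    # Population standard deviation divided by the mean (undefined if mean is 0).
    return pstdev(xs) / mean(xs)


def monotonicity(xs):
    # Non-strict comparison of neighbouring values; labels are hypothetical.
    pairs = list(zip(xs, xs[1:]))
    if all(a == b for a, b in pairs):
        return "constant"
    if all(a <= b for a, b in pairs):
        return "ascending"
    if all(a >= b for a, b in pairs):
        return "descending"
    return "none"


def jarque_bera(xs):
    # JB = n/6 * (S^2 + (K - 3)^2 / 4), using biased central moments.
    n = len(xs)
    mu = mean(xs)
    m2 = sum((x - mu) ** 2 for x in xs) / n
    m3 = sum((x - mu) ** 3 for x in xs) / n
    m4 = sum((x - mu) ** 4 for x in xs) / n
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return n / 6 * (skewness ** 2 + (kurtosis - 3) ** 2 / 4)


def _proportions(values):
    # Relative frequency of each distinct category.
    counts = Counter(values)
    total = sum(counts.values())
    return [c / total for c in counts.values()]


def entropy(values):
    # Shannon entropy in bits (base 2 is an assumption).
    return -sum(p * math.log2(p) for p in _proportions(values))


def gini_impurity(values):
    # 1 - sum(p_i^2); one common reading of "Gini" for categorical data.
    return 1 - sum(p * p for p in _proportions(values))
```

Under these assumptions, a column with two equally frequent categories gives the maximum entropy for two classes, while a column with a single repeated category gives zero for both entropy and Gini impurity.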

Changes

  1. Core

    Location: src/core/algorithms/statistics/
    Change: extended the DataStats class with six new statistical methods and extended the ColumnStats data structure to store the new statistics.

  2. Python Bindings

    Location: src/python_bindings/statistics/
    Change: added pybind methods for all new statistics.

  3. Tests

    Location: src/tests/unit/
    Change: added Google Test unit tests for the new statistical methods.

    Location: src/python_bindings/
    Change: added Python integration tests for the new statistical methods.

  4. Examples

    Location: examples/basic/
    Change: updated the demonstration example to show usage of all new statistics.

3 participants