Skip to content

Handle common_norm for density stat in Hist#3911

Open
Mr-Neutr0n wants to merge 1 commit intomwaskom:masterfrom
Mr-Neutr0n:fix-hist-common-norm-density
Open

Handle common_norm for density stat in Hist#3911
Mr-Neutr0n wants to merge 1 commit intomwaskom:masterfrom
Mr-Neutr0n:fix-hist-common-norm-density

Conversation

@Mr-Neutr0n
Copy link

Summary

so.Hist(stat='density', common_norm=True) was silently ignoring common_norm — each group's density was normalized independently instead of across all groups.

The root cause: _eval() passed density=True to np.histogram, which baked in per-group normalization before _normalize() (and its common_norm grouping logic) ever ran. For every other stat (percent, proportion, frequency), normalization happens inside _normalize() where common_norm is properly respected — but density was the exception.

Changes

  • _eval(): Always pass density=False to np.histogram so raw counts are preserved for all stats.
  • _normalize(): Add density normalization (count / (total_count * bin_width)) alongside the existing percent/proportion/frequency cases. This way common_norm controls the scope of total_count for density just like it does for the other stats.

Test plan

  • Added test_common_norm_density_default — verifies total area sums to 1 across all groups when common_norm=True
  • Added test_common_norm_density_false — verifies each group individually sums to 1 when common_norm=False
  • Added test_common_norm_density_subset — verifies normalization within subgroups
  • All 32 tests in test_counting.py pass (29 existing + 3 new)

Closes #3633

Previously, Hist(stat='density') passed density=True directly to
np.histogram, which normalized each group independently before
common_norm could take effect. This meant common_norm=True was
silently ignored for density normalization.

Move density normalization into _normalize() so it respects the same
common_norm grouping logic used by other stats like percent and
proportion.

Closes mwaskom#3633
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

so.Hist ignores common_norm=True for the "density" aggregate statistic

1 participant