
Prior Sensitivity and Shrinkage in a Bayesian Hierarchical Count Model

This repository studies a narrow methodological question in hierarchical Bayes: when a historical prior points to the wrong population mean, what changes most after updating—predictive fit, group ranking, or the shrinkage target itself?

The analysis uses a fully synthetic grouped-count dataset with 50 groups observed over 10 periods. No course handout, assessment text, or supplied coursework data are included.

Model

mu ~ Gamma(mu_shape, mu_rate)
kappa ~ Gamma(kappa_shape, kappa_rate)
lambda_i | mu, kappa ~ Gamma(kappa * mu, kappa)
y_ij | lambda_i ~ Poisson(lambda_i)
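
The generative model above can be simulated directly. A minimal NumPy sketch of one prior-predictive draw, using hypothetical hyperparameter values (the repository's actual settings may differ):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical hyperparameters for illustration only.
mu_shape, mu_rate = 2.0, 2.0        # prior on the population mean mu
kappa_shape, kappa_rate = 2.0, 1.0  # prior on the pooling strength kappa
n_groups, n_periods = 50, 10

# NumPy's gamma takes (shape, scale), so rate b becomes scale 1/b.
mu = rng.gamma(mu_shape, 1.0 / mu_rate)
kappa = rng.gamma(kappa_shape, 1.0 / kappa_rate)

# Group rates: Gamma(kappa * mu, kappa) has mean mu regardless of kappa.
lam = rng.gamma(kappa * mu, 1.0 / kappa, size=n_groups)

# One simulated dataset of grouped counts.
y = rng.poisson(lam[:, None], size=(n_groups, n_periods))
```

Note that E[lambda_i] = (kappa * mu) / kappa = mu, which is what makes mu the shrinkage target and kappa a pure concentration parameter.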

This parameterization separates:

  • mu: population-level shrinkage target
  • kappa: pooling strength

With K observed periods for group i and sample mean ybar_i (here K = 10), the conjugate posterior mean for each group can be written as

E[lambda_i | mu, kappa, y_i]
= K / (kappa + K) * ybar_i + kappa / (kappa + K) * mu

so partial pooling is explicitly toward mu, not simply toward the empirical grand mean.
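
The conjugate update behind this formula is short enough to verify numerically. A sketch with illustrative numbers (the function and its arguments are named here for exposition, not taken from the repository):

```python
import numpy as np

def shrunk_mean(ybar_i, mu, kappa, n_obs):
    """Conditional posterior mean of lambda_i given (mu, kappa).

    A Gamma(kappa * mu, kappa) prior plus n_obs Poisson observations
    with sample mean ybar_i yields a
    Gamma(kappa * mu + n_obs * ybar_i, kappa + n_obs) posterior.
    """
    w = n_obs / (kappa + n_obs)       # weight on the group's own data
    return w * ybar_i + (1 - w) * mu  # remaining weight pulls toward mu

# A group observed for 10 periods with sample mean 2.0, shrunk toward mu = 1.0.
print(shrunk_mean(ybar_i=2.0, mu=1.0, kappa=5.0, n_obs=10))
```

As kappa grows, the weight on ybar_i falls and every group is pulled harder toward mu, which is why kappa is read as pooling strength.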

Prior scenarios

Three priors are compared:

  • historical_conflict: strong prior centered far above the observed data
  • historical_discounted: same prior target, weaker concentration
  • weakly_informative: prior target close to the observed scale

The observed dataset has:

  • overall mean = 0.976
  • zero proportion = 0.444
  • max count = 8
  • between-group SD = 0.765
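
These four summaries can be computed from any (n_groups, n_periods) count matrix. A sketch, assuming the between-group SD is the sample SD (ddof=1) of the per-group means (the repository may define it differently):

```python
import numpy as np

def summary_metrics(y):
    """The four dataset summaries used in the predictive checks.

    y: (n_groups, n_periods) array of counts.
    """
    return {
        "overall_mean": y.mean(),
        "zero_proportion": (y == 0).mean(),
        "max_count": int(y.max()),
        # Assumption: sample SD (ddof=1) across per-group means.
        "between_group_sd": y.mean(axis=1).std(ddof=1),
    }
```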

Main findings

  1. Prior predictive checks detect the mismatch immediately.
    Coverage of the four observed summary metrics under the 95% prior-predictive interval is:

    • historical_conflict: 0/4
    • historical_discounted: 2/4
    • weakly_informative: 4/4
  2. The strongest sensitivity is the posterior population mean.
    Posterior mu summaries are:

    • historical_conflict: 1.357 [1.079, 1.742]
    • historical_discounted: 1.003 [0.823, 1.219]
    • weakly_informative: 0.971 [0.796, 1.166]
  3. Group ranking is comparatively robust.
    The top five groups are unchanged across all three priors: G18, G16, G49, G50, G13.
    Between historical_conflict and weakly_informative, the Spearman rank correlation is 0.996, the top-10 overlap is 9/10, and the mean absolute rank difference is 0.880.

  4. Posterior predictive fit recovers even when the prior is wrong.
    Coverage of the four observed summary metrics under the 95% posterior-predictive interval is:

    • historical_conflict: 4/4
    • historical_discounted: 4/4
    • weakly_informative: 4/4
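
The "x/4" coverage counts in findings 1 and 4 reduce to one generic check: does each observed summary fall inside the central 95% interval of that summary across predictive replicates? A minimal sketch (function names are illustrative, not the repository's API):

```python
import numpy as np

def coverage(observed, replicated, level=0.95):
    """Count observed summaries inside the central predictive interval.

    observed:   dict of scalar summaries of the real data
    replicated: dict mapping the same keys to arrays of that summary
                computed on each prior- or posterior-predictive replicate
    """
    lo, hi = (1 - level) / 2, 1 - (1 - level) / 2
    hits = 0
    for key, obs in observed.items():
        q_lo, q_hi = np.quantile(replicated[key], [lo, hi])
        hits += q_lo <= obs <= q_hi  # bool adds as 0 or 1
    return hits
```

Running this on prior-predictive replicates gives the 0/4 vs 4/4 contrast in finding 1; on posterior-predictive replicates it gives the uniform 4/4 in finding 4.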

The practical takeaway: a mismatched historical prior can be easy to flag with prior-predictive checks, yet it can continue to shift the posterior shrinkage target after updating, even while posterior predictive fit and group ranking remain relatively stable.
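
The rank-stability metrics quoted in finding 3 (Spearman correlation, top-10 overlap, mean absolute rank difference) can be recomputed from two vectors of posterior group means. A self-contained sketch that computes Spearman's rho as the Pearson correlation of ranks, assuming no ties:

```python
import numpy as np

def rank_stability(post_mean_a, post_mean_b, top_k=10):
    """Rank-agreement metrics between posterior group means under two priors."""
    a, b = np.asarray(post_mean_a), np.asarray(post_mean_b)

    # Rank 1 = largest posterior mean (ties are not handled in this sketch).
    rank_a = np.empty(len(a), dtype=int)
    rank_a[np.argsort(-a)] = np.arange(1, len(a) + 1)
    rank_b = np.empty(len(b), dtype=int)
    rank_b[np.argsort(-b)] = np.arange(1, len(b) + 1)

    rho = np.corrcoef(rank_a, rank_b)[0, 1]  # Spearman = Pearson on ranks
    overlap = len(set(np.argsort(-a)[:top_k]) & set(np.argsort(-b)[:top_k]))
    mad = np.abs(rank_a - rank_b).mean()
    return rho, overlap, mad
```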

Selected outputs

Posterior shrinkage targets under the three priors:

[figure: Posterior population mean by prior]

Rank stability between the strongest conflicting prior and the weakly informative prior:

[figure: Rank stability]

Additional outputs are in results/figures/ and results/tables/, including full posterior group summaries and predictive-interval summaries.

Reproduce

pip install -r requirements.txt
python src/run_analysis.py

Repository layout

.
├── README.md
├── requirements.txt
├── data/
│   └── simulated_grouped_counts.csv
├── src/
│   └── run_analysis.py
├── results/
│   ├── figures/
│   └── tables/
└── docs/
    └── methods_note.md

About

Bayesian hierarchical count modeling case study on prior sensitivity, shrinkage targets, and predictive checks using synthetic grouped-count data.
