This repository studies a narrow methodological question in hierarchical Bayes: when a historical prior points to the wrong population mean, what changes most after updating—predictive fit, group ranking, or the shrinkage target itself?
The analysis uses a fully synthetic grouped-count dataset with 50 groups observed over 10 periods. No course handout, assessment text, or supplied coursework data are included.
mu ~ Gamma(mu_shape, mu_rate)
kappa ~ Gamma(kappa_shape, kappa_rate)
lambda_i | mu, kappa ~ Gamma(kappa * mu, kappa)
y_ij | lambda_i ~ Poisson(lambda_i)
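The generative model above can be simulated in a few lines. This is a minimal sketch with illustrative hyperparameter values, not the repository's actual prior settings; note that NumPy's gamma sampler takes (shape, scale), so the rate parameters are inverted.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative hyperparameters only; not the repository's configured priors.
mu_shape, mu_rate = 2.0, 2.0
kappa_shape, kappa_rate = 2.0, 1.0
n_groups, n_periods = 50, 10

mu = rng.gamma(mu_shape, 1.0 / mu_rate)                  # population shrinkage target
kappa = rng.gamma(kappa_shape, 1.0 / kappa_rate)         # pooling strength
# Group rates: Gamma(kappa * mu, rate=kappa), so E[lambda_i | mu, kappa] = mu
lam = rng.gamma(kappa * mu, 1.0 / kappa, size=n_groups)
# Counts: one Poisson draw per group per period
y = rng.poisson(lam[:, None], size=(n_groups, n_periods))
```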
This parameterization separates:
- mu: population-level shrinkage target
- kappa: pooling strength
The posterior mean for each group can be written as
E[lambda_i | mu, kappa, y_i]
= K / (kappa + K) * ybar_i + kappa / (kappa + K) * mu
where K is the number of observations per group (here, 10 periods) and ybar_i is the sample mean of group i, so partial pooling is explicitly toward mu, not simply toward the empirical grand mean.
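The weighted form follows from the conjugate Gamma-Poisson update: conditional on mu and kappa, the posterior for lambda_i is Gamma(kappa * mu + sum_j y_ij, kappa + K). A small numerical check, with illustrative values for mu, kappa, and the counts:

```python
import numpy as np

# Illustrative values; not taken from the repository's results.
mu, kappa = 1.0, 5.0                                # shrinkage target, pooling strength
y_i = np.array([0, 2, 1, 0, 3, 0, 1, 2, 0, 4])     # one group's counts, K = 10 periods
K = len(y_i)
ybar = y_i.mean()                                   # 1.3

# Conjugate update: Gamma(kappa * mu + sum(y), kappa + K); take its mean
post_mean = (kappa * mu + y_i.sum()) / (kappa + K)

# Equivalent weighted form: pooling toward mu, not the grand mean
w = K / (kappa + K)
shrunk = w * ybar + (1 - w) * mu

assert np.isclose(post_mean, shrunk)                # both 1.2: shrunk from 1.3 toward 1.0
```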
Three priors are compared:
- historical_conflict: strong prior centered far above the observed data
- historical_discounted: same prior target, weaker concentration
- weakly_informative: prior target close to the observed scale
The observed dataset has:
- overall mean = 0.976
- zero proportion = 0.444
- max count = 8
- between-group SD = 0.765
Prior predictive checks detect the mismatch immediately.
Coverage of the four observed summary metrics under the 95% prior-predictive interval is:
- historical_conflict: 0/4
- historical_discounted: 2/4
- weakly_informative: 4/4
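A prior-predictive coverage check like this can be sketched by simulating full datasets from the prior and testing whether each observed summary falls inside its 95% interval. The hyperparameters and function name below are hypothetical, not the repository's actual settings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Observed summaries from the dataset description above
OBSERVED = {"mean": 0.976, "zero_prop": 0.444, "max": 8, "between_sd": 0.765}

def prior_predictive_coverage(mu_shape, mu_rate, kappa_shape, kappa_rate,
                              n_groups=50, n_periods=10, n_sims=1000):
    """Count how many observed summaries (out of 4) fall inside their
    95% prior-predictive intervals. Hyperparameters are whatever prior
    is being checked; values passed in are illustrative."""
    sims = {k: [] for k in OBSERVED}
    for _ in range(n_sims):
        mu = rng.gamma(mu_shape, 1.0 / mu_rate)
        kappa = rng.gamma(kappa_shape, 1.0 / kappa_rate)
        lam = rng.gamma(kappa * mu, 1.0 / kappa, size=n_groups)
        y = rng.poisson(lam[:, None], size=(n_groups, n_periods))
        sims["mean"].append(y.mean())
        sims["zero_prop"].append((y == 0).mean())
        sims["max"].append(y.max())
        sims["between_sd"].append(y.mean(axis=1).std(ddof=1))
    covered = 0
    for k, obs in OBSERVED.items():
        lo, hi = np.percentile(sims[k], [2.5, 97.5])
        covered += int(lo <= obs <= hi)
    return covered
```

A prior whose predictive intervals miss all four summaries (coverage 0/4), as with the strong conflicting prior, is flagged before any fitting takes place.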
The strongest sensitivity is the posterior population mean.
Posterior mu summaries (posterior mean [95% interval]) are:
- historical_conflict: 1.357 [1.079, 1.742]
- historical_discounted: 1.003 [0.823, 1.219]
- weakly_informative: 0.971 [0.796, 1.166]
Group ranking is comparatively robust.
The top five groups are unchanged across all three priors: G18, G16, G49, G50, G13.
Between historical_conflict and weakly_informative, the Spearman rank correlation is 0.996, the top-10 overlap is 9/10, and the mean absolute rank difference is 0.880.
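The three stability metrics can be computed from two vectors of posterior group means. This is a sketch (the function name is illustrative); Spearman's rho is Pearson correlation applied to the ranks, and ties are not averaged here:

```python
import numpy as np

def rank_stability(est_a, est_b, top_k=10):
    """Compare the group rankings implied by two posterior-mean vectors.
    Returns (spearman_rho, top_k_overlap, mean_abs_rank_diff).
    Ties are broken by position rather than averaged."""
    est_a, est_b = np.asarray(est_a, float), np.asarray(est_b, float)
    # Rank 1 = largest estimate
    ranks_a = np.empty(len(est_a), dtype=int)
    ranks_a[np.argsort(-est_a)] = np.arange(1, len(est_a) + 1)
    ranks_b = np.empty(len(est_b), dtype=int)
    ranks_b[np.argsort(-est_b)] = np.arange(1, len(est_b) + 1)

    rho = np.corrcoef(ranks_a, ranks_b)[0, 1]       # Spearman = Pearson on ranks
    overlap = len(set(np.argsort(-est_a)[:top_k]) &
                  set(np.argsort(-est_b)[:top_k]))  # shared top-k groups
    mad = np.abs(ranks_a - ranks_b).mean()          # mean absolute rank shift
    return rho, overlap, mad
```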
Posterior predictive fit recovers even when the prior is wrong.
Coverage of the four observed summary metrics under the 95% posterior-predictive interval is:
- historical_conflict: 4/4
- historical_discounted: 4/4
- weakly_informative: 4/4
The practical takeaway is precise: a mismatched historical prior can be easy to flag with prior-predictive checks and can continue to shift the posterior shrinkage target after updating, even when posterior predictive fit and group ranking remain relatively stable.
Figures in results/figures/ plot the posterior shrinkage targets under the three priors and the rank stability between the strongest conflicting prior and the weakly informative prior.
Additional outputs are in results/figures/ and results/tables/, including full posterior group summaries and predictive-interval summaries.
pip install -r requirements.txt
python src/run_analysis.py
├── README.md
├── requirements.txt
├── data/
│ └── simulated_grouped_counts.csv
├── src/
│ └── run_analysis.py
├── results/
│ ├── figures/
│ └── tables/
└── docs/
└── methods_note.md

