Performance-based justification for Bayesian inference

We prove that the predictive posterior distribution maximizes the log-likelihood of future observations averaged over the data-generating distribution:
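
In one possible notation (the symbols are assumptions chosen here for illustration), let $\theta$ denote the model parameters with prior $p(\theta)$, $y$ the observed data, and $\tilde{y}$ a future observation, with $y$ and $\tilde{y}$ drawn independently from $p(\cdot \mid \theta)$. The predictive posterior and the claim can then be sketched as:

```latex
p(\tilde{y} \mid y) = \sum_{\theta} p(\tilde{y} \mid \theta)\, p(\theta \mid y),
\qquad
p(\tilde{y} \mid y) = \arg\max_{q}
  \sum_{\theta} p(\theta) \sum_{y} p(y \mid \theta) \sum_{\tilde{y}} p(\tilde{y} \mid \theta)
  \log q(\tilde{y} \mid y)
```

Here $q(\tilde{y} \mid y)$ ranges over all conditional distributions for the future observation given the data.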

The essence of this proof is to show that the predictive posterior distribution is at least as good as any other reference distribution in terms of the expected log-likelihood:
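
Written out under assumed notation ($\theta$ for parameters, $y$ for observed data, $\tilde{y}$ for a future observation, $p(\tilde{y} \mid y)$ for the predictive posterior, and $q(\tilde{y} \mid y)$ for an arbitrary reference distribution), the inequality would read:

```latex
\sum_{\theta} p(\theta) \sum_{y} p(y \mid \theta) \sum_{\tilde{y}} p(\tilde{y} \mid \theta)
  \log p(\tilde{y} \mid y)
\;\ge\;
\sum_{\theta} p(\theta) \sum_{y} p(y \mid \theta) \sum_{\tilde{y}} p(\tilde{y} \mid \theta)
  \log q(\tilde{y} \mid y)
```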

or equivalently that:
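
Moving both sides of the comparison into a single sum gives a log-ratio form. With $\theta$ the parameters, $y$ the observed data, $\tilde{y}$ a future observation, $p(\tilde{y} \mid y)$ the predictive posterior, and $q(\tilde{y} \mid y)$ any reference distribution (notation assumed here), it can be sketched as:

```latex
\sum_{\theta} \sum_{y} \sum_{\tilde{y}}
  p(\theta)\, p(y \mid \theta)\, p(\tilde{y} \mid \theta)
  \log \frac{p(\tilde{y} \mid y)}{q(\tilde{y} \mid y)}
\;\ge\; 0
```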

Proving this claim is straightforward [1]:
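
A sketch of the argument, in assumed notation ($\theta$ parameters, $y$ observed data, $\tilde{y}$ a future observation, $p(\tilde{y} \mid y) = \sum_{\theta} p(\tilde{y} \mid \theta)\, p(\theta \mid y)$ the predictive posterior, $q(\tilde{y} \mid y)$ any reference distribution), rewrites the averaged log-ratio as an expected Kullback-Leibler divergence:

```latex
\begin{aligned}
\sum_{\theta} \sum_{y} \sum_{\tilde{y}}
  p(\theta)\, p(y \mid \theta)\, p(\tilde{y} \mid \theta)
  \log \frac{p(\tilde{y} \mid y)}{q(\tilde{y} \mid y)}
&= \sum_{y} p(y) \sum_{\tilde{y}}
   \Big[ \sum_{\theta} p(\theta \mid y)\, p(\tilde{y} \mid \theta) \Big]
   \log \frac{p(\tilde{y} \mid y)}{q(\tilde{y} \mid y)} \\
&= \sum_{y} p(y) \sum_{\tilde{y}} p(\tilde{y} \mid y)
   \log \frac{p(\tilde{y} \mid y)}{q(\tilde{y} \mid y)} \\
&= \sum_{y} p(y)\, \mathrm{KL}\big( p(\cdot \mid y) \,\|\, q(\cdot \mid y) \big)
\;\ge\; 0
\end{aligned}
```

The first step uses Bayes' rule in the form $p(\theta)\, p(y \mid \theta) = p(y)\, p(\theta \mid y)$, the second recognizes the bracketed sum as the predictive posterior, and the final inequality is the non-negativity of the Kullback-Leibler divergence, with equality only when $q(\cdot \mid y) = p(\cdot \mid y)$ for every $y$ with $p(y) > 0$.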

Note that while we used sums in our proof, which assumes that relevant quantities take discrete values, the same ideas can be readily applied to continuous-valued quantities by replacing sums with integrals.
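
The discrete case can also be checked numerically. The following is a minimal sketch, not part of the original proof: it builds a small random discrete model (the sizes, the seed, and all function names are assumptions made for illustration) and compares the averaged log-likelihood of the predictive posterior with that of randomly drawn reference distributions.

```python
import math
import random


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights)
    return [w / total for w in weights]


def random_dist(rng, size):
    """Draw a strictly positive random probability vector."""
    return normalize([rng.random() + 0.05 for _ in range(size)])


def posterior_predictive(prior, lik_y, lik_new):
    """Return p(y_new | y) as a table indexed [y][y_new]."""
    n_theta, n_y, n_new = len(prior), len(lik_y[0]), len(lik_new[0])
    table = []
    for y in range(n_y):
        # Posterior p(theta | y) is proportional to p(theta) * p(y | theta).
        posterior = normalize([prior[t] * lik_y[t][y] for t in range(n_theta)])
        table.append([
            sum(posterior[t] * lik_new[t][k] for t in range(n_theta))
            for k in range(n_new)
        ])
    return table


def expected_log_score(prior, lik_y, lik_new, q):
    """Average log q(y_new | y) over theta, y, and y_new."""
    total = 0.0
    for t, p_t in enumerate(prior):
        for y, p_y in enumerate(lik_y[t]):
            for k, p_k in enumerate(lik_new[t]):
                total += p_t * p_y * p_k * math.log(q[y][k])
    return total


rng = random.Random(0)
n_theta, n_y, n_new = 4, 5, 3  # hypothetical sizes of the discrete spaces
prior = random_dist(rng, n_theta)                            # p(theta)
lik_y = [random_dist(rng, n_y) for _ in range(n_theta)]      # p(y | theta)
lik_new = [random_dist(rng, n_new) for _ in range(n_theta)]  # p(y_new | theta)

predictive = posterior_predictive(prior, lik_y, lik_new)
best = expected_log_score(prior, lik_y, lik_new, predictive)

# Scores of 200 arbitrary reference distributions q(y_new | y).
others = [
    expected_log_score(
        prior, lik_y, lik_new,
        [random_dist(rng, n_new) for _ in range(n_y)],
    )
    for _ in range(200)
]

print("predictive posterior:", best)
print("best random reference:", max(others))
```

In this sketch the score of the predictive posterior should never fall below that of any sampled reference distribution, which is what the inequality predicts. A random search of this kind illustrates the result but does not replace the proof.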

References:

[1] Aitchison, J. (1975). Goodness of prediction fit. Biometrika, 62(3), 547-554.