06-the-learnable-universe/module-1-statistical-inference/01-bayesian-statistics-inference/03-mod5-part3-MCMC.md
:class: tip
**Before we derive the acceptance probability, take a moment to think:**

Given that we want detailed balance $\pi(\theta)T(\theta'\mid\theta) = \pi(\theta')T(\theta\mid\theta')$, and we've decided $T = Q \times \alpha$, what constraints must $\alpha$ satisfy?

Write down your answer before reading on:

- Should $\alpha$ depend on both $\theta$ and $\theta'$, or just one?
- If $\theta'$ has higher posterior probability than $\theta$, should we always accept?
- If $\theta'$ has lower posterior probability, should we always reject?
- What role does the proposal distribution $Q$ play?

Think about it for 30 seconds, then continue. The derivation will be more meaningful if you've wrestled with the problem first.
:::
$$
r = \frac{\pi(\theta')\, Q(\theta \mid \theta')}{\pi(\theta)\, Q(\theta' \mid \theta)}
$$

**When $\pi$ is the posterior:** write

$$\pi(\theta) \propto p(D\mid\theta)\,p(\theta).$$

Then

$$
r \;=\; \frac{\pi(\theta')\,Q(\theta\mid\theta')}{\pi(\theta)\,Q(\theta'\mid\theta)}
\;=\; \frac{p(D\mid\theta')\,p(\theta')\,Q(\theta\mid\theta')}{p(D\mid\theta)\,p(\theta)\,Q(\theta'\mid\theta)}.
$$

Here you can evaluate $\pi$ using only log-likelihood + log-prior; any constant normalizer cancels. This is the **Metropolis algorithm** (the original 1953 version). **You only need to evaluate the ratio of posterior probabilities!**
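The accept/reject update can be sketched in a few lines of NumPy. This is an illustrative random-walk Metropolis sampler with a symmetric Gaussian proposal (so $Q$ cancels in $r$), targeting a hypothetical standard normal posterior; the function names and the target are assumptions for the demo, not from the text:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_post(theta):
    # Hypothetical unnormalized log-posterior (standard normal target):
    # log-likelihood + log-prior would go here; the normalizer is never needed.
    return -0.5 * theta**2

def metropolis(log_post, theta0, sigma, n_steps):
    """Random-walk Metropolis: symmetric Gaussian proposal, so Q cancels in r."""
    theta = theta0
    samples = np.empty(n_steps)
    n_accept = 0
    for i in range(n_steps):
        proposal = theta + sigma * rng.normal()       # theta' ~ Q(. | theta)
        log_r = log_post(proposal) - log_post(theta)  # log of the ratio r
        if np.log(rng.uniform()) < log_r:             # accept with prob min(1, r)
            theta = proposal
            n_accept += 1
        samples[i] = theta
    return samples, n_accept / n_steps

samples, acc_rate = metropolis(log_post, theta0=0.0, sigma=2.4, n_steps=50_000)
print(f"mean={samples.mean():.2f}  std={samples.std():.2f}  acceptance={acc_rate:.2f}")
```

Note that the comparison happens entirely in log space: only differences of log-posterior values are ever computed, which is both numerically stable and exactly why the normalizer never matters.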
**Asymmetric proposals**: $Q(\theta'\mid\theta) \neq Q(\theta\mid\theta')$
The proposal distribution $Q$ determines how the chain explores. A crucial parameter is the scale $\sigma$ of the proposal.

:class: dropdown
For high-dimensional problems with Gaussian targets and Gaussian proposals, there's a beautiful theory (Roberts & Rosenthal 2001):

**Optimal acceptance rate**: ~23.4% as dimension $d \to \infty$

This balances:
In practice, aim for:

If your acceptance rate is outside these ranges, adjust your proposal scale $\sigma$:

- Too high acceptance (>60%)? Increase $\sigma$
- Too low acceptance (<15%)? Decrease $\sigma$

:::
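To see this tradeoff concretely, here is an illustrative experiment (the standard normal target and all names are hypothetical, not from the text): the same random-walk Metropolis chain run with a too-small, a moderate, and a too-large proposal scale.

```python
import numpy as np

rng = np.random.default_rng(42)

def acceptance_rate(sigma, n_steps=20_000):
    """Observed acceptance fraction for random-walk Metropolis on a
    standard normal target (a hypothetical demo target)."""
    theta, n_accept = 0.0, 0
    for _ in range(n_steps):
        prop = theta + sigma * rng.normal()
        # Symmetric proposal: log r = log pi(theta') - log pi(theta)
        if np.log(rng.uniform()) < 0.5 * (theta**2 - prop**2):
            theta, n_accept = prop, n_accept + 1
    return n_accept / n_steps

for sigma in (0.1, 2.4, 25.0):
    print(f"sigma={sigma:5.1f}  acceptance={acceptance_rate(sigma):.2f}")
```

The tiny scale accepts almost everything (the chain barely moves), the huge scale rejects almost everything (the chain barely moves for the opposite reason), and the moderate scale sits in the healthy middle range.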
**Adaptive tuning**: In practice, you might run the chain for a **burn-in** period, monitor the acceptance rate, and adjust $\sigma$. Common strategy:
- Run 1000 steps (burn-in)
- If acceptance rate > 50%, multiply $\sigma$ by 1.2
- If acceptance rate < 20%, divide $\sigma$ by 1.2
- Repeat until acceptance rate is in target range
Once tuned, **fix** the proposal and run your production chain. (Don't keep adapting during production — this violates the Markov property and detailed balance.)
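The burn-in tuning recipe above might be sketched like this (a toy illustration on an assumed standard normal target; `run_chain`, `log_post`, and the deliberately bad starting scale are all hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

def log_post(theta):
    # Hypothetical target: standard normal log-density, up to a constant.
    return -0.5 * theta**2

def run_chain(theta, sigma, n_steps):
    """One burn-in segment; returns the final state and its acceptance rate."""
    n_accept = 0
    for _ in range(n_steps):
        prop = theta + sigma * rng.normal()
        if np.log(rng.uniform()) < log_post(prop) - log_post(theta):
            theta, n_accept = prop, n_accept + 1
    return theta, n_accept / n_steps

theta, sigma = 0.0, 100.0    # deliberately bad starting scale
for _ in range(50):          # tune until acceptance lands in [20%, 50%]
    theta, acc = run_chain(theta, sigma, 1000)
    if acc > 0.50:
        sigma *= 1.2
    elif acc < 0.20:
        sigma /= 1.2
    else:
        break
# sigma is now fixed for the production run, preserving detailed balance.
print(f"tuned sigma={sigma:.2f}  acceptance={acc:.2f}")
```

The key design point is the `break`: adaptation stops once the rate is acceptable, and the production chain then runs with a frozen $\sigma$, so the transition kernel is again a fixed Markov kernel satisfying detailed balance.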