diff --git a/lectures/_static/quant-econ.bib b/lectures/_static/quant-econ.bib
index 1b6e4dbd2..ac295e8e3 100644
--- a/lectures/_static/quant-econ.bib
+++ b/lectures/_static/quant-econ.bib
@@ -2,6 +2,35 @@ QuantEcon Bibliography File used in conjuction with sphinxcontrib-bibtex package Note: Extended Information (like abstracts, doi, url's etc.) can be found in quant-econ-extendedinfo.bib file in _static/ ###
+@book{friedman1953essays,
+  title={Essays in positive economics},
+  author={Friedman, Milton},
+  year={1953},
+  publisher={University of Chicago Press}
+}
+
+@article{alchian1950uncertainty,
+  title={Uncertainty, evolution, and economic theory},
+  author={Alchian, Armen A},
+  journal={Journal of Political Economy},
+  volume={58},
+  number={3},
+  pages={211--221},
+  year={1950},
+  publisher={The University of Chicago Press}
+}
+
+@article{blume2006if,
+  title={If you're so smart, why aren't you rich? Belief selection in complete and incomplete markets},
+  author={Blume, Lawrence and Easley, David},
+  journal={Econometrica},
+  volume={74},
+  number={4},
+  pages={929--966},
+  year={2006},
+  publisher={Wiley Online Library}
+}
 @article{mendoza1998international, title={The international ramifications of tax reforms: supply-side economics in a global economy},
diff --git a/lectures/likelihood_ratio_process.md b/lectures/likelihood_ratio_process.md
index 25fb252e5..2daa27157 100644
--- a/lectures/likelihood_ratio_process.md
+++ b/lectures/likelihood_ratio_process.md
@@ -37,8 +37,10 @@ Among things that we'll learn are * How a likelihood ratio process is a key ingredient in frequentist hypothesis testing * How a **receiver operator characteristic curve** summarizes information about a false alarm probability and power in frequentist hypothesis testing
-* How a statistician can combine frequentist probabilities of type I and type II errors to form posterior probabilities of erroneous model selection or missclassification of individuals
-* How during World War II the
United States Navy devised a decision rule that Captain Garret L. Schyler challenged, a topic to be studied in {doc}`this lecture ` +* How a statistician can combine frequentist probabilities of type I and type II errors to form posterior probabilities of mistakes in a model selection or a classification problem +* How likelihood ratios helped Lawrence Blume and David Easley formulate an answer to the question ''If you're so smart, why aren't you rich?'' {cite}`blume2006if` +* How to use a Kullback-Leibler divergence to quantify the difference between two probability distributions with the same support +* How during World War II the United States Navy devised a decision rule for doing quality control on lots of ammunition, a topic that sets the stage for {doc}`this lecture ` * A peculiar property of likelihood ratio processes @@ -70,7 +72,7 @@ that $q$ is either $f$ or $g$, permanently. Nature knows which density it permanently draws from, but we the observers do not. -We know both $f$ and $g$ but we don’t know which density nature +We know both $f$ and $g$ but we don't know which density nature chose. But we want to know. @@ -119,12 +121,12 @@ inferences using a classic frequentist approach due to Neyman and Pearson {cite}`Neyman_Pearson`. To help us appreciate how things work, the following Python code evaluates $f$ and $g$ as two different -beta distributions, then computes and simulates an associated likelihood +Beta distributions, then computes and simulates an associated likelihood ratio process by generating a sequence $w^t$ from one of the two probability distributions, for example, a sequence of IID draws from $g$. ```{code-cell} ipython3 -# Parameters in the two beta distributions. +# Parameters in the two Beta distributions. 
F_a, F_b = 1, 1
G_a, G_b = 3, 1.2
@@ -323,7 +325,7 @@ plt.show() We now describe how to employ the machinery of Neyman and Pearson {cite}`Neyman_Pearson` to test the hypothesis that history $w^t$ is generated by repeated
-IID draws from density $g$.
+IID draws from density $f$.
 Denote $q$ as the data generating process, so that $q=f \text{ or } g$.
@@ -419,7 +421,7 @@ Below we plot some informative figures that illustrate this. We also present a classical frequentist method for choosing a sample size $t$.
-Let’s start with a case in which we fix the threshold $c$ at
+Let's start with a case in which we fix the threshold $c$ at
 $1$.
```{code-cell} ipython3
@@ -438,7 +440,7 @@ errors both decrease, which is good. This is because most of the probability mass of log$(L(w^t))$ moves toward $-\infty$ when $g$ is the data generating
-process, ; while log$(L(w^t))$ goes to
+process, while log$(L(w^t))$ goes to
 $\infty$ when data are generated by $f$. That disparate behavior of log$(L(w^t))$ under $f$ and $q$
@@ -506,7 +508,7 @@ If for a fixed $t$ we now free up and move $c$, we will sweep out the probabilit of detection as a function of the probability of false alarm. This produces a [receiver operating characteristic
-curve](https://en.wikipedia.org/wiki/Receiver_operating_characteristic).
+curve (ROC curve)](https://en.wikipedia.org/wiki/Receiver_operating_characteristic).
 Below, we plot receiver operating characteristic curves for different sample sizes $t$.
@@ -536,6 +538,13 @@ Notice that as $t$ increases, we are assured a larger probability of detection and a smaller probability of false alarm associated with a given discrimination threshold $c$.
+For a given sample size $t$, both $\alpha$ and $\beta$ change as we vary $c$.
+
+As we increase $c$
+
+* $\alpha \equiv \Pr\left\{ L\left(w^{t}\right) < c \mid q=f \right\}$ increases
+* $\beta \equiv \Pr\left\{ L\left(w^{t}\right) > c \mid q=g \right\}$ decreases
+
 As $t \rightarrow + \infty$, we approach the perfect detection curve that is indicated by a right angle hinging on the blue dot.
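To see this tradeoff concretely, here is a minimal standalone sketch (it re-creates the two Beta densities with `scipy.stats` rather than reusing the lecture's jitted helpers, and the choices of `t`, `N`, and the thresholds are illustrative) that estimates $\alpha$ and $\beta$ by Monte Carlo for a few values of $c$:

```{code-cell} ipython3
import numpy as np
from scipy.stats import beta as beta_dist

rng = np.random.default_rng(0)
t, N = 10, 50_000
f_dist = beta_dist(1, 1)     # same Beta(1, 1) as F_a, F_b above
g_dist = beta_dist(3, 1.2)   # same Beta(3, 1.2) as G_a, G_b above

def log_L(w):
    # log likelihood ratio log(f/g), summed over each length-t sample path
    return (f_dist.logpdf(w) - g_dist.logpdf(w)).sum(axis=1)

logL_f = log_L(f_dist.rvs((N, t), random_state=rng))  # paths drawn from f
logL_g = log_L(g_dist.rvs((N, t), random_state=rng))  # paths drawn from g

alphas, betas = [], []
for c in [0.5, 1.0, 2.0]:
    alphas.append(np.mean(logL_f < np.log(c)))  # false alarm: L < c given q = f
    betas.append(np.mean(logL_g > np.log(c)))   # miss: L > c given q = g

print("alpha:", np.round(alphas, 3))
print("beta: ", np.round(betas, 3))
```

Raising $c$ raises $\alpha$ and lowers $\beta$; sweeping $c$ over a fine grid of values traces out one ROC curve for this sample size $t$.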
@@ -584,12 +593,12 @@ presented to Milton Friedman, as we describe in {doc}`this lecture K_f$, $f$ is closer to $h$ than $g$ -is. - -- In that case we’ll find that - $L\left(w^t\right) \rightarrow + \infty$ - -We’ll now experiment with an $h$ is also a beta distribution - -We’ll start by setting parameters $G_a$ and $G_b$ so that -$h$ is closer to $g$ - -```{code-cell} ipython3 -H_a, H_b = 3.5, 1.8 - -h = jit(lambda x: p(x, H_a, H_b)) -``` - -```{code-cell} ipython3 -x_range = np.linspace(0, 1, 100) -plt.plot(x_range, f(x_range), label='f') -plt.plot(x_range, g(x_range), label='g') -plt.plot(x_range, h(x_range), label='h') - -plt.legend() -plt.show() -``` ++++ -Let’s compute the Kullback–Leibler discrepancies by quadrature +Let's compute the Kullback–Leibler discrepancies by quadrature integration. ```{code-cell} ipython3 def KL_integrand(w, q, h): - m = q(w) / h(w) + m = h(w) / q(w) - return np.log(m) * q(w) + return np.log(m) * h(w) ``` ```{code-cell} ipython3 def compute_KL(h, f, g): + """ + Compute KL divergence with reference distribution h + """ Kf, _ = quad(KL_integrand, 0, 1, args=(f, h)) Kg, _ = quad(KL_integrand, 0, 1, args=(g, h)) @@ -672,58 +654,120 @@ def compute_KL(h, f, g): return Kf, Kg ``` -```{code-cell} ipython3 -Kf, Kg = compute_KL(h, f, g) -Kf, Kg -``` +### A helpful formula -We have $K_g < K_f$. +There is a mathematical relationship between likelihood ratios and KL divergence. -Next, we can verify our conjecture about $L\left(w^t\right)$ by -simulation. +When data is generated by distribution $h$, the expected log likelihood ratio is: -```{code-cell} ipython3 -l_arr_h = simulate(H_a, H_b) -l_seq_h = np.cumprod(l_arr_h, axis=1) -``` +$$ +\frac{1}{t} E_{h}\!\bigl[\log L_t\bigr] = KL(h, g) - KL(h, f) = K_g - K_f +$$ (eq:kl_likelihood_link) -The figure below plots over time the fraction of paths -$L\left(w^t\right)$ that fall in the interval $[0,0.01]$. +where $L_t=\prod_{j=1}^{t}\frac{f(w_j)}{g(w_j)}$ is the likelihood ratio process. 
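A quick numerical spot-check of {eq}`eq:kl_likelihood_link` is possible without the lecture's helpers. The following standalone sketch uses `scipy` directly; taking $h$ to be Beta(2, 1.35) is just an illustrative choice. It compares a Monte Carlo estimate of $\frac{1}{t} E_h[\log L_t]$ with $KL(h,g) - KL(h,f)$:

```{code-cell} ipython3
import numpy as np
from scipy.stats import beta as beta_dist
from scipy.integrate import quad

f_d = beta_dist(1, 1)
g_d = beta_dist(3, 1.2)
h_d = beta_dist(2, 1.35)   # illustrative choice of h

def kl(p, q):
    # KL(p, q) = ∫ p(x) log(p(x)/q(x)) dx over (0, 1)
    val, _ = quad(lambda x: p.pdf(x) * (p.logpdf(x) - q.logpdf(x)),
                  1e-6, 1 - 1e-6)
    return val

rng = np.random.default_rng(0)
t = 50
w = h_d.rvs((20_000, t), random_state=rng)   # data generated by h
mean_log_L = (f_d.logpdf(w) - g_d.logpdf(w)).sum(axis=1).mean() / t

print(f"(1/t) E_h[log L_t] (Monte Carlo): {mean_log_L:.4f}")
print(f"KL(h,g) - KL(h,f) (quadrature):  {kl(h_d, g_d) - kl(h_d, f_d):.4f}")
```

The two numbers should agree up to Monte Carlo error.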
-Notice that, as expected, it converges to 1 when $g$ is closer to -$h$ than $f$ is. +(For the proof, see [this note](https://nowak.ece.wisc.edu/ece830/ece830_fall11_lecture7.pdf).) -```{code-cell} ipython3 -N, T = l_arr_h.shape -plt.plot(range(T), np.sum(l_seq_h <= 0.01, axis=0) / N) -``` +{eq}`eq:kl_likelihood_link` tells us that: +- When $K_g < K_f$ (i.e., $g$ is closer to $h$ than $f$ is), the expected log likelihood ratio is negative, so $L\left(w^t\right) \rightarrow 0$. +- When $K_g > K_f$ (i.e., $f$ is closer to $h$ than $g$ is), the expected log likelihood ratio is positive, so $L\left(w^t\right) \rightarrow + \infty$. -We can also try an $h$ that is closer to $f$ than is -$g$ so that now $K_g$ is larger than $K_f$. +Let's verify this using simulation. -```{code-cell} ipython3 -H_a, H_b = 1.2, 1.2 -h = jit(lambda x: p(x, H_a, H_b)) -``` +In the simulation, we generate multiple paths using Beta distributions $f$, $g$, and $h$, and compute the paths of $\log(L(w^t))$. -```{code-cell} ipython3 -Kf, Kg = compute_KL(h, f, g) -Kf, Kg -``` +We consider three cases: (1) $h$ is closer to $f$, (2) $f$ and $g$ are approximately equidistant from $h$, and (3) $h$ is closer to $g$. ```{code-cell} ipython3 -l_arr_h = simulate(H_a, H_b) -l_seq_h = np.cumprod(l_arr_h, axis=1) -``` +:tags: [hide-input] -Now probability mass of $L\left(w^t\right)$ falling above -$10000$ diverges to $+\infty$. 
+# Define test scenarios +scenarios = [ + { + "name": "KL(h,g) > KL(h,f)", + "h_params": (1.2, 1.1), + "expected": r"$L_t \to \infty$" + }, + { + "name": "KL(h,g) ≈ KL(h,f)", + "h_params": (2, 1.35), + "expected": "$L_t$ fluctuates" + }, + { + "name": "KL(h,g) < KL(h,f)", + "h_params": (3.5, 1.5), + "expected": r"$L_t \to 0$" + } +] -```{code-cell} ipython3 -N, T = l_arr_h.shape -plt.plot(range(T), np.sum(l_seq_h > 10000, axis=0) / N) +fig, axes = plt.subplots(2, 3, figsize=(15, 12)) + +for i, scenario in enumerate(scenarios): + # Define h + h = lambda x: p(x, scenario["h_params"][0], + scenario["h_params"][1]) + + # Compute KL divergences + Kf, Kg = compute_KL(h, f, g) + kl_diff = Kg - Kf + + # Simulate paths + N_paths = 100 + T = 150 + + # Generate data from h + h_data = np.random.beta(scenario["h_params"][0], + scenario["h_params"][1], (N_paths, T)) + l_ratios = f(h_data) / g(h_data) + l_cumulative = np.cumprod(l_ratios, axis=1) + log_l_cumulative = np.log(l_cumulative) + + # Plot distributions + ax = axes[0, i] + x_range = np.linspace(0.001, 0.999, 200) + ax.plot(x_range, [f(x) for x in x_range], + 'b-', label='f', linewidth=2) + ax.plot(x_range, [g(x) for x in x_range], + 'r-', label='g', linewidth=2) + ax.plot(x_range, [h(x) for x in x_range], + 'g--', label='h (data)', linewidth=2) + ax.set_xlabel('w') + ax.set_ylabel('density') + ax.set_title(scenario["name"], fontsize=16) + ax.legend() + + # Plot log likelihood ratio paths + ax = axes[1, i] + for j in range(min(20, N_paths)): + ax.plot(log_l_cumulative[j, :], alpha=0.3, color='purple') + + # Plot theoretical expectation + theory_line = kl_diff * np.arange(1, T+1) + ax.plot(theory_line, 'k--', linewidth=2, label=r'$t \times (K_g - K_f)$') + + ax.set_xlabel('t') + ax.set_ylabel('$log L_t$') + ax.set_title(f'KL(h,f)={Kf:.3f}, KL(h,g)={Kg:.3f}\n{scenario["expected"]}', + fontsize=16) + ax.legend(fontsize=16) + +plt.tight_layout() +plt.show() ``` +Note that + +- In the first figure, $\log L(w^t)$ diverges to 
$\infty$ because $K_g > K_f$.
+- In the second figure, we still have $K_g > K_f$, but the difference is smaller, so $L(w^t)$ diverges to infinity at a slower pace.
+- In the last figure, $\log L(w^t)$ diverges to $-\infty$ because $K_g < K_f$.
+- The black dotted line, $t \left(KL(h,g) - KL(h, f)\right)$, closely fits the paths, verifying {eq}`eq:kl_likelihood_link`.
+
+These observations align with the theory.
+
+In a [later section](hetero_agent), we will see an application of these ideas.
+
++++
+
 ## Hypothesis Testing and Classification We now describe how a statistician can combine frequentist probabilities of type I and type II errors in order to @@ -746,8 +790,7 @@ We assume that $f$ and $g$ both put positive probabilities on the same intervals
-In the simulations below, we specify that $f$ is a $\text{Beta}(1, 1)$ distribution and that $g$ is $\text{Beta}(3, 1.2)$ distribution,
-just as we did often earlier in this lecture.
+In the simulations below, we specify that $f$ is a $\text{Beta}(1, 1)$ distribution and that $g$ is a $\text{Beta}(3, 1.2)$ distribution.
 We consider two alternative timing protocols.
@@ -993,7 +1036,7 @@ plt.tight_layout() plt.show() ```
-To the left of the green vertical line $f < g $, so $l_t < 1$; therefore a $w_t$ that falls to the left of the green line is classified as a type $g$ individual.
+To the left of the green vertical line $g < f$, so $l_t > 1$; therefore a $w_t$ that falls to the left of the green line is classified as a type $f$ individual.
 * The shaded orange area equals $\beta$ -- the probability of classifying someone as a type $g$ individual when it is really a type $f$ individual.
@@ -1182,6 +1225,8 @@ where $m = \frac{1}{2}(f+g)$ is a mixture of $f$ and $g$. ```{note} We studied KL divergence in the [section above](rel_entropy) with respect to a reference distribution $h$.
+Recall that KL divergence $KL(f, g)$ measures the expected excess surprisal from using misspecified model $g$ instead of $f$ when $f$ is the true model.
+ Because in general $KL(f, g) \neq KL(g, f)$, KL divergence is not symmetric, but Jensen-Shannon divergence is symmetric. (In fact, the square root of the Jensen-Shannon divergence is a metric referred to as the Jensen-Shannon distance.) @@ -1189,7 +1234,7 @@ Because in general $KL(f, g) \neq KL(g, f)$, KL divergence is not symmetric, but As {eq}`eq:js_divergence` shows, the Jensen-Shannon divergence computes average of the KL divergence of $f$ and $g$ with respect to a particular reference distribution $m$ defined below the equation. ``` -Now let's create a comparison table showing KL divergence, Jensen-Shannon divergence, and Chernoff entropy +Now let's create a comparison table showing KL divergence, Jensen-Shannon divergence, and Chernoff entropy for a set of pairs of Beta distributions. ```{code-cell} ipython3 def js_divergence(f, g): @@ -1221,6 +1266,12 @@ def kl_divergence(f, g): distribution_pairs = [ # (f_params, g_params) + ((1, 1), (0.1, 0.2)), + ((1, 1), (0.3, 0.3)), + ((1, 1), (0.3, 0.4)), + ((1, 1), (0.5, 0.5)), + ((1, 1), (0.7, 0.6)), + ((1, 1), (0.9, 0.8)), ((1, 1), (1.1, 1.05)), ((1, 1), (1.2, 1.1)), ((1, 1), (1.5, 1.2)), @@ -1228,10 +1279,7 @@ distribution_pairs = [ ((1, 1), (2.5, 1.8)), ((1, 1), (3, 1.2)), ((1, 1), (4, 1)), - ((1, 1), (5, 1)), - ((2, 2), (4, 2)), - ((2, 2), (6, 2)), - ((2, 2), (8, 2)), + ((1, 1), (5, 1)) ] # Create comparison table @@ -1304,8 +1352,8 @@ def plot_dist_diff(): ((1, 1), (1.5, 1.2)), ((1, 1), (2, 1.5)), ((1, 1), (3, 1.2)), - ((1, 1), (5, 1)), - ((2, 2), (8, 2)), + ((1, 1), (5, 1)), + ((1, 1), (0.3, 0.3)) ] fig, axes = plt.subplots(3, 2, figsize=(15, 12)) @@ -1372,9 +1420,15 @@ In the simulation below, nature draws $N / 2$ sequences from $g$ and $N/2$ seque ```{note} Nature does this rather than flipping a fair coin to decide whether to draw from $g$ or $f$ once and for all before each simulation of length $T$. 
-``` +``` ```{code-cell} ipython3 +def compute_likelihood_ratio_stats(sequences, f, g): + """Compute likelihood ratios and cumulative products.""" + l_ratios = f(sequences) / g(sequences) + L_cumulative = np.cumprod(l_ratios, axis=1) + return l_ratios, L_cumulative + def error_divergence_cor(): """ compute correlation between error probabilities and divergence measures. @@ -1409,13 +1463,15 @@ def error_divergence_cor(): sequences_f = np.random.beta(f_a, f_b, (N_half, T_large)) sequences_g = np.random.beta(g_a, g_b, (N_half, T_large)) - # Compute likelihood ratios - l_ratios_f = f(sequences_f) / g(sequences_f) - l_ratios_g = f(sequences_g) / g(sequences_g) - # Compute cumulative products - L_cumulative_f = np.cumprod(l_ratios_f, axis=1)[:, -1] - L_cumulative_g = np.cumprod(l_ratios_g, axis=1)[:, -1] + + # Compute likelihood ratios and cumulative products + _, L_cumulative_f = compute_likelihood_ratio_stats(sequences_f, f, g) + _, L_cumulative_g = compute_likelihood_ratio_stats(sequences_g, f, g) + + # Get final values + L_cumulative_f = L_cumulative_f[:, -1] + L_cumulative_g = L_cumulative_g[:, -1] # Calculate error probabilities error_probs[i] = 0.5 * (np.mean(L_cumulative_f < 1) + @@ -1453,7 +1509,7 @@ def plot_error_divergence(data): # Create figure and axes fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5)) - # Helper function for plotting correlation + # function for plotting correlation def plot_correlation(ax, x_vals, x_label, color): ax.scatter(x_vals, log_error, alpha=0.7, s=60, color=color) ax.set_xlabel(x_label) @@ -1461,7 +1517,7 @@ def plot_error_divergence(data): # Calculate correlation and trend line corr = np.corrcoef(x_vals, log_error)[0, 1] - z = np.polyfit(x_vals, log_error, 1) + z = np.polyfit(x_vals, log_error, 2) x_trend = np.linspace(x_vals.min(), x_vals.max(), 100) ax.plot(x_trend, np.poly1d(z)(x_trend), "r--", alpha=0.8, linewidth=2) @@ -1484,6 +1540,594 @@ We'll encounter related ideas in {doc}`wald_friedman`. 
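As an aside, the `compute_likelihood_ratio_stats` helper introduced above is easy to exercise on its own. Here is a standalone sketch (the helper is re-defined with plain `scipy`/NumPy densities so that the snippet runs by itself):

```{code-cell} ipython3
import numpy as np
from scipy.stats import beta as beta_dist

def compute_likelihood_ratio_stats(sequences, f, g):
    """Compute likelihood ratios and cumulative products."""
    l_ratios = f(sequences) / g(sequences)
    L_cumulative = np.cumprod(l_ratios, axis=1)
    return l_ratios, L_cumulative

f_pdf = beta_dist(1, 1).pdf
g_pdf = beta_dist(3, 1.2).pdf

rng = np.random.default_rng(0)
draws = beta_dist(1, 1).rvs((5_000, 200), random_state=rng)  # nature uses f

_, L = compute_likelihood_ratio_stats(draws, f_pdf, g_pdf)
# With data generated by f, L(w^t) should almost always end up large
print("fraction of paths with L_T > 1:", np.mean(L[:, -1] > 1))
```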
+++
+(hetero_agent)=
+## Consumption and Heterogeneous Beliefs
+
+A likelihood ratio process lies behind Lawrence Blume and David Easley's answer to their question
+''If you're so smart, why aren't you rich?'' {cite}`blume2006if`.
+
+Here we'll provide an example that illustrates basic components of their analysis.
+
+Let the random variable $s_t \in (0,1)$ at time $t = 0, 1, 2, \ldots$ be distributed according to the same Beta distribution with parameters
+$\theta = [\theta_1, \theta_2]$.
+
+We'll denote this probability density as
+
+$$
+\pi(s_t|\theta)
+$$
+
+Below, we'll often just write $\pi(s_t)$ instead of $\pi(s_t|\theta)$ to save space.
+
+Let $s_t \equiv y_t^1$ be the endowment of a nonstorable consumption good that a person we'll call "agent 1" receives at time $t$.
+
+Let a history $s^t = [s_t, s_{t-1}, \ldots, s_0]$ be a sequence of i.i.d. random variables with joint distribution
+
+$$
+\pi_t(s^t) = \pi(s_t) \pi(s_{t-1}) \cdots \pi(s_0)
+$$
+
+So in our example, the history $s^t$ is a comprehensive record of agent $1$'s endowments of the consumption good from time $0$ up to time $t$.
+
+If agent $1$ were to live on an island by himself, agent $1$'s consumption $c^1(s_t)$ at time $t$ would be
+
+$$c^1(s_t) = y_t^1 = s_t. $$
+
+But in our model, agent 1 is not alone.
+
+### Nature and beliefs
+
+Nature draws i.i.d. sequences $\{s_t\}_{t=0}^\infty$ from $\pi_t(s^t)$.
+
+* so $\pi$ without a superscript is nature's model
+* but in addition to nature, there are other entities inside our model -- artificial people that we call "agents"
+* each agent has a sequence of probability distributions over $s^t$ for $t=0, \ldots$
+* agent $i$ thinks that nature draws i.i.d. sequences $\{s_t\}_{t=0}^\infty$ from $\pi_t^i(s^t)$
+  * agent $i$ is mistaken unless $\pi_t^i(s^t) = \pi_t(s^t)$
+
+```{note}
+A **rational expectations** model would set $\pi_t^i(s^t) = \pi_t(s^t)$ for all agents $i$.
+```
+
+There are two agents named $i=1$ and $i=2$.
+
+At time $t$, agent $1$ receives an endowment
+
+$$
+y_t^1 = s_t
+$$
+
+of a nonstorable consumption good, while agent $2$ receives an endowment of
+
+$$
+y_t^2 = 1 - s_t
+$$
+
+The aggregate endowment of the consumption good is
+
+$$
+y_t^1 + y_t^2 = 1
+$$
+
+at each date $t \geq 0$.
+
+At date $t$ agent $i$ consumes $c_t^i(s^t)$ of the good.
+
+A (non wasteful) feasible allocation of the aggregate endowment of $1$ each period satisfies
+
+$$
+c_t^1 + c_t^2 = 1 .
+$$
+
+### A social risk-sharing arrangement
+
+In order to share risks, a benevolent social planner will dictate a history-dependent consumption allocation in the form of a sequence of functions
+
+$$
+c_t^i = c_t^i(s^t)
+$$
+
+that satisfy
+
+$$
+c_t^1(s^t) + c_t^2(s^t) = 1
+$$ (eq:feasibility)
+
+for all $s^t$ for all $t \geq 0$.
+
+To design a socially optimal allocation, the social planner wants to know what each agent believes about the endowment sequence and how each agent feels about bearing risks.
+
+As for the endowment sequences, agent $i$ believes that nature draws i.i.d. sequences from joint densities
+
+$$
+\pi_t^i(s^t) = \pi^i(s_t) \pi^i(s_{t-1}) \cdots \pi^i(s_0)
+$$
+
+As for attitudes toward bearing risks, agent $i$ has a one-period utility function
+
+$$
+u(c_t^i) = \ln (c_t^i)
+$$
+
+with marginal utility of consumption in period $t$
+
+$$
+u'(c_t^i) = \frac{1}{c_t^i}
+$$
+
+Putting its beliefs about its random endowment sequence and its attitudes toward bearing risks together, agent $i$ has intertemporal utility function
+
+$$
+V^i = \sum_{t=0}^{\infty} \sum_{s^t} \delta^t u(c_t^i(s^t)) \pi_t^i(s^t) ,
+$$ (eq:objectiveagenti)
+
+where $\delta \in (0,1)$ is an intertemporal discount factor, and $u(\cdot)$ is a strictly increasing, concave one-period utility function.
+
+### The social planner's allocation problem
+
+The benevolent dictator has all the information it requires to choose a consumption allocation that maximizes the social welfare criterion
+
+$$
+W = \lambda V^1 + (1-\lambda) V^2
+$$ (eq:welfareW)
+
+where $\lambda \in [0,1]$ is a Pareto weight that tells how much the planner likes agent $1$ and $1 - \lambda$ is a Pareto weight that tells how much the social planner likes agent $2$.
+
+Setting $\lambda = .5$ expresses ''egalitarian'' social preferences.
+
+Notice how social welfare criterion {eq}`eq:welfareW` takes into account both agents' preferences as represented by formula {eq}`eq:objectiveagenti`.
+
+This means that the social planner knows and respects
+
+* the one period utility function $u(\cdot) = \ln(\cdot)$
+* each agent $i$'s probability model $\{\pi_t^i(s^t)\}_{t=0}^\infty$
+
+Consequently, we anticipate that these objects will appear in the social planner's rule for allocating the aggregate endowment each period.
+
+First-order necessary conditions for maximizing welfare criterion {eq}`eq:welfareW` subject to the feasibility constraint {eq}`eq:feasibility` are
+
+$$\frac{\pi_t^2(s^t)}{\pi_t^1(s^t)} \frac{(1/c_t^2(s^t))}{(1/c_t^1(s^t))} = \frac{\lambda}{1 -\lambda}$$
+
+which can be rearranged to become
+
+$$
+\frac{c_t^1(s^t)}{c_t^2(s^t)} = \frac{\lambda}{1- \lambda} l_t(s^t)
+$$ (eq:allocationrule0)
+
+where
+
+$$ l_t(s^t) = \frac{\pi_t^1(s^t)}{\pi_t^2(s^t)} $$
+
+is the likelihood ratio of agent 1's joint density to agent 2's joint density.
+ +Using + +$$c_t^1(s^t) + c_t^2(s^t) = 1$$ + +we can rewrite allocation rule {eq}`eq:allocationrule0` as + + + +$$\frac{c_t^1(s^t)}{1 - c_t^1(s^t)} = \frac{\lambda}{1-\lambda} l_t(s^t)$$ + +or + +$$c_t^1(s^t) = \frac{\lambda}{1-\lambda} l_t(s^t)(1 - c_t^1(s^t))$$ + +which implies that the social planner's allocation rule is + +$$ +c_t^1(s^t) = \frac{\lambda l_t(s^t)}{1-\lambda + \lambda l_t(s^t)} +$$ (eq:allocationrule1) + +If we define a temporary or **continuation Pareto weight** process as + +$$ +\lambda_t(s^t) = \frac{\lambda l_t(s^t)}{1-\lambda + \lambda l_t(s^t)}, +$$ + +then we can represent the social planner's allocation rule as + +$$ +c_t^1(s^t) = \lambda_t(s^t) . +$$ + + + + +### If you're so smart, $\ldots$ + + +Let's compute some values of limiting allocations {eq}`eq:allocationrule1` for some interesting possible limiting +values of the likelihood ratio process $l_t(s^t)$: + + $$l_\infty (s^\infty)= 1; \quad c_\infty^1 = \lambda$$ + + * In the above case, both agents are equally smart (or equally not smart) and the consumption allocation stays put at a $\lambda, 1 - \lambda $ split between the two agents. + +$$l_\infty (s^\infty) = 0; \quad c_\infty^1 = 0$$ + +* In the above case, agent 2 is smarter than agent 1, and agent 1's share of the aggregate endowment converges to zero. + + + +$$l_\infty (s^\infty)= \infty; \quad c_\infty^1 = 1$$ + +* In the above case, agent 1 is smarter than agent 2, and agent 1's share of the aggregate endowment converges to 1. + + +Soon we'll do some simulations that will shed further light on possible outcomes. + +But before we do that, let's take a detour and study some "shadow prices" for the social planning problem that can readily be +converted to "equilibrium prices" for a competitive equilibrium. 
+
+Doing this will allow us to connect our analysis with an argument of {cite}`alchian1950uncertainty` and {cite}`friedman1953essays` that competitive market processes can make prices of risky assets better reflect realistic probability assessments.
+
+### Competitive Equilibrium Prices
+
+The two fundamental welfare theorems for general equilibrium models lead us to anticipate that there is a connection between the allocation that solves the social planning problem we have been studying and the allocation in a **competitive equilibrium** with complete markets in history-contingent commodities.
+
+```{note}
+For the two welfare theorems and their history, see .
+```
+
+Such a connection prevails for our model.
+
+We'll sketch it now.
+
+In a competitive equilibrium, there is no social planner that dictatorially collects everybody's endowments and then reallocates them.
+
+Instead, there is a comprehensive centralized market that meets at one point in time.
+
+There are **prices** at which price-taking agents can buy or sell whatever goods that they want.
+
+Trade is multilateral in the sense that there is a "Walrasian auctioneer" who lives outside the model and whose job is to verify that
+each agent's budget constraint is satisfied.
+
+That budget constraint involves the total value of the agent's endowment stream and the total value of its consumption stream.
+
+Suppose that at time $-1$, before time $0$ starts, agent $i$ can purchase one unit $c_t(s^t)$ of consumption at time $t$ after history
+$s^t$ at price $p_t(s^t)$.
+
+Notice that there is a (very long) **vector** of prices.
+
+We want to study how agents' diverse beliefs influence equilibrium prices.
+
+Agent $i$ faces a **single** intertemporal budget constraint
+
+$$
+\sum_{t=0}^\infty \sum_{s^t} p_t(s^t) c_t^i(s^t) \leq \sum_{t=0}^\infty \sum_{s^t} p_t(s^t) y_t^i(s^t)
+$$ (eq:budgetI)
+
+Agent $i$ puts a Lagrange multiplier $\mu^i$ on {eq}`eq:budgetI` and once-and-for-all chooses a consumption plan $\{c^i_t(s^t)\}_{t=0}^\infty$
+to maximize criterion {eq}`eq:objectiveagenti` subject to budget constraint {eq}`eq:budgetI`.
+
+```{note}
+For convenience, let's remind ourselves of criterion {eq}`eq:objectiveagenti`:
+$
+V^i = \sum_{t=0}^{\infty} \sum_{s^t} \delta^t u(c_t^i(s^t)) \pi_t^i(s^t)$
+```
+
+First-order conditions for maximizing with respect to $c_t^i(s^t)$ are
+
+$$
+\delta^t u'(c^i_t(s^t)) \pi_t^i(s^t) = \mu^i p_t(s^t) ,
+$$
+
+which we can rearrange to obtain
+
+$$
+p_t(s^t) = \frac{ \delta^t \pi_t^i(s^t)}{\mu^i c^i_t(s^t)}
+$$ (eq:priceequation1)
+
+for $i=1,2$.
+
+If we divide equation {eq}`eq:priceequation1` for agent $1$ by the appropriate version of equation {eq}`eq:priceequation1` for agent 2, use
+$c^2_t(s^t) = 1 - c^1_t(s^t)$, and do some algebra, we'll obtain
+
+$$
+c_t^1(s^t) = \frac{\mu^2 l_t(s^t)}{\mu^1 + \mu^2 l_t(s^t)} .
+$$ (eq:allocationce)
+
+We now engage in an extended "guess-and-verify" exercise that involves matching objects in our competitive equilibrium with objects in
+our social planning problem.
+
+* we'll match consumption allocations in the planning problem with equilibrium consumption allocations in the competitive equilibrium
+* we'll match "shadow" prices in the planning problem with competitive equilibrium prices.
+
+Notice that if we set $\mu^1 = 1 - \lambda$ and $\mu^2 = \lambda$, then formula {eq}`eq:allocationce` agrees with formula
+{eq}`eq:allocationrule1`.
+
+ * doing this amounts to choosing a **numeraire** or normalization for the price system $\{p_t(s^t)\}_{t=0}^\infty$
+
+```{note}
+For information about how a numeraire must be chosen to pin down the absolute price level in a model like ours that determines only
+relative prices, see .
+```
+
+If we substitute formula {eq}`eq:allocationce` for $c_t^1(s^t)$ into formula {eq}`eq:priceequation1`, rearrange, and normalize away a constant scale factor by an appropriate choice of numeraire, we obtain
+
+$$
+p_t(s^t) = \delta^t \pi_t^2(s^t) \left(1 - \lambda + \lambda l_t(s^t)\right) = \delta^t \left[ \lambda \pi_t^1(s^t) + (1-\lambda) \pi_t^2(s^t) \right]
+$$ (eq:pformulafinal)
+
+According to formula {eq}`eq:pformulafinal`, we have the following possible limiting cases:
+
+* when $l_\infty = 0$, $c_\infty^1 = 0 $ and tails of competitive equilibrium prices reflect agent $2$'s probability model $\pi_t^2(s^t)$
+* when $l_\infty = \infty$, $c_\infty^1 = 1 $ and tails of competitive equilibrium prices reflect agent $1$'s probability model $\pi_t^1(s^t)$
+* for small $t$'s, competitive equilibrium prices reflect both agents' probability models.
+
+### Simulations
+
+Now let's implement some simulations when agent $1$ believes marginal density
+
+$$\pi^1(s_t) = f(s_t) $$
+
+and agent $2$ believes marginal density
+
+$$ \pi^2(s_t) = g(s_t) $$
+
+where $f$ and $g$ are Beta distributions like ones that we used in earlier sections of this lecture.
+
+Meanwhile, we'll assume that nature draws i.i.d. sequences from a marginal density
+
+$$
+\pi(s_t) = h(s_t)
+$$
+
+where $h(s_t)$ is perhaps a mixture of $f$ and $g$.
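Before simulating full paths, the limiting cases of allocation rule {eq}`eq:allocationrule1` can be sanity-checked directly. A minimal standalone sketch (the helper `c1_share` is hypothetical, not part of the lecture's code):

```{code-cell} ipython3
def c1_share(l, lam=0.5):
    """Agent 1's share under the planner's rule c1 = λ l / (1 - λ + λ l)."""
    return lam * l / (1 - lam + lam * l)

for l in [0.0, 1e-3, 1.0, 1e3]:
    print(f"l = {l:g}: agent 1's share = {c1_share(l):.4f}")
```

As $l \rightarrow 0$ the share goes to $0$, at $l = 1$ it equals $\lambda$, and as $l \rightarrow \infty$ it approaches $1$.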
+
+Let's write a Python function that computes agent 1's consumption share
+
+```{code-cell} ipython3
+def simulate_blume_easley(sequences, f_belief=f, g_belief=g, λ=0.5):
+    """Simulate Blume-Easley model consumption shares."""
+    l_ratios, l_cumulative = compute_likelihood_ratio_stats(sequences, f_belief, g_belief)
+    c1_share = λ * l_cumulative / (1 - λ + λ * l_cumulative)
+    return l_cumulative, c1_share
+```
+
+Now let's use this function to generate sequences in which
+
+* nature draws from $f$ each period, or
+* nature draws from $g$ each period, or
+* nature flips a fair coin each period to decide whether to draw from $f$ or $g$
+
+```{code-cell} ipython3
+λ = 0.5
+T = 100
+N = 10000
+
+# Nature follows f, g, or mixture
+s_seq_f = np.random.beta(F_a, F_b, (N, T))
+s_seq_g = np.random.beta(G_a, G_b, (N, T))
+
+h = jit(lambda x: 0.5 * f(x) + 0.5 * g(x))
+model_choices = np.random.rand(N, T) < 0.5
+s_seq_h = np.empty((N, T))
+s_seq_h[model_choices] = np.random.beta(F_a, F_b, size=model_choices.sum())
+s_seq_h[~model_choices] = np.random.beta(G_a, G_b, size=(~model_choices).sum())
+
+l_cum_f, c1_f = simulate_blume_easley(s_seq_f)
+l_cum_g, c1_g = simulate_blume_easley(s_seq_g)
+l_cum_h, c1_h = simulate_blume_easley(s_seq_h)
+```
+
+Before looking at the figure below, have some fun by guessing whether agent 1 or agent 2 will have a larger and larger consumption share as time passes in our three cases.
+
+To make better guesses, let's visualize instances of the likelihood ratio processes in the three cases.
+ +```{code-cell} ipython3 +fig, axes = plt.subplots(2, 3, figsize=(15, 10)) + +titles = ["Nature = f", "Nature = g", "Nature = mixture"] +data_pairs = [(l_cum_f, c1_f), (l_cum_g, c1_g), (l_cum_h, c1_h)] + +for i, ((l_cum, c1), title) in enumerate(zip(data_pairs, titles)): + # Likelihood ratios + ax = axes[0, i] + for j in range(min(50, l_cum.shape[0])): + ax.plot(l_cum[j, :], alpha=0.3, color='blue') + ax.set_yscale('log') + ax.set_xlabel('time') + ax.set_ylabel('Likelihood ratio $l_t$') + ax.set_title(title) + ax.axhline(y=1, color='red', linestyle='--', alpha=0.5) + + # Consumption shares + ax = axes[1, i] + for j in range(min(50, c1.shape[0])): + ax.plot(c1[j, :], alpha=0.3, color='green') + ax.set_xlabel('time') + ax.set_ylabel("Agent 1's consumption share") + ax.set_ylim([0, 1]) + ax.axhline(y=λ, color='red', linestyle='--', alpha=0.5) + +plt.tight_layout() +plt.show() +``` + +In the left panel, nature chooses $f$. Agent 1's consumption reaches $1$ very quickly. + +In the middle panel, nature chooses $g$. Agent 1's consumption ratio tends to move towards $0$ but not as fast as in the first case. + +In the right panel, nature flips coins each period. We see a very similar pattern to the processes in the left panel. + +The figures in the top panel remind us of the discussion in [this section](rel_entropy). + +We invite readers to revisit [that section](rel_entropy) and try to infer the relationships among $KL(f, g)$, $KL(g, f)$, $KL(h, f)$, and $KL(h,g)$. + + +Let's compute values of KL divergence + +```{code-cell} ipython3 +shares = [np.mean(c1_f[:, -1]), np.mean(c1_g[:, -1]), np.mean(c1_h[:, -1])] +Kf_g, Kg_f = kl_divergence(f, g), kl_divergence(g, f) +Kf_h, Kg_h = compute_KL(h, f, g) + +print(f"Final shares: f={shares[0]:.3f}, g={shares[1]:.3f}, mix={shares[2]:.3f}") +print(f"KL divergences: \nKL(f,g)={Kf_g:.3f}, KL(g,f)={Kg_f:.3f}") +print(f"KL(h,f)={Kf_h:.3f}, KL(h,g)={Kg_h:.3f}") +``` + +We find that $KL(f,g) > KL(g,f)$ and $KL(h,g) > KL(h,f)$. 
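These inequalities can also be reproduced independently of the lecture's jitted helpers. A standalone sketch that recomputes the first pair of divergences by quadrature with `scipy`:

```{code-cell} ipython3
from scipy.stats import beta as beta_dist
from scipy.integrate import quad

f_d = beta_dist(1, 1)
g_d = beta_dist(3, 1.2)

def kl(p, q):
    # KL(p, q) = ∫ p(x) log(p(x)/q(x)) dx over (0, 1)
    val, _ = quad(lambda x: p.pdf(x) * (p.logpdf(x) - q.logpdf(x)),
                  1e-6, 1 - 1e-6)
    return val

print(f"KL(f,g) = {kl(f_d, g_d):.3f}")
print(f"KL(g,f) = {kl(g_d, f_d):.3f}")
```

Both numbers are positive and unequal, illustrating the asymmetry of KL divergence for this pair of Beta distributions.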
+ +The first inequality tells us that the average "surprise" or "inefficiency" of using belief $g$ when nature chooses $f$ is greater than the "surprise" of using belief $f$ when nature chooses $g$. + +The second inequality tells us that agent 1's belief distribution $f$ is closer to nature's pick than agent 2's belief $g$. + ++++ + +To make this idea more concrete, let's compare two cases: + +- agent 1's belief distribution $f$ is close to agent 2's belief distribution $g$; +- agent 1's belief distribution $f$ is far from agent 2's belief distribution $g$. + + +We use the two distributions visualized below + +```{code-cell} ipython3 +def plot_distribution_overlap(ax, x_range, f_vals, g_vals, + f_label='f', g_label='g', + f_color='blue', g_color='red'): + """Plot two distributions with their overlap region.""" + ax.plot(x_range, f_vals, color=f_color, linewidth=2, label=f_label) + ax.plot(x_range, g_vals, color=g_color, linewidth=2, label=g_label) + + overlap = np.minimum(f_vals, g_vals) + ax.fill_between(x_range, 0, overlap, alpha=0.3, color='purple', label='Overlap') + ax.set_xlabel('x') + ax.set_ylabel('Density') + ax.legend() + +# Define close and far belief distributions +f_close = jit(lambda x: p(x, 1, 1)) +g_close = jit(lambda x: p(x, 1.1, 1.05)) + +f_far = jit(lambda x: p(x, 1, 1)) +g_far = jit(lambda x: p(x, 3, 1.2)) + +js_close = js_divergence(f_close, g_close) +js_far = js_divergence(f_far, g_far) + +# Visualize the belief distributions +fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 5)) + +x_range = np.linspace(0.001, 0.999, 200) + +# Close beliefs +f_close_vals = [f_close(x) for x in x_range] +g_close_vals = [g_close(x) for x in x_range] +plot_distribution_overlap(ax1, x_range, f_close_vals, g_close_vals, + f_label='f (Beta(1, 1))', g_label='g (Beta(1.1, 1.05))') +ax1.set_title(f'Close Beliefs\nJS divergence={js_close:.4f}') + +# Far beliefs +f_far_vals = [f_far(x) for x in x_range] +g_far_vals = [g_far(x) for x in x_range] 
+plot_distribution_overlap(ax2, x_range, f_far_vals, g_far_vals,
+                          f_label='f (Beta(1, 1))', g_label='g (Beta(3, 1.2))')
+ax2.set_title(f'Far Beliefs\nJS divergence={js_far:.4f}')
+
+plt.tight_layout()
+plt.show()
+```
+
+Let's draw the same consumption share plots as above for agent 1.
+
+To make the figure cleaner, we replace the individual simulation paths with the median and the 10th-90th percentile band.
+
+Staring at the figure below, can we infer the relation between $KL(f,g)$ and $KL(g,f)$?
+
+From the right panel, can we infer the relation between $KL(h,g)$ and $KL(h,f)$?
+
+```{code-cell} ipython3
+fig, axes = plt.subplots(2, 3, figsize=(15, 10))
+nature_params = {'close': [(1, 1), (1.1, 1.05), (2, 1.5)],
+                 'far': [(1, 1), (3, 1.2), (2, 1.5)]}
+nature_labels = ["Nature = f", "Nature = g", "Nature = h"]
+colors = {'close': 'blue', 'far': 'red'}
+
+threshold = 1e-5  # "close to zero" cutoff
+
+for row, (f_belief, g_belief, label) in enumerate([
+        (f_close, g_close, 'close'),
+        (f_far, g_far, 'far')]):
+
+    for col, nature_label in enumerate(nature_labels):
+        params = nature_params[label][col]
+        s_seq = np.random.beta(params[0], params[1], (1000, 200))
+        _, c1 = simulate_blume_easley(s_seq, f_belief, g_belief, λ)
+
+        median_c1 = np.median(c1, axis=0)
+        p10, p90 = np.percentile(c1, [10, 90], axis=0)
+
+        ax = axes[row, col]
+        color = colors[label]
+        ax.plot(median_c1, color=color, linewidth=2, label='Median')
+        ax.fill_between(range(len(median_c1)), p10, p90,
+                        alpha=0.3, color=color, label='10–90%')
+        ax.set_xlabel('time')
+        ax.set_ylabel("Agent 1's share")
+        ax.set_ylim([0, 1])
+        ax.set_title(nature_label)
+        ax.axhline(y=λ, color='gray', linestyle='--', alpha=0.5)
+
+        # Mark the first date at which the median share is (nearly) 0 or 1
+        below = np.where(median_c1 < threshold)[0]
+        above = np.where(median_c1 > 1 - threshold)[0]
+        if below.size > 0:
+            first_zero = (below[0], True)
+        elif above.size > 0:
+            first_zero = (above[0], False)
+        else:
+            first_zero = None
+        if first_zero is not None:
+            ax.axvline(x=first_zero[0], color='black', linestyle='--',
+                       alpha=0.7,
+                       label=fr'Median $\leq$ {threshold}' if first_zero[1]
+                       else fr'Median $\geq$ 1-{threshold}')
+        ax.legend()
+
+plt.tight_layout()
+plt.show()
+```
+
+Keeping our guesses in mind, let's calculate the four values
+
+```{code-cell} ipython3
+# Close case
+Kf_g, Kg_f = kl_divergence(f_close, g_close), kl_divergence(g_close, f_close)
+Kf_h, Kg_h = compute_KL(h, f_close, g_close)
+
+print(f"KL divergences (close): \nKL(f,g)={Kf_g:.3f}, KL(g,f)={Kg_f:.3f}")
+print(f"KL(h,f)={Kf_h:.3f}, KL(h,g)={Kg_h:.3f}")
+
+# Far case
+Kf_g, Kg_f = kl_divergence(f_far, g_far), kl_divergence(g_far, f_far)
+Kf_h, Kg_h = compute_KL(h, f_far, g_far)
+
+print(f"KL divergences (far): \nKL(f,g)={Kf_g:.3f}, KL(g,f)={Kg_f:.3f}")
+print(f"KL(h,f)={Kf_h:.3f}, KL(h,g)={Kg_h:.3f}")
+```
+
+We find that in the first case $KL(f,g) \approx KL(g,f)$ and both are relatively small, so although either agent 1 or agent 2 will eventually consume everything, the convergence shown in the first two panels of the top row is quite slow.
+
+In the first two panels at the bottom, convergence occurs faster (as indicated by the black dashed lines) because the divergences $KL(f, g)$ and $KL(g, f)$ are larger.
+
+Since $KL(f,g) > KL(g,f)$, convergence is faster in the bottom row's first panel, where nature chooses $f$, than in its second panel, where nature chooses $g$.
+
+This ties in nicely with {eq}`eq:kl_likelihood_link`.
+
+
++++
+
+
 ## Related Lectures
 
 Likelihood processes play an important role in Bayesian learning, as described in {doc}`likelihood_bayes`
diff --git a/lectures/mccall_model.md b/lectures/mccall_model.md
index 37ede1e4a..3298eec65 100644
--- a/lectures/mccall_model.md
+++ b/lectures/mccall_model.md
@@ -369,17 +369,29 @@ We are going to use Numba to accelerate our code.
 
 * See, in particular, the discussion of `@jitclass` in [our lecture on Numba](https://python-programming.quantecon.org/numba.html).
-The following helps Numba by providing some type +The following helps Numba by providing some type specifications. ```{code-cell} python3 mccall_data = [ ('c', float64), # unemployment compensation ('β', float64), # discount factor - ('w', float64[:]), # array of wage values, w[i] = wage at state i - ('q', float64[:]) # array of probabilities + ('w', float64[::1]), # array of wage values, w[i] = wage at state i + ('q', float64[::1]) # array of probabilities ] ``` +```{note} +Note the use of `[::1]` in the array type declarations above. + +This notation specifies that the arrays should be C-contiguous. + +This is important for performance, especially when using the `@` operator for matrix multiplication (e.g., `v @ q`). + +Without this specification, Numba might need to handle non-contiguous arrays, which can significantly slow down these operations. + +Try to replace `[::1]` with `[:]` and see what happens. +``` + Here's a class that stores the data and computes the values of state-action pairs, i.e. the value in the maximum bracket on the right hand side of the Bellman equation {eq}`odu_pv2p`, given the current state and an arbitrary feasible action. diff --git a/lectures/mccall_q.md b/lectures/mccall_q.md index 1be450e6e..4d7b23d5d 100644 --- a/lectures/mccall_q.md +++ b/lectures/mccall_q.md @@ -149,8 +149,8 @@ Then we'll plot various iterates on the Bellman operator. mccall_data = [ ('c', float64), # unemployment compensation ('β', float64), # discount factor - ('w', float64[:]), # array of wage values, w[i] = wage at state i - ('q', float64[:]), # array of probabilities + ('w', float64[::1]), # array of wage values, w[i] = wage at state i + ('q', float64[::1]) # array of probabilities ]
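The contiguity distinction in the note above can be checked directly in NumPy: a freshly allocated array is C-contiguous, while a strided view of it is not, so the view matches `float64[:]` but not `float64[::1]`. A small sketch:

```python
import numpy as np

w = np.linspace(10, 60, 50)   # freshly allocated, hence C-contiguous
view = w[::2]                 # strided view: same data, not contiguous

print(w.flags['C_CONTIGUOUS'])      # True
print(view.flags['C_CONTIGUOUS'])   # False

# np.ascontiguousarray makes a contiguous copy when one is needed
fixed = np.ascontiguousarray(view)
print(fixed.flags['C_CONTIGUOUS'])  # True
```

If a function compiled with a `float64[::1]` signature is handed a non-contiguous array, `np.ascontiguousarray` is the standard way to produce a compatible argument.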