|
4 | 4 | "cell_type": "markdown", |
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | | - "## Using Estimates from Variational, Laplace, or Optimization Methods to Initialize the NUTS-HMC Sampler\n", |
| 7 | + "## Initializing the NUTS-HMC sampler\n", |
8 | 8 | "\n", |
9 | | - "In this example, we show how to use parameter estimates returned by Stan's various posterior approximation or optimization algorithms as initial values for Stan's NUTS-HMC sampler. These include:\n", |
| 9 | + "In this example, we show how to use parameter estimates returned by any of Stan's inference algorithms as initial values for Stan's NUTS-HMC sampler. These include:\n", |
10 | 10 | "\n", |
11 | 11 | "* [Pathfinder](https://mc-stan.org/docs/cmdstan-guide/pathfinder-config.html)\n", |
12 | 12 | "* [ADVI](https://mc-stan.org/docs/cmdstan-guide/variational_config.html)\n", |
13 | 13 | "* [Laplace](https://mc-stan.org/docs/cmdstan-guide/laplace_sample_config.html)\n", |
14 | 14 | "* [Optimization](https://mc-stan.org/docs/cmdstan-guide/optimize_config.html)\n", |
| 15 | + "* [NUTS-HMC MCMC](https://mc-stan.org/docs/cmdstan-guide/mcmc_config.html)\n", |
15 | 16 | "\n", |
16 | | - "By default, the NUTS-HMC sampler randomly initializes all model parameters uniformly in the interval $(-2, 2)$. If this interval is far from the typical set of the posterior, initializing sampling from these approximation algorithms can speed up and improve adaptation.\n", |
| 17 | + "By default, the NUTS-HMC sampler randomly initializes all (unconstrained) model parameters uniformly in the interval (-2, 2). If this interval is far from the typical set of the posterior, initializing the sampler with estimates from these algorithms can speed up and improve adaptation.\n", |
17 | 18 | "\n", |
18 | 19 | "### Model and data\n", |
19 | 20 | "\n", |
|
49 | 50 | "source": [ |
50 | 51 | "### Demonstration with Stan's `pathfinder` method\n", |
51 | 52 | "\n", |
52 | | - "The approximation methods all follow the same general pattern of usage. First, we call the \n", |
| 53 | + "Initializing the sampler with estimates from any previous inference algorithm follows the same general usage pattern. First, we call the \n", |
53 | 54 | "corresponding method on the `CmdStanModel` object. From the resulting fit, we call the `.create_inits()` \n", |
54 | 55 | "method to construct a set of per-chain initializations for the model parameters. To make it explicit, \n", |
55 | 56 | "we will walk through the process using the `pathfinder` method (which wraps the \n", |
|
84 | 85 | "source": [ |
85 | 86 | "Posteriordb provides reference posteriors for all models. For the blr model, conditioned on the dataset `sblri.json`, the reference posteriors can be found in the [sblri-blr.json](https://github.com/stan-dev/posteriordb/blob/master/posterior_database/reference_posteriors/summary_statistics/mean/mean/sblri-blr.json) file.\n", |
86 | 87 | "\n", |
87 | | - "The reference posteriors for all elements of `beta` and `sigma` are all very close to $1.0$.\n", |
| 88 | + "The reference posterior means for all elements of `beta` and `sigma` are very close to 1.0.\n", |
88 | 89 | "\n", |
89 | 90 | "The experiments reported in Figure 3 of the paper [Pathfinder: Parallel quasi-Newton variational inference](https://arxiv.org/abs/2108.03782) by Zhang et al. show that Pathfinder provides a better estimate of the posterior, as measured by the 1-Wasserstein distance to the reference posterior, than 75 iterations of the warmup Phase I algorithm used by the NUTS-HMC sampler.\n", |
90 | 91 | "Furthermore, Pathfinder is more computationally efficient, requiring fewer evaluations of the log density and gradient functions. Therefore, using the Pathfinder estimates to initialize the parameter values for the NUTS-HMC sampler can allow the sampler to do a better job of adapting the stepsize and metric during warmup, resulting in better performance and estimation.\n", |
|
191 | 192 | "cell_type": "markdown", |
192 | 193 | "metadata": {}, |
193 | 194 | "source": [ |
194 | | - "### Other approximation algorithms" |
| 195 | + "### Other inference algorithms" |
195 | 196 | ] |
196 | 197 | }, |
197 | 198 | { |
|
349 | 350 | "cell_type": "markdown", |
350 | 351 | "metadata": {}, |
351 | 352 | "source": [ |
352 | | - "It is also possible to use the output of the `sample()` method to construct inits to be fed into a future sampling run:" |
| 353 | + "It is also possible to use the output of the `sample()` method itself to construct inits to be fed into a future sampling run:" |
353 | 354 | ] |
354 | 355 | }, |
355 | 356 | { |
|
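The per-chain inits discussed above are, per the CmdStanPy documentation, a list of dicts (one per chain) mapping parameter names to initial values, which is what `create_inits()` returns and what `sample(inits=...)` accepts. The sketch below builds such a structure by hand from a single point estimate; the parameter names match the `blr` model (`beta`, `sigma`), but the jittering scheme is purely illustrative and is not CmdStanPy's actual implementation.

```python
import random

def make_inits(estimate, chains=4, jitter=0.1, seed=12345):
    """Build per-chain inits by jittering a single point estimate.

    Returns a list with one dict per chain, each mapping parameter
    names to initial values -- the same shape as the output of
    CmdStanPy's create_inits().
    """
    rng = random.Random(seed)
    inits = []
    for _ in range(chains):
        inits.append({
            name: [v + rng.uniform(-jitter, jitter) for v in value]
            if isinstance(value, list)
            else value + rng.uniform(-jitter, jitter)
            for name, value in estimate.items()
        })
    return inits

# e.g. estimates for the blr model: all betas and sigma near 1.0
estimate = {"beta": [1.0, 1.0, 1.0, 1.0, 1.0], "sigma": 1.0}
inits = make_inits(estimate)
# A structure like this can then be passed to the sampler, e.g.
# model.sample(data=..., inits=inits)
```

In practice there is no need to build this by hand: calling `create_inits()` on the fit object from any of the algorithms listed above produces it directly.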