Commit 7557086
committed: minor edits to the linear regression notebook
1 parent 1bf0ae8

File tree: 1 file changed (+27 −6 lines)


examples/Linear_Regression_Starter.ipynb

Lines changed: 27 additions & 6 deletions
@@ -415,7 +415,7 @@
 "id": "PmSWcEVB86pr"
 },
 "source": [
-"Let me elaborate on a few adapter steps:\n",
+"Let us discuss a few adapter steps in detail:\n",
 "\n",
 "The `.broadcast(\"N\", to=\"x\")` transform will copy the value of `N` batch-size times to ensure that it will also have a `batch_size` dimension even though it was actually just a single value, constant over all datasets within a batch. The batch dimension will be inferred from `x` (this needs to be present during inference).\n",
 "\n",
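The effect of this broadcast step can be illustrated outside of bayesflow with plain numpy. The following is a hypothetical sketch of the idea, not the library's implementation; the array shapes mirror the tutorial's example (batch size 500, `N = 14`):

```python
import numpy as np

# Hypothetical stand-in for a batch of 500 simulated datasets,
# each containing N = 14 observations of a predictor x.
batch_size, N = 500, 14
x = np.random.randn(batch_size, N)

# N itself is a single scalar, constant over all datasets in the batch.
N_value = np.array(N)

# Broadcasting copies the scalar batch_size times, so N gains a
# batch dimension inferred from x (here: 500).
N_broadcast = np.broadcast_to(N_value, (x.shape[0], 1))

print(N_broadcast.shape)  # (500, 1)
```

Every entry of `N_broadcast` is the same value; only the shape changes so that `N` lines up with the per-dataset variables.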
@@ -469,7 +469,7 @@
 "source": [
 "Those shapes are as we expect them to be. The first dimension is always the batch size, which was 500 for our example data. All variables adhere to this rule since the first dimension is indeed 500.\n",
 "\n",
-"For `summary_variables`, the second dimension is equal to `N`, which happend to be sampled as `14` for these example data. It's third dimension is `2`, since we have combined `x` and `y` into summary variables, each of which are vectors of length `N` within each simulated dataset.\n",
+"For `summary_variables`, the second dimension is equal to the sampled value of `N`. Its third dimension is `2`, since we have combined `x` and `y` into summary variables, each of which is a vector of length `N` within each simulated dataset.\n",
 "\n",
 "For `inference_conditions`, the second dimension is just `1` because we have passed only the scalar variable `N` there.\n",
 "\n",
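The shapes described above can be reproduced with a small numpy sketch. The variable names mirror the text, but the construction itself is an illustrative assumption, not bayesflow's internal code:

```python
import numpy as np

batch_size, N = 500, 14
x = np.random.randn(batch_size, N)
y = np.random.randn(batch_size, N)

# Stacking x and y along a trailing axis yields the summary variables:
# shape (batch_size, N, 2), with the batch size first, as always.
summary_variables = np.stack([x, y], axis=-1)

# The scalar N, broadcast over the batch, plays the role of the
# inference conditions: shape (batch_size, 1).
inference_conditions = np.full((batch_size, 1), N)

print(summary_variables.shape)    # (500, 14, 2)
print(inference_conditions.shape) # (500, 1)
```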
@@ -620,7 +620,7 @@
 "source": [
 "Let's check out the resulting inference. Say we want to obtain 1000 posterior samples from our approximate posterior for a simulated dataset where we know the ground-truth values.\n",
 "\n",
-"You can also explore the automated diagnostics avilable through the methods:\n",
+"You can also explore the automated diagnostics available through the methods:\n",
 "\n",
 "- `compute_default_diagnostics()`\n",
 "- `plot_default_diagnostics()`"
@@ -793,7 +793,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Accuracy looks good for most datasets There is some more variation especially for $\\beta_2$ but this is not necessarily a reason for concern. Keep in mind that perfect accuracy is not the goal of bayesflow inference. Rather, the goal is to estimate the correct posterior as close as possible. And this correct posterior might very well be far away from the true value for some datasets. In fact, we would fully expect the true value to sometimes be at the tail of the posterior. \n",
+"Accuracy looks good for most datasets. There is some more variation, especially for $\\beta_1$ and $\\sigma$, but this is not necessarily a reason for concern. Keep in mind that perfect accuracy is not the goal of bayesflow inference. Rather, the goal is to approximate the correct posterior as closely as possible. And this correct posterior might very well be far away from the true value for some datasets. In fact, we would fully expect the true value to sometimes be at the tail of the posterior.\n",
 "\n",
 "If this were not the case, then our posterior approximation may be too wide. Unfortunately, in many cases we don't have access to the correct posterior, so we need a method that indicates the accuracy of the posterior approximation without it. This is where simulation-based calibration (SBC) comes into play. In short, if the true values are simulated from the prior used during inference (as is the case for our validation data above), we would expect the rank of the true parameter value to be uniformly distributed from 1 to `num_samples`."
 ]
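The SBC logic can be demonstrated with a self-contained toy simulation (an illustrative sketch, not bayesflow's SBC implementation). Here the "approximator" is perfectly calibrated by construction, because the posterior draws come from the same distribution as the true values, so the ranks come out uniform:

```python
import numpy as np

rng = np.random.default_rng(1)
num_datasets, num_samples = 2000, 100

# True parameter values drawn from the prior.
theta_true = rng.normal(size=num_datasets)

# A perfectly calibrated toy "posterior": draws from the same
# distribution as the true values (stand-in for real inference).
posterior_draws = rng.normal(size=(num_datasets, num_samples))

# Rank of each true value among its posterior draws: 0 .. num_samples.
ranks = (posterior_draws < theta_true[:, None]).sum(axis=1)

# Under calibration, the ranks are (discretely) uniform, so their
# mean should be close to num_samples / 2.
print(ranks.min(), ranks.max())  # roughly 0 and num_samples
```

A miscalibrated approximator (e.g. one producing overly wide posteriors) would instead pile the ranks up in the middle of the range.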
@@ -845,7 +845,7 @@
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The histograms look quite good overall, but could be a bit more uniform especially for $\\beta_0$. That said, the SBC histograms have some drawbacks on how the confidence bands are computed, so we recommend using another kind of plot that is based on the empirical cumulative distribution function (ECDF). For the ECDF, we can compute better confidence bands than for histograms, so the SBC ECDF plot is usually preferable. [This SBC interpretation guide by Martin Modrák](https://hyunjimoon.github.io/SBC/articles/rank_visualizations.html) gives further background information and also practical examples of how to interpret the SBC plots."
+"The histograms look quite good overall, but could be a bit more uniform. That said, the SBC histograms have some drawbacks in how the confidence bands are computed, so we recommend another kind of plot, based on the empirical cumulative distribution function (ECDF). For the ECDF, we can compute better confidence bands than for histograms, so the SBC ECDF plot is usually preferable. [This SBC interpretation guide by Martin Modrák](https://hyunjimoon.github.io/SBC/articles/rank_visualizations.html) gives further background information as well as practical examples of how to interpret SBC plots."
 ]
 },
 {
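As a sketch of what the ECDF-based check computes (an illustrative numpy version, not bayesflow's plotting code): the ECDF of the normalized ranks should stay close to the identity line $F(t) = t$, and the confidence bands quantify how far it may stray by chance:

```python
import numpy as np

rng = np.random.default_rng(2)
num_datasets, num_samples = 2000, 100

# Uniform ranks, as expected under well-calibrated inference.
ranks = rng.integers(0, num_samples + 1, size=num_datasets)
u = ranks / num_samples  # normalize ranks to [0, 1]

# Empirical CDF evaluated on a grid; under calibration it tracks
# the identity line F(t) = t.
grid = np.linspace(0, 1, 21)
ecdf = (u[:, None] <= grid[None, :]).mean(axis=0)

# Maximum deviation from the identity line (small for uniform ranks).
max_dev = np.abs(ecdf - grid).max()
print(round(max_dev, 3))  # typically below 0.05 for 2000 draws
```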
@@ -885,7 +885,7 @@
 "id": "nv6ipPY6EyBv"
 },
 "source": [
-"The plot confirms that the approximate posteriors are well calibrated, except for the small issues in the posteriors of $\\beta_0$ that we had already seen in the histograms. Likely, for fully well calibrated inference, we would have to train the approximator a little longer, but that's okay. After all, we can effort a little more training time since afterwards, inference on any number of new (real or simulated) datasets is very fast due to amortization."
+"The plot confirms that the approximate posteriors are well calibrated, except for the small issues that we had already seen in the histograms. For fully well-calibrated inference, we would likely have to train the approximator a little longer, but that's okay. After all, we can afford a little more training time since, afterwards, inference on any number of new (real or simulated) datasets is very fast due to amortization."
 ]
 },
 {
@@ -941,6 +941,13 @@
 "### Saving and Loading the Trained Networks"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"For saving and later reloading bayesflow approximators, we provide some convenient functionality. For saving, we use:"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 32,
@@ -954,6 +961,13 @@
 "# approximator.save_weights(filepath=\"checkpoints/regression.h5\")"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"For loading, we then use:"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 33,
@@ -973,6 +987,13 @@
 "approximator = keras.saving.load_model(\"checkpoints/regression.keras\")"
 ]
 },
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"All the usual methods continue to work on the loaded approximator. For example:"
+]
+},
 {
 "cell_type": "code",
 "execution_count": 34,
