|
402 | 402 | "Each prior embodies a different epistemological stance on how much structure the data can learn versus how much the analyst must impose. In the unconfounded case, the treatment and outcome errors are independent, so the joint model effectively decomposes into two connected regressions. The treatment effect $\\alpha$ then captures the causal impact of the treatment on the outcome, and under this setting, its posterior should center around the true value of 3. The goal is not to solve confounding yet but to show that when the world is simple and well-behaved, the Bayesian model recovers the truth just as OLS does—but with richer uncertainty quantification and a coherent probabilistic structure.\n", |
403 | 403 | "\n", |
404 | 404 | "The following code defines the model and instantiates it under several prior choices. The model’s graphical representation, produced by `pm.model_to_graphviz()`, visualizes its structure: covariates feed into both the treatment and the outcome equations, the treatment coefficient $\\alpha$ links them, and the two residuals \n", |
405 | | - "$U$ and $V$ are connected through a correlation parameter $\\rho$, which we can freely set to zero or more substantive values. These parameterisations offer us a way to derive insight into the structure of the causal system under study. \n", |
| 405 | + "$U$ and $V$ are connected through a correlation parameter $\\rho$, which we can freely set to zero or more substantive values. These parameterisations offer us a way to derive insight from the structure of the causal system under study. \n", |
406 | 406 | "\n", |
407 | 407 | "### Fitting the Continuous Treatment Model\n", |
408 | 408 | "\n", |
|
5567 | 5567 | "source": [ |
5568 | 5568 | "The results all indicate a positive effect on weight due to the quitting smoking. They vary slightly in the attributed effect but, interestingly even if we try to zero out the correlation between treatment and outcome the model still implies a higher effect than observed in the simpler regression model. The Bayes factor plots repor that the alternative hypothesis $\\alpha \\neq 3$ is between 5 and 12 times more likely than the null hypothesis of $\\alpha = 3$. They also indicate the effect of Bayesian updating by the extent in which the posterior has transformed from the prior in each plot. \n", |
5569 | 5569 | "\n", |
5570 | | - ":::{admonition} Advice for the Practitioner\n", |
5571 | | - ":class: tip\n", |
| 5570 | + "### Applying These Methods\n", |
5572 | 5571 | "\n", |
5573 | | - "We have seen a number of ways in which to model the structural relationships between treatment and outcome for causal inference. In `CausalPy` we will add a flexible API to capture some of these options, but no API can be fully robust for each and every niche problem. You may wish to prioritise one or more of these components in your own modelling. Our main advice here is to model the parameters that matter - the ones that give insight into _the structure of your problem_. Use the structural parameters to diagnose the degree of confounding. Use variable selection priors with care as a diagnostic aid for theoretical instruments. Assess your model in context with a range of reasonable alternatives and report the variation honestly. This process, the careful craft of statistical modelling, underwrites contemporary Bayesian workflow and sound causal inference.\n", |
| 5572 | + "The models demonstrated here are not recipes to be followed mechanically but frameworks for making structural assumptions explicit. Before fitting a Bayesian causal model to real data, ask yourself three questions:\n", |
| 5573 | + "\n", |
| 5574 | + "**First: Can I defend my causal structure theoretically?** Which variables do you believe are confounders, which are instruments, which are irrelevant? Write down your causal graph before writing down your priors. If you cannot justify exclusion restrictions through domain knowledge or institutional understanding, data-driven variable selection will not rescue you—it will merely dress speculation in statistical clothing.\n", |
| 5575 | + "\n", |
| 5576 | + "**Second: How sensitive are my conclusions to structural assumptions?** The confounding parameter ρ is rarely identified from observables alone. Vary your priors on ρ across plausible ranges and observe how your treatment effect estimate shifts. Fit models with normal priors, sparse priors, and theory-driven exclusions. If your causal conclusions are stable across specifications, they're robust. If they vary dramatically, that variation is real epistemic uncertainty and should be reported as such.\n", |
| 5577 | + "\n", |
| 5578 | + "**Third: Where have I placed flexibility, and why?** Automated variable selection and nonparametric methods are powerful tools, but flexibility in the outcome equation can absorb the causal effects you're trying to estimate. As we demonstrated with BART, sufficiently flexible outcome models learn total associations rather than structural parameters. Use flexibility in the treatment equation if needed, but keep the outcome equation constrained to interpretable causal parameters.\n", |
5574 | 5579 | "\n", |
5575 | | - ":::" |
5576 | | - ] |
5577 | | - }, |
5578 | | - { |
5579 | | - "cell_type": "markdown", |
5580 | | - "metadata": {}, |
5581 | | - "source": [ |
5582 | 5580 | "### Conclusion\n", |
5583 | 5581 | "\n", |
5584 | | - "When we specify a Bayesian causal model, we write down a probabilistic program that encodes our beliefs about how data are generated. We must consider which variables influence which, how uncertainty enters, and what invariances we are prepared to assume. Once fitted, the model becomes an _executable nomological machine_: we can run it forward under interventions, perturb its assumptions, and observe the probabilistic consequences. This executable character is what distinguishes structural modelling from purely associational approaches. It allows us to simulate alternative worlds and test the coherence of our causal story, rather than merely report coefficients.\n", |
| 5582 | + "These questions point to what distinguishes structural modeling from purely associational approaches. When we specify a Bayesian causal model, we write down a probabilistic program that encodes our beliefs about how data are generated—which variables influence which, how uncertainty enters, what exclusions hold. Once fitted, the model becomes a working machine we can run forward under interventions, perturb in its assumptions, and interrogate for consequences. This executable character lets us simulate alternative worlds and test the coherence of our causal story, rather than merely report coefficients.\n", |
| 5583 | + "\n", |
| 5584 | + "The virtue of treating causal models as probabilistic programs is twofold. First, it forces us to articulate our causal beliefs explicitly i.e. the graphical, functional, and stochastic components that make the model run. Second, it offers a disciplined way to explore what follows from those beliefs under uncertainty. Bayesian structural causal inference therefore unites an epistemic modesty with computational rigor: each model is a local, provisional machine for generating causal understanding, not a final map of the world.\n", |
5585 | 5585 | "\n", |
5586 | | - "The range of model types in `CausalPy` illustrate that there are a range of such programs. Each program is fit to the world in various ways and appropriate for inference contingent on on how well fit each method is to the world. The sensitivity of our findings to perturbations of the program is an important feature of work in causal inference. The virtue of treating causal models as probabilistic programs is therefore twofold. First, it forces us to articulate the structure of our causal beliefs explicitly — the graphical, functional, and stochastic components that make the model run. Second, it offers a disciplined way to explore what follows from those beliefs under uncertainty. In doing so, Bayesian structural causal inference unites an epistemic modesty with the computational rigor of modern probabilistic programming: each model is a local, provisional machine for generating causal understanding, not a final map of the world fit once and forever.\n", |
| 5586 | + "The credibility revolution's achievement was recognizing that causal claims require more than correlations. Causal inference requires identification strategies. These strategies try to bracket complexity through design. Bayesian structural modeling takes a complementary path: it models complexity explicitly, then explores how robust our conclusions are to structural perturbations. Both approaches succeed when we know not only how our models work, but where they stop working. \n", |
5587 | 5587 | "\n", |
5588 | | - "The discovery of these causal programs and quasi-experimental designs is a genuine achievement of _the credibility revolution_, but we can recognize the value of such programmatic abstractions without mistaking their scope for universality. These designs succeed by bracketing complexity rather than modelling it, allowing us to extract causal signals even when the full data-generating process is unknown or only partially knowable. Causal inference succeeds when we know not only how our models work, but also where they stop working. Every causal model, like every fish tank, is a small world whose regularities we can nurture but never universalize. Our task is not to master the ocean, but to build clear tanks and learn when to change the water.\n", |
| 5588 | + "Every causal model, like every fish tank, is a \"small world\" whose regularities we can nurture but never universalize. Our task is not to master the ocean, but to build clear tanks and learn when to change the water.\n", |
5589 | 5589 | "\n", |
5590 | 5590 | "## References\n", |
5591 | 5591 | ":::{bibliography}\n", |
|