add note on Bollen

NathanielF · NathanielF · commit 9882a0827a75 · 2025-09-28T10:32:12.000+01:00
diff --git a/examples/case_studies/bayesian_sem_workflow.ipynb b/examples/case_studies/bayesian_sem_workflow.ipynb
@@ -1545,7 +1545,9 @@
    "source": [
     "### Model Diagnostics and Assessment\n",
     "\n",
-    "For each latent variable (satisfaction, well being, constructive, dysfunctional), we will plot a forest/ridge plot of the posterior distributions of their factor scores `eta` as drawn. Each panel will have a vertical reference line at 0 (since latent scores are typically centered/scaled).These panels visualize the distribution of estimated latent scores across individuals, separated by latent factor. Then we will summarizes posterior estimates of model parameters (factor loadings, regression coefficients, variances, etc.), providing a quick check against identification constraints (like fixed loadings) and effect directions. Finally we will plot the upper-triangle of the residual correlation matrix with a blue–white–red colormap (−1 to +1). This visualizes residual correlations among observed indicators after the SEM structure is accounted for — helping detect model misfit or unexplained associations."
+    "For each latent variable (satisfaction, well being, constructive, dysfunctional), we will plot a forest/ridge plot of the posterior distributions of their factor scores `eta` as drawn. Each panel will have a vertical reference line at 0 (since latent scores are typically centered/scaled).These panels visualize the distribution of estimated latent scores across individuals, separated by latent factor. Then we will summarizes posterior estimates of model parameters (factor loadings, regression coefficients, variances, etc.), providing a quick check against identification constraints (like fixed loadings) and effect directions. Finally we will plot the upper-triangle of the residual correlation matrix with a blue–white–red colormap (−1 to +1). This visualizes residual correlations among observed indicators after the SEM structure is accounted for — helping detect model misfit or unexplained associations.\n",
+    "\n",
+    "Below these model checks we will plot some diagnostics for the sampler. These plots are aimed at checking whether the sampler has sufficiently explored the parameter space. "
    ]
   },
   {
@@ -1710,19 +1712,45 @@
     "plot_diagnostics(idata_cfa_model_v1, parameters);"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "3b5c7ecb",
+   "metadata": {},
+   "source": [
+    "These plots indicate a fairly promising modelling strategy. The estimated factor loadings are all close to 1, which implies conformity in the magnitude and scale of the indicator metrics within each of the four factors. The indicators are strongly reflective of their latent factor, although `UF1` and `FOR` seem to be moving in opposite directions. We will want to address this later when we specify covariance structures for the residuals.\n",
+    "\n",
+    "The posterior predictive residuals are close to 0, which suggests that the model captures the latent covariance structure of the observed data well. The latent factors move together in intuitive ways, with high Satisfaction ~~ high Well Being. The sampler diagnostics give no indication of trouble. This is a promising start."
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "a5538546",
    "metadata": {},
    "source": [
     "## Structuring the Latent Relations\n",
     "\n",
+    "The next expansionary move in SEM modelling is to consider the relations between the latent constructs. These are generally intended to have a causal interpretation. The constructs are hard to measure precisely but, taken collectively as a function of multiple indicator variables, we argue they are exhaustively characterised.\n",
+    "\n",
+    "> As I have just explained, we cannot isolate a dependent variable from all influences but a single explanatory variable, so it is impossible to make definitive statements about causes. We replace perfect isolation with pseudo-isolation by assuming that the disturbance (i.e., the composite of all omitted determinants) is uncorrelated with the exogenous variables of an equation. - Bollen in _Structural Equations with Latent Variables_ pg45\n",
+    "\n",
+    "This is a claim of conditional independence which licenses the causal interpretation of the arrows in the plot below. The fact that the latent relations operate at a higher level of abstraction makes it easier to postulate these \"clean\" direct paths between constructs. The model makes an argument: that it has properly measured the latent constructs and isolated their variation well enough to support a causal claim. Criticisms of the model proceed by assessing how compelling these postulates are in the context of the fitted model.\n",
+    "\n",
     "![](sem3_excalidraw.png)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "544e9848",
+   "metadata": {},
+   "source": [
+    "The isolation or conditional independence of interest is encoded in the model through the sampling of the `gamma` variable. These draws come from a process that is structurally divorced from the influence of the exogenous variables. For instance, if we have $\\gamma_{cts} \\perp\\!\\!\\!\\perp \\eta_{dtp}$ then the $\\beta_{cts \\to dtp}$ coefficient is an unbiased estimate of the direct effect of `CTS` on `DTP`, because the remaining variation in $\\eta_{dtp}$ is noise by construction.\n",
+    "\n",
+    "How many arrows you add to your system is a modelling choice. In our case we have structured the DAG following the discussion in {cite:p}`vehkalahti2019multivariate`, which will allow us to unpick the direct and indirect effects below."
+   ]
+  },
   {
    "cell_type": "code",
-   "execution_count": 12,
+   "execution_count": null,
    "id": "8b0738c0",
    "metadata": {},
    "outputs": [
@@ -2135,10 +2163,11 @@
     "\n",
     "    B = make_B()\n",
     "    I = pt.eye(latent_dim)\n",
+    "    ## Clean Causal Influence of Shocks\n",
     "    eta = pm.Deterministic(\n",
     "        \"eta\", pt.slinalg.solve(I - B + 1e-8 * I, gamma.T).T, dims=(\"obs\", \"latent\")\n",
     "    )\n",
-    "\n",
+    "    ## Influence of Exogenous indicator variables\n",
     "    mu = pt.dot(eta, Lambda.T)\n",
     "\n",
     "    ## Error Terms\n",
@@ -2148,6 +2177,14 @@
     "pm.model_to_graphviz(sem_model_v1)"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "19b49daf",
+   "metadata": {},
+   "source": [
+    "We have also added a covariance structure on the residuals by supplying a multivariate normal likelihood whose covariance structure adds a correlation between the `UF1` and `FOR` indicators."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 13,
@@ -2660,7 +2697,9 @@
    "id": "c469d2a9",
    "metadata": {},
    "source": [
-    "### Model Diagnostics and Assessment"
+    "### Model Diagnostics and Assessment\n",
+    "\n",
+    "The model shows improvement in the posterior predictive checks on the model-implied residuals. Additionally, we now get insight into the implied paths and relationships between the latent constructs. These move in compelling ways. Dysfunctional thought processes have a probable negative impact on well being, and similarly on job satisfaction. Conversely, constructive thought processes have a probable positive direct effect on well being and satisfaction, although the latter appears slight."
    ]
   },
   {
@@ -2690,6 +2729,14 @@
     "plot_model_highlights(idata_sem_model_v1, \"SEM\", parameters, sem=True);"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "f1b1795d",
+   "metadata": {},
+   "source": [
+    "However, the model diagnostics appear less robust. The sampler seems to have had difficulty sampling the path coefficients `mu_betas`."
+   ]
+  },
   {
    "cell_type": "code",
    "execution_count": 16,
diff --git a/examples/case_studies/bayesian_sem_workflow.myst.md b/examples/case_studies/bayesian_sem_workflow.myst.md
@@ -538,6 +538,8 @@ idata_cfa_model_v1["posterior"]["Lambda"].sel(chain=0, draw=0)
 
 For each latent variable (satisfaction, well being, constructive, dysfunctional), we will plot a forest/ridge plot of the posterior distributions of their factor scores `eta` as drawn. Each panel will have a vertical reference line at 0 (since latent scores are typically centered/scaled).These panels visualize the distribution of estimated latent scores across individuals, separated by latent factor. Then we will summarizes posterior estimates of model parameters (factor loadings, regression coefficients, variances, etc.), providing a quick check against identification constraints (like fixed loadings) and effect directions. Finally we will plot the upper-triangle of the residual correlation matrix with a blue–white–red colormap (−1 to +1). This visualizes residual correlations among observed indicators after the SEM structure is accounted for — helping detect model misfit or unexplained associations.
 
+Below these model checks we will plot some diagnostics for the sampler. These plots are aimed at checking whether the sampler has sufficiently explored the parameter space. 
+
 ```{code-cell} ipython3
 :tags: [hide-input]
 
@@ -647,10 +649,28 @@ plot_model_highlights(idata_cfa_model_v1, "CFA", parameters)
 plot_diagnostics(idata_cfa_model_v1, parameters);
 ```
 
+These plots indicate a fairly promising modelling strategy. The estimated factor loadings are all close to 1, which implies conformity in the magnitude and scale of the indicator metrics within each of the four factors. The indicators are strongly reflective of their latent factor, although `UF1` and `FOR` seem to be moving in opposite directions. We will want to address this later when we specify covariance structures for the residuals.
+
+The posterior predictive residuals are close to 0, which suggests that the model captures the latent covariance structure of the observed data well. The latent factors move together in intuitive ways, with high Satisfaction ~~ high Well Being. The sampler diagnostics give no indication of trouble. This is a promising start.
+
++++
+
 ## Structuring the Latent Relations
 
+The next expansionary move in SEM modelling is to consider the relations between the latent constructs. These are generally intended to have a causal interpretation. The constructs are hard to measure precisely but, taken collectively as a function of multiple indicator variables, we argue they are exhaustively characterised.
+
+> As I have just explained, we cannot isolate a dependent variable from all influences but a single explanatory variable, so it is impossible to make definitive statements about causes. We replace perfect isolation with pseudo-isolation by assuming that the disturbance (i.e., the composite of all omitted determinants) is uncorrelated with the exogenous variables of an equation. - Bollen in _Structural Equations with Latent Variables_ pg45
+
+This is a claim of conditional independence which licenses the causal interpretation of the arrows in the plot below. The fact that the latent relations operate at a higher level of abstraction makes it easier to postulate these "clean" direct paths between constructs. The model makes an argument: that it has properly measured the latent constructs and isolated their variation well enough to support a causal claim. Criticisms of the model proceed by assessing how compelling these postulates are in the context of the fitted model.
+
 ![](sem3_excalidraw.png)
 
++++
+
+The isolation or conditional independence of interest is encoded in the model through the sampling of the `gamma` variable. These draws come from a process that is structurally divorced from the influence of the exogenous variables. For instance, if we have $\gamma_{cts} \perp\!\!\!\perp \eta_{dtp}$ then the $\beta_{cts \to dtp}$ coefficient is an unbiased estimate of the direct effect of `CTS` on `DTP`, because the remaining variation in $\eta_{dtp}$ is noise by construction.
+
+How many arrows you add to your system is a modelling choice. In our case we have structured the DAG following the discussion in {cite:p}`vehkalahti2019multivariate`, which will allow us to unpick the direct and indirect effects below.
+
 ```{code-cell} ipython3
 with pm.Model(coords=coords) as sem_model_v1:
 
@@ -672,10 +692,11 @@ with pm.Model(coords=coords) as sem_model_v1:
 
     B = make_B()
     I = pt.eye(latent_dim)
+    ## Clean Causal Influence of Shocks
     eta = pm.Deterministic(
         "eta", pt.slinalg.solve(I - B + 1e-8 * I, gamma.T).T, dims=("obs", "latent")
     )
-
+    ## Influence of Exogenous indicator variables
     mu = pt.dot(eta, Lambda.T)
 
     ## Error Terms
@@ -685,6 +706,8 @@ with pm.Model(coords=coords) as sem_model_v1:
 pm.model_to_graphviz(sem_model_v1)
 ```
 
+We have also added a covariance structure on the residuals by supplying a multivariate normal likelihood whose covariance structure adds a correlation between the `UF1` and `FOR` indicators.
+
 ```{code-cell} ipython3
 idata_sem_model_v1 = sample_model(sem_model_v1, sampler_kwargs)
 ```
@@ -698,11 +721,15 @@ idata_sem_model_v1["posterior"]["B_"].sel(chain=0, draw=0)
 
 ### Model Diagnostics and Assessment
 
+The model shows improvement in the posterior predictive checks on the model-implied residuals. Additionally, we now get insight into the implied paths and relationships between the latent constructs. These move in compelling ways. Dysfunctional thought processes have a probable negative impact on well being, and similarly on job satisfaction. Conversely, constructive thought processes have a probable positive direct effect on well being and satisfaction, although the latter appears slight.
+
 ```{code-cell} ipython3
 parameters = ["mu_betas", "lambdas1", "lambdas2", "lambdas3", "lambdas4"]
 plot_model_highlights(idata_sem_model_v1, "SEM", parameters, sem=True);
 ```
 
+However, the model diagnostics appear less robust. The sampler seems to have had difficulty sampling the path coefficients `mu_betas`.
+
 ```{code-cell} ipython3
 plot_diagnostics(idata_sem_model_v1, parameters);
 ```