Commit 84f6198

add writing analogy
1 parent 8189e5b commit 84f6198

File tree: 2 files changed (+45 −5 lines)

examples/case_studies/bayesian_sem_workflow.ipynb

Lines changed: 27 additions & 3 deletions
@@ -5754,9 +5754,9 @@
    "id": "e51eaf54",
    "metadata": {},
    "source": [
-    "#### Parameter Recovery Plots\n",
+    "#### The Parameter Recovery Process\n",
     "\n",
-    "But we can actually assess the degree of parameter recovery because we know the true values"
+    "But we can actually assess the degree of parameter recovery because we know the true values. This is a pivotal part of the model-building process, akin to how writers read their own work aloud to test it for assonance, cogency and flow. In simulation-based probabilistic modelling we should be able to generate data from models with known parameters, and recover the latent parameter values through the inferential workflow."
    ]
   },
   {
@@ -5817,12 +5817,26 @@
    ");"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "41e491be",
+   "metadata": {},
+   "source": [
+    "Here we see how the posterior distributions “recover” the true values within uncertainty, ensuring the model is faithful to the data generating process. Were the effort at parameter recovery to fail, we would equally have learned something about our model. Parameter recovery exercises help discover issues of mis-specification or unidentified parameters. Put another way, they tell us how informative our data is with respect to our data generating model. Verlyn Klinkenborg starts his justly famous book _Several Short Sentences About Writing_ with the following advice:\n",
+    "\n",
+    "> \"Here, in short, is what I want to tell you. Know what each sentence says, What it doesn't say, And what it implies. Of these, the hardest is knowing what each sentence actually says\" - V. Klinkenborg\n",
+    "\n",
+    "This advice transfers exactly to the art of statistical modelling. To know what our model says, we need to say it aloud. We need to feel how it lands with an audience. We need to understand its implications and limitations. The Bayesian workflow explores the depths of meaning achieved by our statistical approximations. It traces out the effects of interlocking components and the layered interactions of structural regressions. In each articulation we're testing which flavours of reality resonate in the telling. What shape does the posterior take? How plausible is the range of values? How faithful are our predictions to reality? On these questions we weigh each model just as the writer weighs each sentence for its effects."
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "bf4b62e5",
    "metadata": {},
    "source": [
-    "### Hypothesis Evaluation: How does group membership change direct effects?"
+    "### Hypothesis Evaluation: How does group membership change direct effects?\n",
+    "\n",
+    "In this case we've encoded a difference in the regression effects for each of the two groups. These effects are easily recovered with quantifiable measures of uncertainty around the treatment effects."
    ]
   },
   {
58285842
{
@@ -5859,6 +5873,16 @@
    ");"
    ]
   },
+  {
+   "cell_type": "markdown",
+   "id": "59c4d17a",
+   "metadata": {},
+   "source": [
+    "In an applied setting it's these kinds of implications that are crucially important to surface and understand. From a workflow point of view we want to ensure that our modelling drives clarity on these precise points and avoids adding noise more generally.\n",
+    "\n",
+    "Another way we might interrogate the implications of a model is to see how well it can predict \"downstream\" outcomes of the implied model. In the job-satisfaction setting we might ask how job satisfaction relates to attrition risk."
+   ]
+  },
   {
    "cell_type": "markdown",
    "id": "6636967a",

examples/case_studies/bayesian_sem_workflow.myst.md

Lines changed: 18 additions & 2 deletions
@@ -1185,9 +1185,9 @@ The model samples well and gives good evidence of distinct posterior distributio
 az.plot_trace(idata_hierarchical, var_names=["mu_betas_treatment", "mu_betas_control", "Lambda"]);
 ```
 
-#### Parameter Recovery Plots
+#### The Parameter Recovery Process
 
-But we can actually assess the degree of parameter recovery because we know the true values
+But we can actually assess the degree of parameter recovery because we know the true values. This is a pivotal part of the model-building process, akin to how writers read their own work aloud to test it for assonance, cogency and flow. In simulation-based probabilistic modelling we should be able to generate data from models with known parameters, and recover the latent parameter values through the inferential workflow.
 
 ```{code-cell} ipython3
 az.plot_posterior(
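The recovery check described above can be sketched in miniature. This is not the notebook's actual SEM: it is a hypothetical one-coefficient regression with a known true value, used only to show the simulate-then-recover loop (all names and values here are illustrative assumptions).

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical stand-in for the SEM: one regression coefficient with a
# known true value, so recovery can be checked directly.
true_beta, sigma = 0.75, 1.0
x = rng.normal(size=500)
y = true_beta * x + rng.normal(scale=sigma, size=500)

# With a flat prior and known sigma, the posterior for beta is analytic:
# Normal(beta_hat, sigma / sqrt(sum(x**2))) -- no MCMC sampler needed here.
beta_hat = (x @ y) / (x @ x)
post_sd = sigma / np.sqrt(x @ x)
post_samples = rng.normal(beta_hat, post_sd, size=4000)

# Recovery check: does the 94% credible interval cover the true value?
lo, hi = np.quantile(post_samples, [0.03, 0.97])
print(f"true={true_beta}, interval=({lo:.3f}, {hi:.3f}), covered={lo < true_beta < hi}")
```

In the notebook proper the same check is done graphically with `az.plot_posterior` and reference lines at the known true values; the logic is identical, just applied to every latent parameter at once.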
@@ -1203,8 +1203,18 @@ az.plot_posterior(
 );
 ```
 
+Here we see how the posterior distributions "recover" the true values within uncertainty, ensuring the model is faithful to the data generating process. Were the effort at parameter recovery to fail, we would equally have learned something about our model. Parameter recovery exercises help discover issues of mis-specification or unidentified parameters. Put another way, they tell us how informative our data is with respect to our data generating model. Verlyn Klinkenborg starts his justly famous book _Several Short Sentences About Writing_ with the following advice:
+
+> "Here, in short, is what I want to tell you. Know what each sentence says, What it doesn't say, And what it implies. Of these, the hardest is knowing what each sentence actually says" - V. Klinkenborg
+
+This advice transfers exactly to the art of statistical modelling. To know what our model says, we need to say it aloud. We need to feel how it lands with an audience. We need to understand its implications and limitations. The Bayesian workflow explores the depths of meaning achieved by our statistical approximations. It traces out the effects of interlocking components and the layered interactions of structural regressions. In each articulation we're testing which flavours of reality resonate in the telling. What shape does the posterior take? How plausible is the range of values? How faithful are our predictions to reality? On these questions we weigh each model just as the writer weighs each sentence for its effects.
+
++++
+
 ### Hypothesis Evaluation: How does group membership change direct effects?
 
+In this case we've encoded a difference in the regression effects for each of the two groups. These effects are easily recovered with quantifiable measures of uncertainty around the treatment effects.
+
 ```{code-cell} ipython3
 diff = (
 idata_hierarchical["posterior"]["mu_betas_control"]
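The group contrast computed in the cell above works draw by draw on the posterior. A minimal numpy sketch of the same idea, with hypothetical stand-in draws in place of the notebook's `mu_betas_control` and `mu_betas_treatment` arrays (the means and spreads below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical posterior draws standing in for the two group-level
# regression effects (4000 draws each).
mu_control = rng.normal(0.2, 0.05, size=4000)
mu_treatment = rng.normal(0.5, 0.05, size=4000)

# The contrast is itself a posterior quantity: subtract draw by draw,
# just as the notebook subtracts the two InferenceData variables.
diff = mu_control - mu_treatment

lo, hi = np.quantile(diff, [0.03, 0.97])
prob_negative = (diff < 0).mean()
print(f"94% interval: ({lo:.3f}, {hi:.3f}); P(control < treatment) = {prob_negative:.3f}")
```

Because the subtraction is done per draw, the resulting interval and tail probability carry the full posterior uncertainty of both groups, which is exactly the "quantifiable uncertainty around the treatment effects" the text refers to.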
@@ -1217,6 +1227,12 @@ plt.suptitle(
 );
 ```
 
+In an applied setting it's these kinds of implications that are crucially important to surface and understand. From a workflow point of view we want to ensure that our modelling drives clarity on these precise points and avoids adding noise more generally.
+
+Another way we might interrogate the implications of a model is to see how well it can predict "downstream" outcomes of the implied model. In the job-satisfaction setting we might ask how job satisfaction relates to attrition risk.
+
++++
+
 ## Discrete Choice Component
 
 ```{code-cell} ipython3