
Commit 4a5100d

DOC: Add statistical references to EABM (#176)
* DOC: Add reference to "The Book of Statistical Proofs"
* DOC: Add citation to "The Book of Statistical Proofs"
* DOC: Add another reference to The Book of Statistical Proofs
* DOC: Add links to preliz functions
* DOC: Add another link to maxent in preliz

  Updated the reference to the `maxent` function to include a link for better clarity.
1 parent d5150ea commit 4a5100d

4 files changed: +17 -9 lines changed

Chapters/Model_comparison.qmd

Lines changed: 3 additions & 3 deletions
@@ -98,7 +98,7 @@ for p, ax in zip([0.5, 0.1, 0.9, 0.0001], axes.ravel()):
     ax.set_ylim(-0.05, 1.05)
 ```
 
-The concept of entropy appears many times in statistics. It can be useful, for example, when defining priors: in general we want to use a prior that has maximum entropy given our knowledge (see for example [PreliZ](https://preliz.readthedocs.io/en/latest/)'s `maxent` function). It is also useful when comparing models, as we will see in the next section.
+The concept of entropy appears many times in statistics. It can be useful, for example, when defining priors: in general we want to use a prior that has maximum entropy given our knowledge (see for example [PreliZ](https://preliz.readthedocs.io/en/latest/)'s [`maxent`](https://preliz.readthedocs.io/en/latest/unidimensional.html#preliz.unidimensional.maxent) function). It is also useful when comparing models, as we will see in the next section.
 
 ## KL divergence {#sec-kl-divergence}
 
@@ -118,7 +118,7 @@ $$
 $$
 
 
-Suppose $p$ represents the **data generating process**, the **population**, or the **true** distribution, and $q$ represents our model. It may seem that these expressions are all useless because we don't know $p$; that is the reason we are trying to fit a model in the first place. But if our goal is to compare $m$ models represented by $q_0, q_1 \cdots q_m$, we can still use the KL divergence to compare them! The reason is that even when we do not know $p$, its entropy is a constant term for all comparisons.
+Suppose $p$ represents the **data generating process**, the **population**, or the **true** distribution, and $q$ represents our model. It may seem that these expressions are all useless because we don't know $p$; that is the reason we are trying to fit a model in the first place. But if our goal is to compare $m$ models represented by $q_0, q_1, \cdots, q_m$, we can still use the KL divergence to compare them! The reason is that even when we do not know $p$, its entropy is a constant term for all comparisons.
 
 $$
 \begin{split}
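The hunk above argues that $H(p)$ is a shared constant across candidates, so ranking models by KL divergence is the same as ranking them by cross-entropy. A small numeric sketch of that point (the distributions are made up for illustration; this is not part of the commit):

```python
import numpy as np
from scipy.stats import entropy

p = np.array([0.1, 0.2, 0.3, 0.4])       # stand-in for the unknown "true" distribution
q0 = np.array([0.25, 0.25, 0.25, 0.25])  # candidate model 0
q1 = np.array([0.15, 0.25, 0.25, 0.35])  # candidate model 1

for name, q in [("q0", q0), ("q1", q1)]:
    kl = entropy(p, q)                   # KL(p || q)
    cross = -(p * np.log(q)).sum()       # cross-entropy H(p, q) = H(p) + KL(p || q)
    print(f"{name}: KL = {kl:.4f}, H(p, q) = {cross:.4f}, H(p) = {cross - kl:.4f}")
# H(p) comes out identical for both candidates, so either quantity gives the same ranking.
```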
@@ -493,7 +493,7 @@ $$
 BF_{01} = \frac{p(y \mid H_0)}{p(y \mid H_1)} = \frac{p(\theta=0.5 \mid y, H_1)}{p(\theta=0.5 \mid H_1)}
 $$
 
-This is true only when $H_0$ is a particular case of $H_1$; [see](https://statproofbook.github.io/P/bf-sddr).
+This is true only when $H_0$ is a particular case of $H_1$; see [The Book of Statistical Proofs](https://statproofbook.github.io/P/bf-sddr) [@soch_2024].
 
 Let's do it. We only need to sample the prior and posterior for a model. Let's try the BetaBinomial model with a Uniform prior:
 
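The sampling cell itself is not included in this diff. A minimal sketch of the idea, assuming PyMC for sampling and ArviZ's `plot_bf` for the Savage-Dickey ratio, with hypothetical data:

```python
import numpy as np
import arviz as az
import pymc as pm

y = np.repeat([0, 1], [4, 6])  # hypothetical data: 6 successes in 10 trials

with pm.Model() as model:
    theta = pm.Beta("theta", 1, 1)            # Uniform prior on theta
    pm.Bernoulli("obs", p=theta, observed=y)
    idata = pm.sample()
    idata.extend(pm.sample_prior_predictive())

# Savage-Dickey: compare prior and posterior densities at theta = 0.5
az.plot_bf(idata, var_name="theta", ref_val=0.5)
```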
Chapters/Prior_elicitation.qmd

Lines changed: 5 additions & 5 deletions
@@ -107,7 +107,7 @@ For some priors in a model, we may know or assume that most of the mass is withi
 
 ## Maximum entropy distributions with maxent
 
-In PreliZ we can compute maximum entropy priors using the function `maxent`. It works for unidimensional distributions. The first argument is a PreliZ distribution. Then we specify an upper and lower bound and the probability between them.
+In PreliZ we can compute maximum entropy priors using the function [`maxent`](https://preliz.readthedocs.io/en/latest/unidimensional.html#preliz.unidimensional.maxent). It works for unidimensional distributions. The first argument is a PreliZ distribution. Then we specify an upper and lower bound and the probability between them.
 
 As an example, we want to elicit a scale parameter. From domain knowledge we know the parameter has a relatively high probability of being less than 3. Hence, we could use a HalfNormal distribution and do:
 
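The cell that follows "and do:" is not shown in this diff. A sketch of what it might look like, under the assumption that "relatively high probability" means 90% of the mass (the 0.9 is an illustrative choice):

```python
import preliz as pz

# Maximum-entropy HalfNormal with 90% of its mass between 0 and 3
pz.maxent(pz.HalfNormal(), lower=0, upper=3, mass=0.9)
```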
@@ -140,10 +140,10 @@ dist_mean.summary(), dist_mode.summary()
 
 ## Other direct elicitation methods from PreliZ
 
-There are many other methods for direct elicitation of parameters. For instance, the [quartile](https://preliz.readthedocs.io/en/latest/unidimensional.html#preliz.unidimensional.quartile) function identifies a distribution that matches specified
-quartiles, and [QuartileInt](https://preliz.readthedocs.io/en/latest/unidimensional.html#preliz.unidimensional.QuartileInt) provides an interactive approach to achieve the same, offering a more hands-on experience for refining distributions.
+There are many other methods for direct elicitation of parameters. For instance, the [`quartile`](https://preliz.readthedocs.io/en/latest/unidimensional.html#preliz.unidimensional.quartile) function identifies a distribution that matches specified
+quartiles, and [`QuartileInt`](https://preliz.readthedocs.io/en/latest/unidimensional.html#preliz.unidimensional.QuartileInt) provides an interactive approach to achieve the same, offering a more hands-on experience for refining distributions.
 
-One method worthy of special mention is the [Roulette](https://preliz.readthedocs.io/en/latest/unidimensional.html#preliz.unidimensional.Roulette) method, which allows users to find a prior distribution by drawing it interactively [@morris_2014]. The name "roulette" comes from the analogy of placing a limited set of chips where one believes the mass of a distribution should be concentrated. In this method, a grid of `m` equally sized bins is provided, covering the range of `x`, and users allocate a total of `n` chips across the bins. Effectively, this creates a histogram, representing the user's information about the distribution. The method then identifies the best-fitting distribution from a predefined pool of options, translating the drawn histogram into a suitable probabilistic model.
+One method worthy of special mention is the [`Roulette`](https://preliz.readthedocs.io/en/latest/unidimensional.html#preliz.unidimensional.Roulette) method, which allows users to find a prior distribution by drawing it interactively [@morris_2014]. The name "roulette" comes from the analogy of placing a limited set of chips where one believes the mass of a distribution should be concentrated. In this method, a grid of `m` equally sized bins is provided, covering the range of `x`, and users allocate a total of `n` chips across the bins. Effectively, this creates a histogram, representing the user's information about the distribution. The method then identifies the best-fitting distribution from a predefined pool of options, translating the drawn histogram into a suitable probabilistic model.
 
 As this is an interactive method we can't show it here, but you can run the following cell to see how it works.
 
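For readers who have not used these functions, a rough sketch of the calls (all numbers and ranges below are hypothetical, not from the book):

```python
import preliz as pz

# Find the Gamma whose quartiles best match 2, 4, and 7 (hypothetical values)
pz.quartile(pz.Gamma(), 2, 4, 7)

# Launch the interactive Roulette grid in a notebook; the x-range is an assumed
# example covering where we believe the mass could plausibly lie
pz.Roulette(x_min=0, x_max=10)
```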
@@ -253,7 +253,7 @@ The new priors still generate some values that are too wide, but at least the bu
 
 The process described in the previous section is straightforward: sample from the prior predictive --> plot --> refine --> repeat. On the good side, this is a very flexible approach and can be a good way to understand the effect of individual parameters on the predictions of a model. But it can be time-consuming and it requires some understanding of the model so you know which parameters to tweak and in which direction.
 
-One way to improve this workflow is by adding interactivity. We can do this with PreliZ's `predictive_explorer` function. We cannot show it here in its full glory, but you can see a static image in @fig-predictive-explorer, and you can try it for yourself by running the following block of code.
+One way to improve this workflow is by adding interactivity. We can do this with PreliZ's [`predictive_explorer`](https://preliz.readthedocs.io/en/latest/predictive.html#preliz.predictive.predictive_explorer) function. We cannot show it here in its full glory, but you can see a static image in @fig-predictive-explorer, and you can try it for yourself by running the following block of code.
 
 ```{python}
 #| eval: false
 
Chapters/Prior_posterior_predictive_checks.qmd

Lines changed: 1 addition & 1 deletion
@@ -454,7 +454,7 @@ $$
 U = F_Y(Y)
 $$
 
-follows a standard Uniform distribution. A proof of this result can be found in [The Book of Statistical Proofs](https://statproofbook.github.io/P/cdf-pit.html).
+follows a standard Uniform distribution. A proof of this result can be found in [The Book of Statistical Proofs](https://statproofbook.github.io/P/cdf-pit.html) [@soch_2024].
 
 In other words, if we apply the CDF of any continuous distribution to a random variable with that distribution, the result will be a random variable with a standard uniform distribution. This is a very powerful result, as it allows us to use the standard uniform distribution as a reference distribution for many statistical tests, including posterior predictive checks.
 
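A quick way to convince yourself of this probability integral transform result, using SciPy (an illustrative check, not part of the commit):

```python
import numpy as np
from scipy import stats

# Draw from an arbitrary continuous distribution, apply its own CDF,
# and test the transformed values for uniformity
dist = stats.norm(2, 5)
samples = dist.rvs(size=10_000, random_state=42)
u = dist.cdf(samples)              # U = F_Y(Y)
print(stats.kstest(u, "uniform"))  # large p-value: consistent with U(0, 1)
```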
references.bib

Lines changed: 8 additions & 0 deletions
@@ -861,3 +861,11 @@ @inproceedings{fernandes_2018
   year = {2018},
   pages = {1--12},
 }
+
+@misc{soch_2024,
+  title = {The Book of Statistical Proofs},
+  author = {Soch, Joram and Faulkenberry, Thomas J. and Petrykowski, Kenneth and Allefeld, Carsten},
+  year = {2024},
+  doi = {10.5281/ZENODO.4305949},
+  url = {https://statproofbook.github.io/},
+}
