Normalize some tags and fix typos #772

Merged · 2 commits · Feb 18, 2025
2 changes: 1 addition & 1 deletion examples/bart/bart_categorical_hawks.ipynb
@@ -1473,7 +1473,7 @@
"source": [
"So far we have a very good result concerning the classification of the species based on the 5 covariables. However, if we want to select a subset of covariable to perform future classifications is not very clear which of them to select. Maybe something sure is that `Tail` could be eliminated. At the beginning when we plot the distribution of each covariable we said that the most important variables to make the classification could be `Wing`, `Weight` and, `Culmen`, nevertheless after running the model we saw that `Hallux`, `Culmen` and, `Wing`, proved to be the most important ones.\n",
"\n",
"Unfortunatelly, the partial dependence plots show a very wide dispersion, making results look suspicious. One way to reduce this variability is adjusting independent trees, below we will see how to do this and get a more accurate result. "
"Unfortunately, the partial dependence plots show a very wide dispersion, making results look suspicious. One way to reduce this variability is adjusting independent trees, below we will see how to do this and get a more accurate result. "
]
},
{
2 changes: 1 addition & 1 deletion examples/bart/bart_categorical_hawks.myst.md
@@ -195,7 +195,7 @@ all

So far we have a very good result concerning the classification of the species based on the 5 covariables. However, if we want to select a subset of covariable to perform future classifications is not very clear which of them to select. Maybe something sure is that `Tail` could be eliminated. At the beginning when we plot the distribution of each covariable we said that the most important variables to make the classification could be `Wing`, `Weight` and, `Culmen`, nevertheless after running the model we saw that `Hallux`, `Culmen` and, `Wing`, proved to be the most important ones.

Unfortunatelly, the partial dependence plots show a very wide dispersion, making results look suspicious. One way to reduce this variability is adjusting independent trees, below we will see how to do this and get a more accurate result.
Unfortunately, the partial dependence plots show a very wide dispersion, making results look suspicious. One way to reduce this variability is adjusting independent trees, below we will see how to do this and get a more accurate result.
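
For context beyond this diff, here is a hedged sketch of the "independent trees" idea the notebook alludes to: in pymc-bart, fitting a separate sum of trees per class is one way to tighten the partial dependence estimates for a multi-class model like this one. The toy data, variable names, and the `separate_trees=True` flag are assumptions for illustration, not taken from the PR.

```python
import numpy as np
import pymc as pm
import pymc_bart as pmb

# Hypothetical stand-ins for the Hawks covariables and species codes.
rng = np.random.default_rng(0)
x_data = rng.normal(size=(100, 5))      # 5 covariables
y_obs = rng.integers(0, 3, size=100)    # 3 species, coded 0..2

with pm.Model() as model_sketch:
    # separate_trees=True (assumed flag) fits one independent sum of trees per class.
    μ = pmb.BART("μ", X=x_data, Y=y_obs, m=50, separate_trees=True, shape=(3, len(y_obs)))
    # Softmax over the class axis turns the real-valued trees into class probabilities.
    θ = pm.Deterministic("θ", pm.math.softmax(μ, axis=0))
    pm.Categorical("y", p=θ.T, observed=y_obs)
```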

+++

4 changes: 2 additions & 2 deletions examples/bart/bart_introduction.ipynb
@@ -8,7 +8,7 @@
"(BART_introduction)=\n",
"# Bayesian Additive Regression Trees: Introduction\n",
":::{post} Dec 21, 2021\n",
":tags: BART, non-parametric, regression \n",
":tags: BART, nonparametric, regression \n",
":category: intermediate, explanation\n",
":author: Osvaldo Martin\n",
":::"
@@ -210,7 +210,7 @@
"id": "7eb4c307",
"metadata": {},
"source": [
"Before checking the result, we need to discuss one more detail, the BART variable always samples over the real line, meaning that in principle we can get values that go from $-\\infty$ to $\\infty$. Thus, we may need to transform their values as we would do for standard Generalized Linear Models, for example in the `model_coal` we computed `pm.math.exp(μ_)` because the Poisson distribution is expecting values that go from 0 to $\\infty$. This is business as usual, the novelty is that we may need to apply the inverse transformation to the values of `Y`, as we did in the previous model where we took $\\log(Y)$. The main reason to do this is that the values of `Y` are used to get a reasonable initial value for the sum of trees and also the variance of the leaf nodes. Thus, applying the inverse transformation is a simple way to improve the efficiency and accuracy of the result. Should we do this for every possible likelihood? Well, no. If we are using BART for the location parameter of distributions like Normal, StudentT, or AssymetricLaplace, we don't need to do anything as the support of these parameters is also the real line. A nontrivial exception is the Bernoulli likelihood (or Binomial with n=1), in that case, we need to apply the logistic function to the BART variable, but there is no need to apply its inverse to transform `Y`, PyMC-BART already takes care of that particular case.\n",
"Before checking the result, we need to discuss one more detail, the BART variable always samples over the real line, meaning that in principle we can get values that go from $-\\infty$ to $\\infty$. Thus, we may need to transform their values as we would do for standard Generalized Linear Models, for example in the `model_coal` we computed `pm.math.exp(μ_)` because the Poisson distribution is expecting values that go from 0 to $\\infty$. This is business as usual, the novelty is that we may need to apply the inverse transformation to the values of `Y`, as we did in the previous model where we took $\\log(Y)$. The main reason to do this is that the values of `Y` are used to get a reasonable initial value for the sum of trees and also the variance of the leaf nodes. Thus, applying the inverse transformation is a simple way to improve the efficiency and accuracy of the result. Should we do this for every possible likelihood? Well, no. If we are using BART for the location parameter of distributions like Normal, StudentT, or AsymmetricLaplace, we don't need to do anything as the support of these parameters is also the real line. A nontrivial exception is the Bernoulli likelihood (or Binomial with n=1), in that case, we need to apply the logistic function to the BART variable, but there is no need to apply its inverse to transform `Y`, PyMC-BART already takes care of that particular case.\n",
"\n",
"OK, now let's see the result of `model_coal`."
]
4 changes: 2 additions & 2 deletions examples/bart/bart_introduction.myst.md
@@ -14,7 +14,7 @@ kernelspec:
(BART_introduction)=
# Bayesian Additive Regression Trees: Introduction
:::{post} Dec 21, 2021
:tags: BART, non-parametric, regression
:tags: BART, nonparametric, regression
:category: intermediate, explanation
:author: Osvaldo Martin
:::
@@ -98,7 +98,7 @@ with pm.Model() as model_coal:
idata_coal = pm.sample(random_seed=RANDOM_SEED)
```

Before checking the result, we need to discuss one more detail, the BART variable always samples over the real line, meaning that in principle we can get values that go from $-\infty$ to $\infty$. Thus, we may need to transform their values as we would do for standard Generalized Linear Models, for example in the `model_coal` we computed `pm.math.exp(μ_)` because the Poisson distribution is expecting values that go from 0 to $\infty$. This is business as usual, the novelty is that we may need to apply the inverse transformation to the values of `Y`, as we did in the previous model where we took $\log(Y)$. The main reason to do this is that the values of `Y` are used to get a reasonable initial value for the sum of trees and also the variance of the leaf nodes. Thus, applying the inverse transformation is a simple way to improve the efficiency and accuracy of the result. Should we do this for every possible likelihood? Well, no. If we are using BART for the location parameter of distributions like Normal, StudentT, or AssymetricLaplace, we don't need to do anything as the support of these parameters is also the real line. A nontrivial exception is the Bernoulli likelihood (or Binomial with n=1), in that case, we need to apply the logistic function to the BART variable, but there is no need to apply its inverse to transform `Y`, PyMC-BART already takes care of that particular case.
Before checking the result, we need to discuss one more detail, the BART variable always samples over the real line, meaning that in principle we can get values that go from $-\infty$ to $\infty$. Thus, we may need to transform their values as we would do for standard Generalized Linear Models, for example in the `model_coal` we computed `pm.math.exp(μ_)` because the Poisson distribution is expecting values that go from 0 to $\infty$. This is business as usual, the novelty is that we may need to apply the inverse transformation to the values of `Y`, as we did in the previous model where we took $\log(Y)$. The main reason to do this is that the values of `Y` are used to get a reasonable initial value for the sum of trees and also the variance of the leaf nodes. Thus, applying the inverse transformation is a simple way to improve the efficiency and accuracy of the result. Should we do this for every possible likelihood? Well, no. If we are using BART for the location parameter of distributions like Normal, StudentT, or AsymmetricLaplace, we don't need to do anything as the support of these parameters is also the real line. A nontrivial exception is the Bernoulli likelihood (or Binomial with n=1), in that case, we need to apply the logistic function to the BART variable, but there is no need to apply its inverse to transform `Y`, PyMC-BART already takes care of that particular case.
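
As a hedged illustration of the inverse-link pattern described in this hunk (toy data and names are assumptions, not taken from the PR), the Poisson case looks roughly like this:

```python
import numpy as np
import pymc as pm
import pymc_bart as pmb

rng = np.random.default_rng(0)
X = np.sort(rng.uniform(0, 10, size=(200, 1)), axis=0)   # hypothetical covariate
Y = rng.poisson(np.exp(1.0 + 0.2 * X[:, 0]))             # hypothetical counts

with pm.Model() as poisson_bart_sketch:
    # BART samples on the real line, so hand it log-scale targets
    # (log1p to sidestep zero counts) for sensible initial values...
    μ_ = pmb.BART("μ_", X=X, Y=np.log1p(Y))
    # ...and map back to (0, ∞) with the inverse transformation, as in model_coal.
    μ = pm.Deterministic("μ", pm.math.exp(μ_))
    pm.Poisson("y", mu=μ, observed=Y)
```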

OK, now let's see the result of `model_coal`.

4 changes: 2 additions & 2 deletions examples/bart/bart_quantile_regression.ipynb
@@ -8,7 +8,7 @@
"(BART_quantile)=\n",
"# Quantile Regression with BART\n",
":::{post} Jan 25, 2023\n",
":tags: BART, non-parametric, quantile, regression \n",
":tags: BART, nonparametric, quantile, regression \n",
":category: intermediate, explanation\n",
":author: Osvaldo Martin\n",
":::"
@@ -468,7 +468,7 @@
"id": "8e963637",
"metadata": {},
"source": [
"We can see that when we use a Normal likelihood, and from that fit we compute the quantiles, the quantiles q=0.1 and q=0.9 are symetrical with respect to q=0.5, also the shape of the curves is essentially the same just shifted up or down. Additionally the Asymmetric Laplace family allows the model to account for the increased variability in BMI as the age increases, while for the Gaussian family that variability always stays the same."
"We can see that when we use a Normal likelihood, and from that fit we compute the quantiles, the quantiles q=0.1 and q=0.9 are symmetrical with respect to q=0.5, also the shape of the curves is essentially the same just shifted up or down. Additionally the Asymmetric Laplace family allows the model to account for the increased variability in BMI as the age increases, while for the Gaussian family that variability always stays the same."
]
},
{
4 changes: 2 additions & 2 deletions examples/bart/bart_quantile_regression.myst.md
@@ -14,7 +14,7 @@ kernelspec:
(BART_quantile)=
# Quantile Regression with BART
:::{post} Jan 25, 2023
:tags: BART, non-parametric, quantile, regression
:tags: BART, nonparametric, quantile, regression
:category: intermediate, explanation
:author: Osvaldo Martin
:::
@@ -141,7 +141,7 @@ plt.xlabel("Age")
plt.ylabel("BMI");
```

We can see that when we use a Normal likelihood, and from that fit we compute the quantiles, the quantiles q=0.1 and q=0.9 are symetrical with respect to q=0.5, also the shape of the curves is essentially the same just shifted up or down. Additionally the Asymmetric Laplace family allows the model to account for the increased variability in BMI as the age increases, while for the Gaussian family that variability always stays the same.
We can see that when we use a Normal likelihood, and from that fit we compute the quantiles, the quantiles q=0.1 and q=0.9 are symmetrical with respect to q=0.5, also the shape of the curves is essentially the same just shifted up or down. Additionally the Asymmetric Laplace family allows the model to account for the increased variability in BMI as the age increases, while for the Gaussian family that variability always stays the same.
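
A hedged sketch of what the Asymmetric Laplace formulation amounts to (synthetic data; passing the target quantile as `q=` is an assumption about the `pm.AsymmetricLaplace` parametrization):

```python
import numpy as np
import pymc as pm
import pymc_bart as pmb

rng = np.random.default_rng(0)
X = rng.uniform(2, 20, size=(300, 1))                          # hypothetical "Age"
Y = 16 + 0.4 * X[:, 0] + rng.normal(0, 0.5 + 0.1 * X[:, 0])    # spread grows with age

with pm.Model() as quantile_sketch:
    μ = pmb.BART("μ", X=X, Y=Y)
    σ = pm.HalfNormal("σ", 5)
    # The Asymmetric Laplace likelihood targets one quantile directly,
    # so the fitted μ tracks that quantile rather than the conditional mean.
    pm.AsymmetricLaplace("obs", mu=μ, b=σ, q=0.9, observed=Y)
```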

+++

4 changes: 2 additions & 2 deletions examples/case_studies/CFA_SEM.ipynb
@@ -4019,7 +4019,7 @@
"source": [
"### Intermediate Cross-Loading Model\n",
"\n",
"The idea of a measurment model is maybe a little opaque when we only see models that fit well. Instead we want to briefly show how an in-apt measurement model gets reflected in the estimated parameters for the factor loadings. Here we specify a measurement model which attempts to couple the `se_social` and `sup_parents` indicators and bundle them into the same factor. "
"The idea of a measurement model is maybe a little opaque when we only see models that fit well. Instead we want to briefly show how an in-apt measurement model gets reflected in the estimated parameters for the factor loadings. Here we specify a measurement model which attempts to couple the `se_social` and `sup_parents` indicators and bundle them into the same factor. "
]
},
{
@@ -7890,7 +7890,7 @@
"source": [
"# Conclusion\n",
"\n",
"We've just seen how we can go from thinking about the measurment of abstract psychometric constructs, through the evaluation of complex patterns of correlation and covariances among these latent constructs to evaluating hypothetical causal structures amongst the latent factors. This is a bit of whirlwind tour of psychometric models and the expressive power of SEM and CFA models, which we're ending by linking them to the realm of causal inference! This is not an accident, but rather evidence that causal concerns sit at the heart of most modeling endeavours. When we're interested in any kind of complex joint-distribution of variables, we're likely interested in the causal structure of the system - how are the realised values of some observed metrics dependent on or related to others? Importantly, we need to understand how these observations are realised without confusing simple correlation for cause through naive or confounded inference.\n",
"We've just seen how we can go from thinking about the measurement of abstract psychometric constructs, through the evaluation of complex patterns of correlation and covariances among these latent constructs to evaluating hypothetical causal structures amongst the latent factors. This is a bit of whirlwind tour of psychometric models and the expressive power of SEM and CFA models, which we're ending by linking them to the realm of causal inference! This is not an accident, but rather evidence that causal concerns sit at the heart of most modeling endeavours. When we're interested in any kind of complex joint-distribution of variables, we're likely interested in the causal structure of the system - how are the realised values of some observed metrics dependent on or related to others? Importantly, we need to understand how these observations are realised without confusing simple correlation for cause through naive or confounded inference.\n",
"\n",
"Mislevy and Levy highlight this connection by focusing on the role of De Finetti's theorem in the recovery of exchangeablility through Bayesian inference. By De Finetti’s theorem a distribution of exchangeable sequence of variables be expressed as mixture of conditional independent variables.\n",
"\n",
4 changes: 2 additions & 2 deletions examples/case_studies/CFA_SEM.myst.md
@@ -282,7 +282,7 @@ Which shows a relatively sound recovery of the observed data.

### Intermediate Cross-Loading Model

The idea of a measurment model is maybe a little opaque when we only see models that fit well. Instead we want to briefly show how an in-apt measurement model gets reflected in the estimated parameters for the factor loadings. Here we specify a measurement model which attempts to couple the `se_social` and `sup_parents` indicators and bundle them into the same factor.
The idea of a measurement model is maybe a little opaque when we only see models that fit well. Instead we want to briefly show how an in-apt measurement model gets reflected in the estimated parameters for the factor loadings. Here we specify a measurement model which attempts to couple the `se_social` and `sup_parents` indicators and bundle them into the same factor.

```{code-cell} ipython3
coords = {
@@ -1035,7 +1035,7 @@ compare_df

# Conclusion

We've just seen how we can go from thinking about the measurment of abstract psychometric constructs, through the evaluation of complex patterns of correlation and covariances among these latent constructs to evaluating hypothetical causal structures amongst the latent factors. This is a bit of whirlwind tour of psychometric models and the expressive power of SEM and CFA models, which we're ending by linking them to the realm of causal inference! This is not an accident, but rather evidence that causal concerns sit at the heart of most modeling endeavours. When we're interested in any kind of complex joint-distribution of variables, we're likely interested in the causal structure of the system - how are the realised values of some observed metrics dependent on or related to others? Importantly, we need to understand how these observations are realised without confusing simple correlation for cause through naive or confounded inference.
We've just seen how we can go from thinking about the measurement of abstract psychometric constructs, through the evaluation of complex patterns of correlation and covariances among these latent constructs to evaluating hypothetical causal structures amongst the latent factors. This is a bit of whirlwind tour of psychometric models and the expressive power of SEM and CFA models, which we're ending by linking them to the realm of causal inference! This is not an accident, but rather evidence that causal concerns sit at the heart of most modeling endeavours. When we're interested in any kind of complex joint-distribution of variables, we're likely interested in the causal structure of the system - how are the realised values of some observed metrics dependent on or related to others? Importantly, we need to understand how these observations are realised without confusing simple correlation for cause through naive or confounded inference.

Mislevy and Levy highlight this connection by focusing on the role of De Finetti's theorem in the recovery of exchangeablility through Bayesian inference. By De Finetti’s theorem a distribution of exchangeable sequence of variables be expressed as mixture of conditional independent variables.
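
In symbols, the mixture representation referred to here is usually written as (stated as a hedged aside, not part of the diff):

$$
p(y_1, \dots, y_n) = \int \prod_{i=1}^{n} p(y_i \mid \theta)\, p(\theta)\, d\theta
$$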

2 changes: 1 addition & 1 deletion examples/case_studies/GEV.ipynb
@@ -127,7 +127,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"And now set up the model using priors estimated from a quick review of the historgram above:\n",
"And now set up the model using priors estimated from a quick review of the histogram above:\n",
"\n",
"- $\\mu$: there is no real basis for considering anything other than a `Normal` distribution with a standard deviation limiting negative outcomes;\n",
"- $\\sigma$: this must be positive, and has a small value, so use `HalfNormal` with a unit standard deviation;\n",
2 changes: 1 addition & 1 deletion examples/case_studies/GEV.myst.md
@@ -85,7 +85,7 @@ Consider then, the 10-year return period, for which $p = 1/10$:
p = 1 / 10
```

And now set up the model using priors estimated from a quick review of the historgram above:
And now set up the model using priors estimated from a quick review of the histogram above:

- $\mu$: there is no real basis for considering anything other than a `Normal` distribution with a standard deviation limiting negative outcomes;
- $\sigma$: this must be positive, and has a small value, so use `HalfNormal` with a unit standard deviation;
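
A minimal sketch of the priors listed so far (the hyperparameter values and the model name are illustrative assumptions; the shape-parameter prior and the GEV likelihood are elided here just as in the hunk above):

```python
import pymc as pm

with pm.Model() as gev_priors_sketch:
    # Location: Normal, with assumed values chosen so negative draws are unlikely.
    μ = pm.Normal("μ", mu=3.8, sigma=0.2)
    # Scale: positive and small, unit standard deviation as stated above.
    σ = pm.HalfNormal("σ", sigma=1.0)
    ...  # shape prior and GEV likelihood continue beyond this hunk
```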